Automate Search and Replace in LibreOffice Writer with altsearch.oxt

I am hoping that someone can help me with a way of semi-automating a task I run at least fifty times a day, usually many more than that.

I use LibreOffice Writer with the add-on altsearch.oxt, which is an excellent tool for searching and replacing within text.

I receive documents from many different sources and they are each formatted differently … and usually wrong for our purposes. Some are emailed and have a line length of 72 with a paragraph break at the end of each line, some have indents at the beginnings of paragraphs, some indents will be tabs, some will be a number of space bar presses, some will be random and unequal space bar presses. Some have random double spaces throughout the document. Some use line breaks instead of paragraph breaks … they are all different and usually wrong for what we need, or even what we ask for.

We have asked for and insisted upon standard (for us) formatting, but it is not likely to happen. Realistically, I have to correct each submission.

So, in LibreOffice Writer, I save each submission as an ASCII text document. I then run a series of search-and-replace operations to clean each document so that it meets a standard we can work from.

As an example:

Step 1. I search for double paragraph breaks and replace them with a nonsense series of characters unlikely to be found in the document.
Replace “\p\p” with “XXXXXX”

Step 2. I then search for single paragraph breaks at the end of each line and replace them with a space.
Replace “\p” with " "

Step 3. I then search for any double spaces and replace them with a single space. I often have to run this search and replace several times to remove all of the double spacing. Occasionally a submission will have 5 or 10 spaces marking the paragraph indent, and I continue until the result is 0 replaced.

Step 4. I then Replace “XXXXXX " with “XXXXXX” to remove any leading spaces from a paragraph. Likewise I will Replace " XXXXXX” with “XXXXXX” to eliminate any trailing spaces at the end of a paragraph.

Step 5. I will then Replace “XXXXXXXXXXXX” with “XXXXXX” to eliminate any extraneous paragraph breaks. This, too, may need to be repeated several times until the result is 0 replaced.

Step 6. Finally, I will restore the document. Replace “XXXXXX” with “\p\p”

I now have a plain text document which is exactly formatted the way we need it to begin to process it. These steps each need to be performed in the same order and several of them (Step 3 and Step 5 in this example) may have to run several times.

Is there a way I can run this as an automated process automatically repeating Steps 3 and 5 if needed?

Replace “\p\p” with “XXXXXX”
Replace “\p” with " "
Replace "  " with " " [may need to be repeated many times]
Replace “XXXXXX " with “XXXXXX”
Replace " XXXXXX” with “XXXXXX”
Replace “XXXXXXXXXXXX” with “XXXXXX” [may need to be repeated several times]
Replace “XXXXXX” with “\p\p”
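
For comparison, the same sequence can be sketched outside Writer. This is only a minimal Python sketch, assuming the submission has already been saved as plain text and that each paragraph break is a single newline character. The names `MARK`, `replace_until_stable`, and `clean` are made up for illustration and are not part of altsearch or LibreOffice.

```python
MARK = "XXXXXX"  # placeholder from the steps above; assumed not to occur in the text


def replace_until_stable(text, old, new):
    """Repeat a replacement until a pass finds nothing (the "0 replaced" case)."""
    while old in text:
        text = text.replace(old, new)
    return text


def clean(text):
    """Apply Steps 1 to 6 in order; "\n" stands in for a paragraph break (assumption)."""
    text = text.replace("\n\n", MARK)                     # Step 1: protect real paragraph breaks
    text = text.replace("\n", " ")                        # Step 2: join hard-wrapped lines
    text = replace_until_stable(text, "  ", " ")          # Step 3: repeated until nothing is left
    text = text.replace(MARK + " ", MARK)                 # Step 4: leading spaces
    text = text.replace(" " + MARK, MARK)                 # Step 4: trailing spaces
    text = replace_until_stable(text, MARK + MARK, MARK)  # Step 5: repeated until nothing is left
    return text.replace(MARK, "\n\n")                     # Step 6: restore paragraph breaks


if __name__ == "__main__":
    raw = "First line of a\nwrapped paragraph.\n\n     Second   paragraph\nhere.\n\n\n\nThird."
    print(clean(raw))
```

The two repeating steps are the only ones wrapped in the loop, so the order of the steps stays exactly as listed and each loop stops on its own once a pass would replace nothing.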

I think the extension has a batch-processing option.

Click on [?] to open the help on how to use batch mode.

It does have a batch process, but there isn’t an obvious way for it to repeat an action if needed and to continue to the next action if it is not needed. Thank you, I will take a closer look at this.
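
One possible way around the repeat problem is a pattern that matches a whole run at once, so a single pass is enough. The sketch below uses Python's `re` module purely as an illustration; whether the altsearch batch accepts equivalent patterns is not something confirmed here. It assumes plain text with a newline per paragraph break, and `collapse_runs` is a hypothetical name.

```python
import re


def collapse_runs(text):
    """Collapse runs in one pass each, instead of repeating a fixed replace."""
    text = re.sub(r" {2,}", " ", text)      # any run of 2+ spaces becomes one space
    text = re.sub(r"\n{3,}", "\n\n", text)  # any run of 3+ breaks becomes one blank line
    return text
```

Because the quantifier `{2,}` consumes the entire run in one match, there is no need to check for "0 replaced" and go around again.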

I would try to write a macro. To do so, I would take a close look at the free guides, which you can download from http://www.libreoffice.org/get-help/documentation/, especially the material on Writer macros.

I never even thought of a macro … That is probably exactly the way to do this. It’ll probably take some trial and error, but I will look at the documentation and see if I can make it work. I think this is the best way to do it. Thank you.