# How do I remove many individual side text frames/boxes on many pages on the left edge?

Hi!

I am on LO Writer 6.1.5.2.

I scanned a lot of pages (~500) with OCR into a DOCX document. The perforation (big holes on the left side) was interpreted as text and put into text frames/boxes (~12 per page x 500 pages).

Here are some example pages: https://imgur.com/Vnhkje0

Is it possible to remove these text frames without doing it manually? Maybe by cutting off a certain area/edge of all pages?

Thanks in advance for any help.


Without a macro I can't see any way to mass-delete frames. I could not find a procedure in the Navigator that would do it. So I propose another strategy; see my answer.

(2019-05-11 21:51:02 +0200)

If the artefacts actually are frames or a specific type of shape which does not otherwise occur in the document, it's simple to remove them by user code.
When scanning sheets that have systematic defects of this kind, it's advisable to cover the defects with a (folded) strip of white paper.

(2019-05-11 23:29:46 +0200)

If artefacts of the given kind actually are text frames or shapes of a specific kind that does not otherwise occur in the document, it's easy to remove them by user code, because they are listed either in the .TextFrames property or as elements of the .DrawPage.

(2019-05-11 23:36:59 +0200)

Thank you for this hint. Do you see a way to narrow the selection down by their position?

(2019-05-12 00:25:57 +0200)


(In reply to the comment by the original questioner (OQ) on my second comment on the question:)
TextFrames and Shapes expose many properties on which you can base the decision whether or not to dispose of them. For example, there is an .Anchor, which is a TextRange object. Like any object in LibO text documents, they don't know which page they are placed on. You may play with this simple example.
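
The example attached to the original post is not reproduced here. As a separate, hedged sketch of what such user code could look like, here is a Python macro that filters text frames by position and size. It assumes the artefacts really are Writer text frames (listed in .TextFrames), that their HoriOrientPosition is meaningful (it is only used when HoriOrient is NONE, and it is relative to whatever HoriOrientRelation says), and that narrow frames near the left edge do not occur elsewhere in the document. The function names and both thresholds are made up for illustration.

```python
def remove_left_edge_frames(doc, max_x=1500, max_width=2000):
    """Dispose of text frames that are narrow and sit near the left edge.

    Lengths are in 1/100 mm, the unit of Writer frame properties.
    The default thresholds are illustrative guesses, not measured values.
    """
    frames = doc.TextFrames
    removed = 0
    # Walk backwards so that removing a frame does not shift unvisited indexes.
    for i in range(frames.getCount() - 1, -1, -1):
        frame = frames.getByIndex(i)
        if frame.HoriOrientPosition <= max_x and frame.Width <= max_width:
            frame.dispose()
            removed += 1
    return removed


def remove_perforation_frames(*args):
    """Macro entry point; XSCRIPTCONTEXT is supplied by LibreOffice at run time."""
    doc = XSCRIPTCONTEXT.getDocument()
    remove_left_edge_frames(doc)
```

Run it on a copy of the document first. If the artefacts turn out to be shapes rather than text frames, the same loop would have to walk the elements of .DrawPage instead.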


Do you happen to know where in the file this information is stored? Would you take a look at the original DOCX file? https://send.firefox.com/download/ba5... Password: Heaven123!

Please tell me what you can identify.

(2019-05-12 01:50:02 +0200)

I checked your DOCX file. You have to save it as an ODT file, then rename the extension to ZIP so that you can open it as a ZIP archive. Work on content.xml. You will find several tags beginning with <draw:text-box>. Delete them, each together with its content up to the matching closing tag. Godspeed.
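
The manual edit of content.xml could also be scripted. Below is a minimal sketch in Python (standard library only), not a tested tool. It assumes the artefacts are stored as <draw:frame> elements that directly contain a <draw:text-box>, and that their svg:x attribute is a usable horizontal position; the function names and the 1.5 cm threshold in the usage line are made up for illustration. It removes the whole enclosing frame, keeps any text that follows the frame inside the same paragraph, and writes a new file instead of changing the original.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
SVG = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"


def length_to_cm(value):
    """Convert an ODF length such as '0.4cm', '12mm' or '0.2in' to centimetres."""
    for unit, factor in {"cm": 1.0, "mm": 0.1, "in": 2.54, "pt": 2.54 / 72}.items():
        if value.endswith(unit):
            return float(value[: -len(unit)]) * factor
    raise ValueError("unsupported length: " + value)


def remove_keep_tail(parent, child):
    """Remove child but keep the text that follows it inside parent."""
    index = list(parent).index(child)
    tail = child.tail or ""
    if index == 0:
        parent.text = (parent.text or "") + tail
    else:
        parent[index - 1].tail = (parent[index - 1].tail or "") + tail
    parent.remove(child)


def strip_text_frames(content_xml, max_x_cm=None):
    """Drop frames holding a text box; with max_x_cm, only those at or left of it."""
    # Re-register the file's own prefixes so the output keeps names like "draw:".
    for _, (prefix, uri) in ET.iterparse(io.BytesIO(content_xml), events=["start-ns"]):
        ET.register_namespace(prefix, uri)
    root = ET.fromstring(content_xml)
    parents = {child: parent for parent in root.iter() for child in parent}
    removed = 0
    for frame in list(root.iter("{%s}frame" % DRAW)):
        if frame.find("{%s}text-box" % DRAW) is None:
            continue  # image or other frame content: keep it
        x = frame.get("{%s}x" % SVG)
        if max_x_cm is not None and (x is None or length_to_cm(x) > max_x_cm):
            continue  # no position, or too far to the right: keep it
        remove_keep_tail(parents[frame], frame)
        removed += 1
    return ET.tostring(root, encoding="utf-8", xml_declaration=True), removed


def clean_odt(src, dst, max_x_cm=None):
    """Copy the ODT archive src to dst with a cleaned content.xml."""
    removed = 0
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == "content.xml":
                data, removed = strip_text_frames(data, max_x_cm)
            # Reusing the ZipInfo keeps member order and compression ("mimetype" stays first).
            zout.writestr(info, data)
    return removed
```

A call could look like `clean_odt("scan.odt", "scan_cleaned.odt", max_x_cm=1.5)`, with both file names being placeholders. Note that ElementTree only writes namespace declarations that are used by element or attribute names, so check that the cleaned file still opens correctly before discarding the original.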

"cover these defects with a (folded) strip of white paper." (Lupp)

+1

Your undertaking is interesting but has to be planned in detail from the beginning. IMHO DOCX files are the wrong basis for working effectively.

By the way: why don't you ask the question in the German branch of this website: https://ask.libreoffice.org/de - Lupp and I seem to be bloody Germans, just like you ;-)

(2019-05-12 10:28:50 +0200)

@Grantler concerning "German branch": Yes, it's annoying to have to do all these foreign-language handsprings even when all the (contributing) participants of a thread are Germans. On the other hand, the German branch has already had many near-death experiences, and may be reserved for those few Germans, Autrichiens, and de-CH friends who are actually not capable of discussing topics in English at all. The LibO project meanwhile counts more than 200 locales. Valuable communities for every single one?
And: labour invested in an English answer has a much larger pool of potential silent beneficiaries.
And Westerners should be interested in having (and further developing) a common lingua franca using Latin letters. The next global alternative will be neither German nor русский nor català nor suomalainen. Guess ...

(2019-05-12 11:11:38 +0200)

OT - @Lupp - your reply concerning Lingua Franca makes sense, thanks a lot.

(2019-05-12 11:29:17 +0200)

Scan your text into TIF/PNG images; then you can crop the left margin in batch mode, in one step for many images. XnView can probably do this.
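
If no image tool is at hand, the same batch crop could be scripted. The following is a minimal sketch in Python, assuming the third-party Pillow library is installed (it is a stand-in for XnView, not something the scans were tested with). The function names, the 12 mm strip width and the 300 dpi resolution are made-up values for illustration.

```python
from pathlib import Path


def strip_width_px(strip_mm, dpi):
    """Convert the width of the strip to cut off from millimetres to pixels."""
    return round(strip_mm / 25.4 * dpi)


def crop_box(width, height, left_px):
    """Return the (left, upper, right, lower) box that drops left_px pixels on the left."""
    if not 0 <= left_px < width:
        raise ValueError("strip must be narrower than the image")
    return (left_px, 0, width, height)


def crop_folder(src_dir, dst_dir, strip_mm=12, dpi=300):
    """Crop every PNG/TIF in src_dir and save the result under dst_dir."""
    from PIL import Image  # third-party: pip install Pillow

    left_px = strip_width_px(strip_mm, dpi)
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).iterdir()):
        if path.suffix.lower() not in {".png", ".tif", ".tiff"}:
            continue  # skip anything that is not a scan image
        with Image.open(path) as img:
            box = crop_box(img.width, img.height, left_px)
            img.crop(box).save(Path(dst_dir) / path.name)
```

A call such as `crop_folder("scans", "scans_cropped")` (placeholder folder names) writes the cropped copies to a second folder, so the original scans stay untouched.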

For better OCR, have the images processed online at www.pdf24.org or by (non-free) OCR applications like ABBYY FineReader or Iris. They do not put your text into frames but generate plain text, possibly preserving some text formatting. Good OCR preserves paragraphs and does not insert carriage returns / line feeds at the end of each text line. In any case, you will inevitably have to work on the extracted text to get a satisfactory result.

Right-click on the screenshot > show for a better view.
