How do I remove many individual side text frames/boxes on many pages on the left edge?

asked 2019-05-11 12:59:13 +0200

EigentlichWizard

updated 2019-05-11 13:00:58 +0200

Hi!

I am on LO Writer 6.1.5.2.

I scanned a lot of pages (~500) with OCR into a DOCX document. The perforation (big holes on the left side of each page) was interpreted as text and put into text frames/boxes (~12 per page x 500).

Here are some example pages: https://imgur.com/Vnhkje0

Is it possible to remove these text frames without doing it manually? Maybe by cutting off a certain area/edge of all pages?

Thanks in advance for any help.


Comments

Without a macro I can't see any way to delete frames in bulk. I could not find a procedure in the Navigator that would do it. So I propose another strategy. See my answer.

Grantler ( 2019-05-11 21:51:02 +0200 )

If the artefacts actually are frames or a specific type of shape that does not otherwise occur in the document, it's simple to remove them with user code.
When scanning sheets that have systematic defects of this kind, it's advisable to cover the defects with a (folded) strip of white paper.

Lupp ( 2019-05-11 23:29:46 +0200 )

If artefacts of the given kind actually are text frames or shapes of a specific kind that do not otherwise occur in the document, it's easy to remove them with user code, because they are listed either in the .TextFrames property or as elements of the .DrawPage.

Lupp ( 2019-05-11 23:36:59 +0200 )

Thank you for this hint. Do you see a possibility to narrow the selection down by their position?

EigentlichWizard ( 2019-05-12 00:25:57 +0200 )

2 Answers


answered 2019-05-12 00:41:04 +0200

Lupp

updated 2019-05-12 00:45:46 +0200

(In reply to the original questioner's comment answering my second comment on the question:)
TextFrames and Shapes expose a lot of information on which you can base the decision whether or not to dispose of them. There is an .Anchor property, for example, which is a TextRange object. Like all objects in LibO text documents, they don't know which page they are placed on. You may play with this simple example.
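
The example attached to this answer is not reproduced here. As a separate, minimal sketch of the idea (filtering text frames by position and disposing of them), something along these lines might work as a Python macro. It assumes that the unwanted frames expose a `HoriOrientPosition` property (in 1/100 mm) that is small for frames at the left edge, which depends on how the DOCX import anchored and oriented them. The function names and the 1500 threshold are hypothetical. Try it on a copy of the document first.

```python
def find_left_edge_frames(doc, max_x_hmm=1500):
    """Collect text frames whose horizontal offset is at most max_x_hmm.

    Assumption: HoriOrientPosition (1/100 mm) reflects the distance from
    the left reference edge; this only holds for frames that are
    positioned by an explicit offset.
    """
    frames = doc.getTextFrames()
    hits = []
    for i in range(frames.getCount()):
        frame = frames.getByIndex(i)
        if frame.getPropertyValue("HoriOrientPosition") <= max_x_hmm:
            hits.append(frame)
    return hits


def remove_left_edge_frames(doc, max_x_hmm=1500):
    """Dispose of the matching frames and return how many were removed."""
    # Collect first, then dispose, so indices do not shift while iterating.
    hits = find_left_edge_frames(doc, max_x_hmm)
    for frame in hits:
        frame.dispose()
    return len(hits)


def remove_left_edge_frames_macro(*args):
    """Entry point when run from LibreOffice's Python macro dialog."""
    # XSCRIPTCONTEXT is injected by LibreOffice for Python macros.
    doc = XSCRIPTCONTEXT.getDocument()
    remove_left_edge_frames(doc)
```

Before deleting anything, it may be worth calling `find_left_edge_frames` alone and checking how many frames it reports, since roughly 12 per page would be expected here.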


Comments

Do you happen to know where in the file this information is stored? Would you mind taking a look at the original DOCX file? https://send.firefox.com/download/ba5... Password: Heaven123!

Please tell me what you can identify.

EigentlichWizard ( 2019-05-12 01:50:02 +0200 )

I checked your DOCX file. You have to save it as an ODT file, then rename the extension to .zip so that you can open it as a ZIP archive. Work on content.xml. You will find several "tags" beginning with <draw:text-box>. Delete them. Godspeed.

"cover these defects with a (folded) strip of white paper." (Lupp)

+1

Your undertaking is interesting, but it has to be planned in detail from the beginning. IMHO your DOCX files are the wrong route to working effectively.

By the way: Why don't you ask the question in the German branch of this website: https://ask.libreoffice.org/de - Lupp and I seem to be bloody Germans just as you are ;-) .

Grantler ( 2019-05-12 10:28:50 +0200 )
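
The content.xml route from the comment above could be scripted roughly as follows. This is only a sketch under assumptions: it treats each unwanted box as a `draw:frame` element wrapping a `draw:text-box`, and it assumes the frames at the left edge carry a small `svg:x` value, which depends on how they are anchored. The function names and the 1.5 cm threshold are hypothetical. Keep the original file and check that the result still opens correctly.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
SVG = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
CM_PER_UNIT = {"cm": 1.0, "mm": 0.1, "in": 2.54, "pt": 2.54 / 72}


def to_cm(length):
    """Convert an ODF length such as '0.4cm' to centimetres, or None."""
    m = re.fullmatch(r"\s*(-?\d*\.?\d+)\s*(cm|mm|in|pt)\s*", length)
    return float(m.group(1)) * CM_PER_UNIT[m.group(2)] if m else None


def _is_left_text_frame(elem, max_x_cm):
    if elem.tag != f"{{{DRAW}}}frame":
        return False
    if elem.find(f"{{{DRAW}}}text-box") is None:
        return False  # e.g. an image frame: leave it alone
    x = to_cm(elem.get(f"{{{SVG}}}x", ""))
    return x is not None and x <= max_x_cm


def _prune(parent, max_x_cm):
    removed, previous = 0, None
    for child in list(parent):
        if _is_left_text_frame(child, max_x_cm):
            # Text following the frame is stored in child.tail; keep it.
            if child.tail:
                if previous is None:
                    parent.text = (parent.text or "") + child.tail
                else:
                    previous.tail = (previous.tail or "") + child.tail
            parent.remove(child)
            removed += 1
        else:
            removed += _prune(child, max_x_cm)
            previous = child
    return removed


def strip_left_frames(content_xml, max_x_cm=1.5):
    """Return (new content.xml bytes, number of frames removed)."""
    # Re-register the document's own prefixes so the output keeps them.
    for _ev, (prefix, uri) in ET.iterparse(io.BytesIO(content_xml),
                                           events=["start-ns"]):
        ET.register_namespace(prefix, uri)
    root = ET.fromstring(content_xml)
    removed = _prune(root, max_x_cm)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True), removed


def clean_odt(src, dst, max_x_cm=1.5):
    """Copy an ODT archive to dst, filtering content.xml on the way."""
    removed = 0
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():  # keeps 'mimetype' first and stored
            data = zin.read(info.filename)
            if info.filename == "content.xml":
                data, removed = strip_left_frames(data, max_x_cm)
            zout.writestr(info, data, compress_type=info.compress_type)
    return removed
```

A call such as `clean_odt("scan.odt", "scan_cleaned.odt")` (both file names made up) would write a filtered copy and return the number of frames it dropped. Only text frames are touched, so image frames and ordinary paragraphs stay in place.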

@EigentlichWizard: I just tried to get the file you had uploaded and linked to in your comment above, but the link had already expired.

@Grantler, concerning the "German branch": Yes, it's annoying to have to do all these foreign-language handsprings even when all the (contributing) participants of a thread are Germans. On the other hand, the German branch has already had many near-death experiences, and may best be reserved for those few Germans, Autrichiens, and de-CH friends who are actually not capable of discussing topics in English at all. The LibO project meanwhile counts more than 200 locales. Valuable communities for every single one?
And: Labour invested in an English answer has a much larger pool of potential silent beneficiaries.
And Westerners should be interested in having (and further developing) a common lingua franca using Latin letters. The next global alternative will be neither German nor русский nor català nor suomalainen. Guess ...

Lupp ( 2019-05-12 11:11:38 +0200 )

OT - @Lupp - your reply concerning Lingua Franca makes sense, thanks a lot.

Grantler ( 2019-05-12 11:29:17 +0200 )

answered 2019-05-11 21:42:20 +0200

Grantler

updated 2019-05-11 21:46:38 +0200

Scan your pages into TIFF/PNG images. Then you can crop the left margin in batch mode, in one step for many images. XnView can probably do this.

For better OCR, have the images converted online at www.pdf24.org, or use (non-free) OCR applications like ABBYY FineReader or IRIS. They do not put your text into frames but generate plain text, possibly preserving some text formatting. Good OCR preserves paragraphs and does not insert carriage returns / line feeds at the end of each text line. Some manual rework of the recognized text is unavoidable for a satisfactory result.

Right-click the screenshot and choose "show" for a better view.

[screenshot]


Comments

Wow! This could work! Thank you so much for your answer! I will try it and come back with a result. You are a godsend! It is a shame I don't have enough credit to upvote your answer. I will do it as soon as I am able to. One question: Where do I find the "Stapelverarbeitung" (batch processing), and how can I select more than one graphic in Writer?

EigentlichWizard ( 2019-05-11 22:07:52 +0200 )