
Edit Searchable PDF in Draw [closed]

asked 2014-11-22 10:22:50 +0200


Hi,

I was looking for a tool to edit the poorly recognized PDF I scanned in. I scanned it with an OCR tool which didn't allow me to do a proper review of the recognized text. So I am looking for a tool where I can change the recognized text in that document and save it again as a searchable PDF, with the image as background and the overlaying text "invisible".

Draw can import those PDF files and allows me to edit the corresponding text fields. Now I have two problems:

  1. The font is not available.

    This is not a problem in itself, because the fallback font seems to fit way better. The original font is "Time New Roman", which is not available but is still shown in the font selector. Hovering over the font selector shows the tooltip "Font Name. The current font is not available and will be substituted.", but I can't figure out which font is used as the fallback. I'd like to use it for all the text in this document.

    So how can I see which font is actually in use?

  2. Export as searchable PDF again

    Saving this document as a searchable PDF again results in something different from what I want. The recognized text in the text fields is shown on top of the image text. I'd like to have it like a real searchable PDF.

Any suggestions?


Closed for the following reason: the question is answered, right answer was accepted by Alex Kemp
close date 2016-03-06 17:18:02.205243

1 Answer


answered 2014-11-22 12:50:24 +0200

ROSt52

I think you can do this work with LibO and export to PDF at the end. A file exported from LibO to PDF can be searched for strings.

I cannot judge what your OCR tool can do, but I expect it can create some kind of text file from which you can copy and paste text into LibO.

As you want to have images in the background, Draw appears to be a good choice due to the different layers you can use: one layer for images and another layer for text. Thus, you only need to copy and paste the OCR text into text boxes in Draw and correct it, and your texts are done. Text and images can be moved around the page independently because they are in different layers.

I would save in ODG format (the native format of Draw) and export the file to PDF afterwards. That way you can always modify your file easily in Draw and create a new PDF version with the click of a button.

As for font recognition, there are websites that can help you identify fonts. I once used: http://www.identifont.com/similar?2D8
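
On the second problem in the question (the recognized text showing on top of the image): in the PDF format itself, a hidden OCR layer is usually written with text rendering mode 3, which means the text is neither filled nor stroked but can still be searched and selected. Whether Draw's PDF export can produce that mode is not confirmed anywhere in this thread. The following is only a minimal Python sketch of what such a content-stream fragment looks like. The function names and the font resource name F1 are made up for illustration, and it assumes plain ASCII text with a font already declared in the page resources.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_ops(text: str, x: float, y: float,
                       font_resource: str = "F1", size: float = 12.0) -> str:
    """Build the content-stream operators for one run of invisible text.

    font_resource is a hypothetical resource name; a real page must
    declare it in its /Resources dictionary.
    """
    return (
        "BT\n"                                  # begin text object
        f"/{font_resource} {size:g} Tf\n"       # select font and size
        "3 Tr\n"                                # rendering mode 3: invisible
        f"{x:g} {y:g} Td\n"                     # move to the text position
        f"({escape_pdf_string(text)}) Tj\n"     # show the (hidden) string
        "ET"                                    # end text object
    )


if __name__ == "__main__":
    print(invisible_text_ops("Hello (OCR)", 72, 700))
```

A fragment like this would sit on top of the scanned image in the page content, so a viewer shows only the scan while search still finds the text. It is not a complete PDF writer.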


Comments

The idea works, but LO has bugs and faults in understanding certain characters, as demonstrated in https://bugs.freedesktop.org/show_bug...

rautamiekka ( 2014-11-22 13:04:57 +0200 )

Thanks. But that doesn't really apply to my question. LibO substitutes another font for the one it does not have. I want to know which one it is in that case, but I can't figure it out because it only shows the name of the font from the file, which is not available.

bitnapper ( 2014-11-22 17:53:39 +0200 )

"...because it only shows the name of the font from the file, which is not available." This problem exists in all text SW. Therefore I provided you the link for font identification.

ROSt52 ( 2014-11-24 13:28:58 +0200 )

I know. But that link identifies (or at least tries to identify) the font used in the original document, not the one LibO chose instead. And the fact that all text SW behaves like this doesn't make it better.

bitnapper ( 2014-12-06 12:57:28 +0200 )
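
On the open point in these comments (which font is substituted for the missing one): the thread does not say which operating system is in use, and LibreOffice also has its own replacement table, so the following is only an approximation. On systems that use fontconfig (typically Linux), the `fc-match` tool reports which installed font fontconfig would pick for a requested name. A minimal Python sketch, assuming `fc-match` is installed; the helper names are made up for illustration, and the result may differ from what LibO actually renders.

```python
import shutil
import subprocess
from typing import Optional


def parse_family(fc_match_output: str) -> str:
    """Return the first family name from `fc-match -f '%{family}\\n'` output."""
    lines = fc_match_output.strip().splitlines()
    if not lines:
        return ""
    # fontconfig may list several family names separated by commas
    return lines[0].split(",")[0].strip()


def fontconfig_fallback(requested: str) -> Optional[str]:
    """Ask fontconfig which installed font it would use for `requested`.

    Returns None if fc-match is not available or reports an error.
    This is an assumption-laden stand-in for LibO's own substitution.
    """
    if shutil.which("fc-match") is None:
        return None
    result = subprocess.run(
        ["fc-match", "-f", "%{family}\n", requested],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    return parse_family(result.stdout) or None


if __name__ == "__main__":
    print(fontconfig_fallback("Time New Roman"))
```

If the reported family looks like the font shown in Draw, it can then be selected explicitly for all text boxes instead of relying on substitution.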


Last updated: Nov 22 '14