# Preserve text and vector images when printing to file [closed]

I've got a rather roundabout process (in LO 4.1, but used since 3.x) for keeping EPS images as vector images in a PDF: print to a PS file, then convert that to PDF using `ps2pdf -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode`. The regular PDF export cannot be convinced not to convert them into nicely anti-aliased raster images (whose resolution I have no control over), which of course look crappy in print.
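The conversion step of this workaround could be scripted roughly as follows. This is only a sketch: the helper names `build_ps2pdf_command` and `convert` and the file names `document.ps` / `document.pdf` are made up for illustration, and it assumes Ghostscript's `ps2pdf` is on the PATH. The two `-d` options are the ones quoted in the question.

```python
import shutil
import subprocess


def build_ps2pdf_command(ps_path, pdf_path):
    """Return the ps2pdf argument list with the options quoted in the question."""
    return [
        "ps2pdf",
        "-dAutoFilterColorImages=false",    # do not let Ghostscript pick the image filter
        "-dColorImageFilter=/FlateEncode",  # lossless (Flate) compression for colour images
        ps_path,
        pdf_path,
    ]


def convert(ps_path, pdf_path):
    """Run ps2pdf; raises if the tool is missing or the conversion fails."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf (Ghostscript) not found on PATH")
    subprocess.run(build_ps2pdf_command(ps_path, pdf_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the resulting command line.
    print(" ".join(build_ps2pdf_command("document.ps", "document.pdf")))
```

Keeping the command in a small function makes it easy to apply the same options to every PS file in a print job.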

So far so good, but now (with printing already underway) I have realized that text in the PDFs (and already in the PS files) has been turned into vector objects and can neither be selected and copied nor searched. This only affects some fonts: Bitstream Vera becomes vector drawings, while Linux Biolinum and Helvetia don't. Sadly, I can't change the fonts now, as that would mess up the layout.
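One way to check which documents are affected is to extract the text layer and look for phrases known to be in the document. The sketch below assumes `pdftotext` from poppler-utils is installed; the helper names `extract_text` and `missing_phrases` are hypothetical. Phrases set in a font that was converted to outlines should show up as missing.

```python
import shutil
import subprocess


def extract_text(pdf_path):
    """Return the text layer of a PDF via pdftotext ("-" sends output to stdout)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext (poppler-utils) not found on PATH")
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def missing_phrases(extracted_text, expected_phrases):
    """List the expected phrases that do not occur in the extracted text.

    Whitespace runs and letter case are normalized before comparing,
    because line breaks in the extracted text rarely match the source.
    """
    normalized = " ".join(extracted_text.split()).lower()
    return [
        phrase
        for phrase in expected_phrases
        if " ".join(phrase.split()).lower() not in normalized
    ]
```

For example, `missing_phrases(extract_text("document.pdf"), ["a heading set in Bitstream Vera"])` would return that phrase if the heading is no longer real text (file name and phrase are placeholders).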

=> Does anyone know a way to keep EPS images as vector objects, PNG images as uncompressed bitmaps and characters as characters? Or is there a way to add the lost information to the PDF after the fact? Some sort of OCR process that works on vector images instead of scanned bitmaps?


### Closed for the following reason: the question is answered, right answer was accepted (by Alex Kemp, close date 2016-02-23 21:35:24.785214)


As I am under the impression that you work on Linux, I searched the web for "free OCR linux" and got several hits. This one drew my attention: https://en.wikipedia.org/wiki/Tessera... because it can even handle Japanese. But there are many more.

The ABBYY FineReader that @mahfiaz recommended works very well. I don't (yet) have experience with Tesseract or other OCR software.


Tesseract is a back end for OCR software, and it seems to be made to recognize characters in raster images. One program that builds on Tesseract is gOCR, but that requires images and produces text files. I have no images but a PDF file with glyphs, and I don't want a text file but a new PDF file with proper (searchable) characters.

(2014-05-31 16:24:50 +0200)
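For reference, a raster-based OCR route could look roughly like the sketch below: render a page to an image with `pdftoppm` (poppler-utils), then let Tesseract write a PDF by passing its `pdf` output configuration. The helper name `build_ocr_commands` and all file names are hypothetical, and the 300 dpi resolution is an assumption. Note that a PDF produced this way is a page image with an invisible text layer, so it would make the text searchable but would not keep the EPS images as vector objects.

```python
def build_ocr_commands(pdf_path, image_prefix, page_image, output_base):
    """Return (rasterize, ocr) argument lists for a raster-based OCR pass.

    pdftoppm writes numbered PNG files starting with image_prefix; the exact
    numbering depends on the page count, so the caller passes the name of
    one rendered page as page_image. Tesseract then writes output_base.pdf.
    """
    rasterize = ["pdftoppm", "-r", "300", "-png", pdf_path, image_prefix]
    ocr = ["tesseract", page_image, output_base, "pdf"]
    return rasterize, ocr


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the two command lines.
    for command in build_ocr_commands(
        "document.pdf", "page", "page-1.png", "page-1-ocr"
    ):
        print(" ".join(command))
```

Each list can be handed to `subprocess.run(..., check=True)` once both tools are installed.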

ABBYY FineReader would do that really well if you have spare bucks. Adobe Acrobat Professional would do it somewhat okay (it also costs money). But please write a bug report and ask for improvements on this front.
