How to convert PDF to DOC using Python and LibreOffice?

asked 2019-06-25 17:54:31 +0200

tanvi412

conversion using python

There is no method to convert PDF to .doc in LO.

Ratslinger ( 2019-06-25 18:01:15 +0200 )

Is there any other way to do it?

tanvi412 ( 2019-06-25 18:05:44 +0200 )

Possibly with other tools available on the internet.

Ratslinger ( 2019-06-25 18:14:06 +0200 )

1 Answer

answered 2019-06-25 19:03:22 +0200

Lupp

updated 2019-06-25 19:05:57 +0200

(Python not addressed here!)

Probably OCR with export to a text document format (MSO .doc? LibO .odt?), or a full-grown PDF application, which LibO surely isn't.

A PDF that does not contain an embedded document (that is, one that is not a "hybrid PDF") will be opened in 'Draw' by LibO, and this may give access to textual content - or only to images.

If you only want to get unformatted text content for further manipulation/editing, the "smart" ways many OCR tools use to recreate formatting may not be welcome. You can then try the solution from the attachment to my answer here (inserted by editing there). Please note that it involves a sort coded in Basic which is not optimised for efficiency.

As a Python user you surely know how to use the better standard functions available there for the respective tasks. Code using the API will need a different rewrite.
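To illustrate the remark about standard functions: below is a minimal Python sketch of the sorting step only. It is not a port of the Basic code from the attachment. It assumes (hypothetically) that the text pieces of a page have already been collected as (x, y, text) records, for example from the text shapes Draw creates when opening a PDF. The names Fragment, fragments_to_text and line_tolerance are made up for this example.

```python
from typing import Iterable, List, NamedTuple


class Fragment(NamedTuple):
    """Hypothetical record for one piece of text and its page position."""
    x: float   # horizontal position (left edge)
    y: float   # vertical position (top edge), growing downwards
    text: str


def fragments_to_text(fragments: Iterable[Fragment],
                      line_tolerance: float = 2.0) -> str:
    """Return plain text with fragments arranged in reading order.

    Fragments are sorted top to bottom with the built-in sorted(),
    grouped into lines when their y positions differ by no more than
    line_tolerance from the first fragment of the line, and then each
    line is sorted left to right.
    """
    lines: List[List[Fragment]] = []
    line_y = None
    for frag in sorted(fragments, key=lambda f: f.y):
        # Start a new line when the fragment is clearly lower on the page.
        if line_y is None or frag.y - line_y > line_tolerance:
            lines.append([])
            line_y = frag.y
        lines[-1].append(frag)
    return "\n".join(
        " ".join(f.text for f in sorted(line, key=lambda f: f.x))
        for line in lines
    )


if __name__ == "__main__":
    sample = [
        Fragment(50, 10.5, "world"),
        Fragment(0, 10, "Hello"),
        Fragment(0, 30, "Second"),
        Fragment(40, 30, "line"),
    ]
    print(fragments_to_text(sample))
```

The built-in sorted() is stable and runs in O(n log n), so no hand-written sort is needed. The tolerance value is a guess and would have to be tuned to the units and line spacing of the actual document.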

