Hi everyone,
We use the LibreOffice command line to convert several types of documents into PDF.
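For context, the conversion step looks roughly like the sketch below. The function names and the `soffice` binary name are illustrative assumptions, not our exact script; the flags are the standard headless conversion options.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir, binary="soffice"):
    """Return the argument list for a headless LibreOffice PDF conversion.

    `binary` is an assumption: the executable may be called `soffice` or
    `libreoffice` depending on the installation.
    """
    return [
        binary,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_pdf(source, out_dir, binary="soffice"):
    """Run the conversion and return the path where the PDF is expected."""
    subprocess.run(build_convert_command(source, out_dir, binary), check=True)
    # LibreOffice writes <input stem>.pdf into the output directory.
    return Path(out_dir) / (Path(source).stem + ".pdf")
```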
When we convert DOCX files that contain images to PDF and then open the PDF programmatically, e.g. with Python libraries, the mapping between pages and images comes out wrong.
Concrete example:
One document with 3 pages and 3 images, one image per page. When we convert the document to PDF with MS Word and open it programmatically, we get the exact mapping: {page1: image1, page2: image2, page3: image3}.
When we do the same with LibreOffice, every page lists all three images: {page1: [image1, image2, image3], page2: [image1, image2, image3], page3: [image1, image2, image3]}
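We have not confirmed the cause. One guess is that the pages in the LibreOffice output all point to the same resource dictionary, so a library that lists a page's image resources would report every image on every page. A rough way to test that guess is to check which images each page's content stream actually paints with the PDF `Do` operator. The sketch below assumes the content stream has already been decompressed by whatever PDF library is in use and that image names are given without the leading slash; `images_drawn_on_page` is a hypothetical helper, not part of any library.

```python
import re

# Matches "/Name Do", the PDF operator that paints an XObject such as an image.
# This is a simple pattern scan, not a full content-stream parser, so it can be
# fooled by the same bytes appearing inside strings or inline image data.
DO_OPERATOR = re.compile(rb"/([^\s/\[\]<>(){}%]+)\s+Do\b")


def images_drawn_on_page(content_stream, xobject_names):
    """Keep only the XObject names that the page's content stream paints.

    content_stream: decompressed content stream of one page, as bytes.
    xobject_names: names listed in the page's resources, without the "/".
    """
    used = {
        match.group(1).decode("latin-1")
        for match in DO_OPERATOR.finditer(content_stream)
    }
    return [name for name in xobject_names if name in used]
```

If the guess is right, filtering each page's image list this way should bring back the one-image-per-page mapping. Images nested inside form XObjects would not be found by this simple scan.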
Has anyone experienced a similar situation and found a potential solution?
Thanks and best regards,
Christoph