Why does LibreOffice greatly increase the file size when converting a .doc file to PDF from the command line?

I’m converting a .doc file (2 MB) to PDF using the LibreOffice command line, and the converted PDF is about three times larger, almost 7 MB.

I noticed that the original .doc file has 6 pages and some images. Checking the main.xcd configuration file, the property “UseLosslessCompression” is set to true. That is fine, because I need every image pixel to be preserved, but why did the converted PDF grow so much?

The information provided is insufficient to answer the question definitively: it could be an actual bug. But most likely everything works correctly, and you simply need to learn the difference between lossless and lossy compression algorithms, and the size difference to expect when you take a lossy-compressed image and re-compress it into a lossless format. A threefold increase is not the worst outcome under these conditions.

Again, I am only guessing without seeing the actual data. But you can experiment yourself: open the .DOC and save it as .ODT or .DOCX; both are just .ZIP archives with XML and binary files inside. Since LibreOffice does not change the image bytes in this conversion, you will find the images inside in their original form (most likely JPEG). Convert them to PNG using your favorite raster image editor and check the resulting size. Your results will vary slightly depending on the editor, but not much.
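The "open it as ZIP" step can also be scripted. The following is a minimal sketch using only Python's standard `zipfile` module; the function name `list_embedded_images` and the list of extensions are illustrative assumptions, not part of LibreOffice.

```python
import sys
import zipfile

# Extensions treated as images; an illustrative list, extend as needed.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff")


def list_embedded_images(path):
    """Return (name, size_in_bytes) for each image stored in an ODT/DOCX/ODG file.

    These formats are ZIP archives, so the uncompressed entry size is the
    size of the image file exactly as it was embedded.
    """
    with zipfile.ZipFile(path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if info.filename.lower().endswith(IMAGE_EXTENSIONS)
        ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_images.py document.odt
    for name, size in list_embedded_images(sys.argv[1]):
        print(f"{size:>12,d}  {name}")
```

Running it on the .ODT saved from the original, and later on an archive produced from the exported PDF, gives the two sets of sizes to compare.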

Exemplo Arquivo Aumento Tamanho Conversao.doc

Hello Mike,

Thanks for your answer.

Could you have a look at the attached file and try converting it to PDF, please? I made an example that reaches 743 KB after conversion (original size: 79 KB).

Thanks a lot.

Daniel.

I tried it. And it is just what I mentioned above.

The original file contains two JPEG-compressed images of 28 672 bytes and 24 576 bytes; this is easy to see if you resave the original as ODT and open the latter as a ZIP. After exporting the original document to PDF (with lossless compression and without downsampling), opening the resulting PDF in LibreOffice (Draw), saving it as ODG and opening the latter as a ZIP, you will see two PNG images there: in my testing, one is 236 718 bytes and the other is 215 987 bytes (exact numbers might vary from version to version, but not much). So the size of the images has increased more than 8 times! And those are PNGs stored in the ODG; PDF stores images in its own (similar) format, which appears to be slightly less compressed.
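As a quick check of the arithmetic, the sketch below computes the growth factors from the byte counts quoted above; the variable names are just for illustration. The two images grow by factors of roughly 8.3 and 8.8.

```python
# Image sizes in bytes, as reported in this answer.
jpeg_sizes = [28_672, 24_576]    # original JPEGs, taken from the ODT
png_sizes = [236_718, 215_987]   # lossless PNGs, taken from the ODG

# Growth factor of each image, and of both images together.
ratios = [png / jpeg for jpeg, png in zip(jpeg_sizes, png_sizes)]
total_ratio = sum(png_sizes) / sum(jpeg_sizes)

for jpeg, png, ratio in zip(jpeg_sizes, png_sizes, ratios):
    print(f"{jpeg:>7} -> {png:>7} bytes: {ratio:.2f}x")
print(f"combined: {total_ratio:.2f}x")
```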

So, unless you disable the lossless compression option, such an increase is normal.
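For command-line conversion, one way to turn lossless compression off for a single run is to pass PDF export options together with the filter name. The sketch below only builds the command. It rests on several assumptions to verify against your installation: that `soffice` is on the PATH, that your LibreOffice version accepts JSON filter options after `pdf:writer_pdf_Export:` (to my knowledge this needs 7.4 or later), and that `UseLosslessCompression` and `Quality` are the option names you want. The helper `build_pdf_export_command` is hypothetical.

```python
import json


def build_pdf_export_command(source, outdir=".", lossless=False, jpeg_quality=90):
    """Build (but do not run) a headless soffice command for PDF export.

    Assumption: the JSON filter-option syntax is supported by the installed
    LibreOffice version; check its PDF export command-line documentation.
    """
    options = {
        "UseLosslessCompression": {"type": "boolean", "value": str(lossless).lower()},
        "Quality": {"type": "long", "value": str(jpeg_quality)},
    }
    # Filter name for Writer documents, followed by the JSON-encoded options.
    convert_to = "pdf:writer_pdf_Export:" + json.dumps(options, separators=(",", ":"))
    return ["soffice", "--headless", "--convert-to", convert_to, "--outdir", outdir, source]


if __name__ == "__main__":
    # Print the command for inspection; "example.doc" is a placeholder name.
    print(build_pdf_export_command("example.doc", outdir="out"))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Keep in mind that with JPEG compression the image pixels are no longer preserved exactly, which is the trade-off discussed above.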

Hello @DanielIngrácio, I ran the following test; the results are shown in the image:

image description

  1. your doc file
  2. converted to odt
  3. odt with compressed image
  4. doc to pdf
  5. odt to pdf
  6. compressed odt to pdf

Compare file sizes.

Tip: before converting to PDF, compress the images (select the image, right-click, choose Compress, click Calculate New Size, then OK; if the image becomes distorted, undo the change).

