Converting page by page from the command line

Hi, I’ve been using LibreOffice on Ubuntu to convert docx/xlsx/pptx files to PDF from the command line. However, the machine I plan to deploy it on has very restricted memory (it could probably give ~500 MB at most to LibreOffice). This means I can’t really work on larger files (I tested a ~330 MB pptx file that ended up using 760 MB of memory during conversion).

I was wondering whether it is possible to convert page by page instead, or to specify a page or page range from the command line (into separate PDFs if need be), and whether that would still require LibreOffice to load the full file or could give me some memory savings.

I’m also open to any other suggestions for reducing memory usage. Does LibreOffice have disk caching of some sort, so that it won’t be limited (albeit slower) by the memory limit?


Use another machine with fewer restrictions: Windows, MS Office and a virtual PDF printer.

In to-be-released-this-summer 7.4:

https://vmiklos.hu/blog/pdf-convert-to.html


Thank you for this solution, but I’m not using Collabora or Online, only the basic LibreOffice CLI. Do you know if there are any plans to release it there as well?

I’ve tried the following command on a 3-page document:
libreoffice --headless --convert-to 'pdf:draw_pdf_Export:{"PageRange":{"type":"string","value":"2-"}}' myfile.docx --outdir .
I also tried it with "value":"1-1". In both cases, the entire document was converted instead of only page 2 or page 1.

Why do you use draw_pdf_Export with a DOCX? It should be writer_pdf_Export.
And which version of LO are you using, and on which OS?
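As a rough illustration of the point about matching the export filter to the document type, the sketch below picks a filter from the file extension and builds the --convert-to argument. The function names pdf_filter_for and convert_to_arg are made up for this example; only the three formats mentioned in this thread are handled, and the match is case-sensitive.

```shell
#!/bin/sh
# Pick the PDF export filter that matches the input file's extension.
# Hypothetical helper; covers only the formats mentioned in this thread.
pdf_filter_for() {
    case "$1" in
        *.docx) echo writer_pdf_Export ;;
        *.xlsx) echo calc_pdf_Export ;;
        *.pptx) echo impress_pdf_Export ;;
        *) echo "unsupported file type: $1" >&2; return 1 ;;
    esac
}

# Build the --convert-to argument for a filter name and a page range.
# Hypothetical helper; the JSON shape is the one quoted in the thread.
convert_to_arg() {
    printf 'pdf:%s:{"PageRange":{"type":"string","value":"%s"}}' "$1" "$2"
}

# Example call (assumes LibreOffice 7.4 or later is on the PATH):
#   f=myfile.docx
#   libreoffice --headless \
#       --convert-to "$(convert_to_arg "$(pdf_filter_for "$f")" 2-)" \
#       "$f" --outdir .
```

Building the argument in one place avoids quoting mistakes when the page range changes between runs.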

Thanks for the tip. I couldn’t find the complete documentation page, so I copy-pasted from the example on your link. Using the writer_pdf_Export filter still gives me the same result.

I’m using Ubuntu 22.04, and libreoffice --version returns LibreOffice 7.3.7.2 30(Build:2). Since I installed it recently (I’m working in a Docker container, with apt install libreoffice), I thought I would get the latest version. Sorry, this is not related to the original post, but do you know how I can upgrade it? I’ve tried apt update && apt install --upgrade libreoffice, but I get: libreoffice is already the newest version (1:7.3.7-0ubuntu0.22.04.3)
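Since the page-range option is described above as arriving in 7.4, a script could check the reported version before relying on it. This is a minimal sketch, assuming the version line always has the "LibreOffice MAJOR.MINOR..." shape quoted in this post; supports_page_range is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds (exit status 0) if a "--version" line reports LibreOffice 7.4 or newer.
# Expected input shape: LibreOffice 7.3.7.2 30(Build:2)
supports_page_range() {
    # Extract "MAJOR MINOR" from the version line; empty if it does not match.
    ver=$(printf '%s\n' "$1" |
        sed -n 's/^LibreOffice \([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
    [ -n "$ver" ] || return 2    # unrecognised version string
    set -- $ver                  # now $1 = major, $2 = minor
    [ "$1" -gt 7 ] || { [ "$1" -eq 7 ] && [ "$2" -ge 4 ]; }
}

# Example call:
#   if supports_page_range "$(libreoffice --version)"; then
#       echo "page ranges should be available"
#   else
#       echo "upgrade needed for page ranges"
#   fi
```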

I just followed the instructions at “How To Install LibreOffice on Ubuntu 22.04 – TecAdmin”, plus apt install software-properties-common to get add-apt-repository. Now I have LibreOffice 7.6 and your solution is working!

Thank you for your quick reply, by the way, and again for this nice solution!
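With the page range working, the page-by-page idea from the original question could be sketched as a loop that runs one conversion per page. Because --convert-to names the PDF after the input file, each page goes into its own output directory here so the results do not overwrite each other. The function name convert_pages, the SOFFICE override variable, and the directory layout are assumptions made for this example, and the caller has to supply the last page number. Each iteration starts LibreOffice again, and nothing in this thread establishes that the peak memory use is lower than for a full conversion, so that would need measuring.

```shell
#!/bin/sh
# Convert pages FIRST..LAST of FILE one at a time, each into its own
# directory under OUTBASE. Hypothetical helper, not part of LibreOffice.
# SOFFICE overrides the binary (SOFFICE=echo prints the commands instead).
convert_pages() {
    file=$1 filter=$2 first=$3 last=$4 outbase=$5
    page=$first
    while [ "$page" -le "$last" ]; do
        mkdir -p "$outbase/page-$page" || return 1
        "${SOFFICE:-libreoffice}" --headless \
            --convert-to "pdf:${filter}:{\"PageRange\":{\"type\":\"string\",\"value\":\"${page}-${page}\"}}" \
            --outdir "$outbase/page-$page" "$file" || return 1
        page=$((page + 1))
    done
}

# Example call (3-page Writer document, output under ./out):
#   convert_pages myfile.docx writer_pdf_Export 1 3 out
```

Setting SOFFICE=echo first gives a dry run that shows the exact arguments before any real conversion is attempted.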

I’d be interested to know where the JSON syntax is documented; maybe there are other features that could be useful for my project.

It is documented in help.
