I am running the standard soffice --headless --convert-to png --outdir dirname sourcedir/* command, but around the 250-document mark it simply halts without error. What am I missing? For context, sourcedir contains .docx files only. I monitored the soffice process to see if it’s a memory issue, and it’s not.
A few questions. Answers may help you to find the solution. If not, they are likely to help us to help you.
- What system are you on? macOS, Unix, Linux, or something else? The forward slash in your source path suggests that it is not MS Windows.
- Can you identify the file on which the process halts?
- Is the filename particularly long, does it contain punctuation or other special characters, or is it very similar to another (already processed) filename?
- Is it an empty file (which would not provide any input to create an image)?
I am on Ubuntu 22.04. I cannot verify which file is breaking the process because it’s not processing the directory sequentially, so I can’t tell which document was being processed at the instant the process halts. I have regenerated the documents a handful of times, so I can’t confirm whether an individual document is the issue. However, I just discovered that on a different directory the process also halts exactly at the 250 mark. Could it be an internal limit, and if so, can it be changed, perhaps through a flag?
Exactly 250 or exactly 256?
Either way it could be a limitation in the shell environment, if the file set is built there first and then sent to the command.
Try looping in the shell instead of serving the wildcard to soffice.
for i in sourcedir/*; do soffice --headless --convert-to png --outdir dirname "$i"; done
This should pick one file at a time, I think, and display the command with the name of each file as it is converted. Can’t test it right now (no linux shell handy) so I may err on the syntax. Try it and see! If it doesn’t help, it should at least identify the offending file.
I’d rather suspect an environment limit: e.g., the wildcard expansion limit in Bash.
You could try to use some system tool to check the exact command line that invoked the process (I think that it would have the list of files after the expansion; my idea is that it would already be limited to that number).
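One way to do that check on Linux is sketched below. It reads the argument list of a running process from /proc and counts the entries, so you can compare that count with the number of files in sourcedir. The function name count_args and the example PID are made up for illustration, and finding the right soffice PID (for example with pgrep) is left to you. For reference, getconf ARG_MAX prints the kernel limit on the total size of arguments plus environment; when that limit is exceeded the shell normally fails with "Argument list too long" rather than silently shortening the list, so treat this as a diagnostic, not a confirmed explanation.

```shell
#!/usr/bin/env bash
# Linux only: /proc/<pid>/cmdline holds the process arguments, each terminated by a NUL byte.
count_args() {
    # Turn each NUL into a newline, then count lines (argv[0] is included in the count).
    tr '\0' '\n' < "/proc/$1/cmdline" | wc -l
}

# Limit (in bytes) on the combined size of arguments and environment for a new process.
getconf ARG_MAX

# Hypothetical usage: pass the PID of the running soffice process.
# count_args 12345
```

If the count is far below the number of files in the directory, the list was cut before soffice saw it; if it matches, the limit is more likely inside soffice itself.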
In addition to @keme1’s suggestion: to speed up the loop and avoid the significant overhead of launching and shutting down the whole process for each file, you can launch LibreOffice separately before running the loop. All the processing will then happen in that initially launched instance, without the additional overhead.
Thank you both for your answers. Repeatedly launching soffice processes would take me very long, since I am working with 100k+ documents and running multiple profiles in parallel.
However, I want to ask whether it’s possible to run a --convert-to style conversion through a macro. I’ve had great success looping over huge directories within a macro, so I believe I can do the same for conversion.
Edit: the much simpler solution, of course, was to reorganize my documents into subdirectories of 200 files each and run the wildcard command on each of them.
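The same batching can probably be had without moving files around. The sketch below feeds the .docx files to soffice in groups of at most 200 using find and xargs. The function name convert_in_batches, the SOFFICE override variable and the default batch size are assumptions for illustration; it has not been tested against the halting behaviour described in this thread.

```shell
#!/usr/bin/env bash
# Convert .docx files in batches, one soffice invocation per batch.
# Arguments: source directory, output directory, optional batch size (default 200).
convert_in_batches() {
    local src_dir=$1 out_dir=$2 batch_size=${3:-200}
    # -print0 / -0 keep filenames with spaces or other odd characters intact;
    # -n caps how many filenames are handed to each invocation.
    # SOFFICE is a hypothetical override for the executable path.
    find "$src_dir" -maxdepth 1 -type f -name '*.docx' -print0 |
        xargs -0 -n "$batch_size" "${SOFFICE:-soffice}" --headless --convert-to png --outdir "$out_dir"
}

# Usage (paths as in the original command):
# convert_in_batches sourcedir dirname 200
```

Each batch still pays the soffice startup cost once, so with 100k+ documents that is roughly 500 launches instead of one per file.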