Soffice.bin on Windows command line

LibreOffice v4.0.4.2

Windows 7 Business

I have a series of questions about converting documents to PDF with LibreOffice on the command line.

  1. Can LibreOffice convert a multi-page TIFF image into a multi-page PDF file?

    (edit: No it cannot at this time. Use GhostScript instead.)

I am using the following command:
soffice.bin -headless -convert-to pdf:writer_pdf_Export -outdir files/ files/multi_page.tif

The About Import and Export Filters (help.libreoffice.org/Common/About_Import_and_Export_Filters) page seems to suggest that it can, but I am unable to make it work. The page states "Multipage-TIFFs are allowed when graphics are imported or exported in TIFF format". However, when I try to convert the TIFF file to PDF, only page 1 is saved. All subsequent pages in the TIFF are dropped.

  2. Can soffice.bin report an error code and error message for a failed command?

    (edit: This has been reported as a bug: fdo#67009)

Given the following valid command with an invalid input file:
soffice.bin -headless -convert-to pdf:writer_pdf_Export -outdir files/ files/I_do_not_exist.xls

LibreOffice silently fails with no indication that anything went wrong. No warning/error diagnostic is printed to STDOUT or STDERR, and the Windows **%ERRORLEVEL%** variable remains unchanged. Is there an option to enable verbose error reporting for soffice? Is there a better way of doing this on Windows? Am I going about this the wrong way?

  3. What is the best way to pass a large number of input files without exceeding the command line length?

    (edit: Use file globbing and send batches of files to soffice.bin)

I am batch converting many thousands of files and will certainly run afoul of CMD.exe's 8191-character command line length limit if I try to list all the files out.

From what I can tell I only have the following two options available:
**Call soffice.bin with each individual file one-by-one** (easy to implement, but very slow since LibreOffice will startup and shut down for each file)
**Call soffice.bin with batches of files** (faster than calling it for each individual file, but requires more calculations to ensure the command line character limit is not exceeded).

Are there any other ways to pass input files to soffice.bin?
For example, GhostScript allows you to specify the name of a file with '@' prepended, e.g. '@files.txt', that has all the input files listed on separate lines. Can LibreOffice do something similar?

  4. Can LibreOffice append PDFs it creates to an existing PDF file?

    (edit: No it cannot at this time. Use GhostScript instead.)

The whole *raison d'être* for all this is to take a bunch of dissimilar files (Excel workbooks, JPEG images, text files, TIFF images, Word documents, RTF files) and combine them all into a single PDF.

I can convert the files into single PDFs just fine (with the exception of multipage TIFF images as mentioned above), but I was wondering if there was a way to merge them into a single PDF instead. Can LibreOffice do this, or should I leave that task up to a dedicated program like GhostScript?

Thank you for your time.

I will attempt to address each of your questions in turn.

A1. The quote from the referenced help page is in a context relating to Draw / Impress rather than the conversion of graphics to PDF. While this may not matter at a low level, it is also worth noting that the help pages are not always as accurate as they could be. To complicate your particular case, there are known differences (e.g., fdo#40186, fdo#42871, and fdo#46026, to name just a few) between the headless and UI forms of filters.

Example multi-page TIFFs can be found here (10 pages) and here (2 pages). LO Draw cannot open the former, but it will open the first page of the second graphic. This is probably indicative of limitations in the multi-page TIFF handling ability of Draw and the related graphics filters in LO. The first example file fails silently when attempting to convert to PDF in headless mode (refer A2 below). The second example file comes from bug fdo#63722 and I can confirm that using headless mode to try and convert this particular file to PDF results in only the first page being output.

I would look at alternative tools (e.g., ImageMagick, ghostscript, pdftk, etc.) to handle multi-page TIFFs. In a worst-case scenario (given TIFF variability in production) split them into single-page TIFFs:

$ convert multipage.tif single%d.tif  # this is using the ImageMagick tool "convert"

…then convert them to single-page PDFs, then combine them. Refer A4 below.

A2. The only related bug report I could find about this issue is fdo#59756, although it deals with a specific bind-to-port failure. The overall failure to respond with an error message is similar. I don’t think you are doing anything wrong, but it would not hurt to raise a bug specific to this issue. I would think that parameter checking of the kind in your example (i.e., a missing input file) would qualify as an EasyHack.
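Until soffice.bin reports failures itself, one possible workaround is to check afterwards whether the expected output file exists. The following is only a rough sketch in Python, not anything LibreOffice provides: the helper names `expected_pdf` and `failed_conversions` are made up for illustration, and it assumes the converted file is written to the output directory under the input's base name with a `.pdf` extension.

```python
import os


def expected_pdf(input_path, outdir):
    """Return the PDF path soffice is assumed to write for input_path.

    Assumption: output name = input base name with the extension
    replaced by ".pdf", placed in outdir.
    """
    stem = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(outdir, stem + ".pdf")


def failed_conversions(input_paths, outdir):
    """Return the inputs for which no non-empty PDF exists in outdir."""
    failed = []
    for path in input_paths:
        pdf = expected_pdf(path, outdir)
        if not os.path.isfile(pdf) or os.path.getsize(pdf) == 0:
            failed.append(path)
    return failed
```

Call `failed_conversions(inputs, "files/")` after each soffice.bin run. Note that a stale PDF left over from an earlier run would mask a failure, so it is safer to convert into an empty output directory.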

A3. The only thing I can recommend here for Windows is file globbing e.g., soffice.bin -headless -convert-to pdf:writer_pdf_Export -outdir files/ *.odt. You could even use *.* but you run the risk of overwriting output where there are multiple identically named files of different types (extensions). You could work through directories in this manner or script a solution. The exact approach will depend upon the nature of your files, their arrangement / storage, and your conversion requirements.
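If you do script a solution, the batching approach from the question could look roughly like the sketch below. It is an illustration only: the 8000-character budget is an arbitrary safety margin I picked to stay under the CMD.exe limit, the command prefix is copied from the question, and `batch_files` / `convert_all` are hypothetical helper names.

```python
import subprocess

# Conservative budget below the CMD.exe command-line limit (assumption).
MAX_CMD_CHARS = 8000

# Command prefix copied from the question; adjust for your installation.
BASE_CMD = ["soffice.bin", "-headless", "-convert-to",
            "pdf:writer_pdf_Export", "-outdir", "files/"]


def batch_files(paths, base_cmd=BASE_CMD, max_chars=MAX_CMD_CHARS):
    """Yield lists of paths whose full command line fits within max_chars."""
    base_len = len(subprocess.list2cmdline(base_cmd))
    batch, used = [], base_len
    for path in paths:
        # +1 for the separating space; list2cmdline adds quotes if needed.
        cost = len(subprocess.list2cmdline([path])) + 1
        if batch and used + cost > max_chars:
            yield batch
            batch, used = [], base_len
        batch.append(path)
        used += cost
    if batch:
        yield batch


def convert_all(paths):
    """Start one soffice.bin process per batch instead of one per file."""
    for batch in batch_files(paths):
        # check=False: the exit status is not a reliable failure signal here.
        subprocess.run(BASE_CMD + batch, check=False)
```

A single path that is longer than the budget is still emitted as its own batch, so such a file would need separate handling.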

A4. This really is best handled by separate dedicated tools IMO e.g., pdftk. It is a lot easier than using LO. These types of tools are also smaller and markedly more efficient. Use LO in headless mode when you are dealing with complex documents.
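For example, pdftk concatenates PDFs with its `cat` operation (`pdftk in1.pdf in2.pdf cat output combined.pdf`). A small Python wrapper might look like this. It is a sketch that assumes pdftk is installed and on the PATH, and the function names are invented for illustration.

```python
import subprocess


def pdftk_merge_command(pdf_paths, output_path):
    """Build the argv for: pdftk in1.pdf in2.pdf ... cat output out.pdf"""
    if not pdf_paths:
        raise ValueError("need at least one input PDF")
    return ["pdftk", *pdf_paths, "cat", "output", output_path]


def merge_pdfs(pdf_paths, output_path):
    """Run pdftk; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(pdftk_merge_command(pdf_paths, output_path), check=True)
```

The pages end up in the order the input files are listed, so sort the list first if the order matters. With thousands of inputs the same command line length limit applies, so you may need to merge in stages.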

Please report any bugs you raise back here using the format “fdo#123456”. Thanks.

Thank you for your help. I have gone ahead and opened some bug reports and updated the original question. I will mark this as solved (such as it is).

bug reports:

soffice.bin does not report errors on STDERR - fdo#67009

soffice.bin usage prints to modal window instead of STDOUT - fdo#67010

Thanks for doing this. I have confirmed fdo#67009, but fdo#67010 has been RESOLVED as NOTABUG. Evidently there is no way under Windows to determine how a program was launched, i.e., whether directly via the command prompt, directly via the GUI (by clicking on the executable), or indirectly via the GUI (by clicking on some form of shortcut).