Difference between documents converted using the CLI (--convert-to) and the GUI (Save As)

Hi,

There seems to be a difference between a DOCX document generated from the command line (--convert-to docx:"MS Word 2007 XML") and one generated through the GUI (opening the document in LibreOffice and doing a "Save As" => Microsoft Word 2007/2010 XML (.docx)) when the input document is a DOCX file created using Microsoft Office.
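For anyone who wants to reproduce the command-line side, here is a minimal sketch of the headless conversion driven from Python. It assumes the `soffice` binary is on the PATH; the function names and the input and output paths are hypothetical. Only the `--headless`, `--convert-to` and `--outdir` options and the filter name quoted above are taken as given.

```python
import subprocess
from pathlib import Path


def build_convert_command(input_path, outdir, soffice="soffice"):
    """Build the argument list for a headless DOCX-to-DOCX conversion.

    Assumption: `soffice` is the LibreOffice launcher on the PATH.
    The filter name is passed as a single argument, so no shell quoting
    is needed around "MS Word 2007 XML".
    """
    return [
        soffice,
        "--headless",
        "--convert-to", "docx:MS Word 2007 XML",
        "--outdir", str(outdir),
        str(input_path),
    ]


def convert_headless(input_path, outdir, soffice="soffice"):
    """Run the conversion and return the expected output path."""
    subprocess.run(build_convert_command(input_path, outdir, soffice), check=True)
    # The converted file keeps the input's base name inside outdir.
    return Path(outdir) / (Path(input_path).stem + ".docx")
```

Use an output directory different from the input's directory, so the original file is not overwritten by the converted one.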

The Table of Contents (TOC) is lost when the conversion is done from the command line, whereas it is preserved when done through the GUI. If we extract the documents produced by the two methods and compare their document.xml files, a large portion corresponding to the Table of Contents is missing from the command-line output.
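The comparison described above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and it assumes the TOC is stored as a field whose instruction text starts with "TOC" (in `w:instrText` or in the `w:instr` attribute of `w:fldSimple`), which is how Word usually writes it. A TOC stored some other way would not be counted.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(docx_path):
    """Return the raw bytes of word/document.xml from a DOCX (ZIP) file."""
    with zipfile.ZipFile(docx_path) as zf:
        return zf.read("word/document.xml")


def count_toc_fields(document_xml):
    """Count field instructions that begin with "TOC" (an assumed marker)."""
    root = ET.fromstring(document_xml)
    instructions = [el.text or "" for el in root.iter(f"{{{W_NS}}}instrText")]
    instructions += [
        el.get(f"{{{W_NS}}}instr", "") for el in root.iter(f"{{{W_NS}}}fldSimple")
    ]
    return sum(1 for text in instructions if text.strip().upper().startswith("TOC"))


def compare_toc(gui_path, cli_path):
    """Summarise document.xml size and TOC field count for both outputs."""
    summary = {}
    for label, path in (("gui", gui_path), ("cli", cli_path)):
        xml_bytes = read_document_xml(path)
        summary[label] = {
            "bytes": len(xml_bytes),
            "toc_fields": count_toc_fields(xml_bytes),
        }
    return summary
```

Calling `compare_toc("gui.docx", "cli.docx")` (hypothetical file names) returns the size and TOC field count for each file, which makes it easy to see whether the TOC field survived in one output but not the other.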

My expectation is that the same OOXML import/export filter code path would be executed in both cases, resulting in identical files.
Could anyone shed light on why the files might differ? Is the code that handles the Table of Contents separate from the rest of the OOXML import/export?

regards,
Roopesh Kohad

PS: I am testing the compatibility with MS Office of OOXML files saved in LibreOffice. Hence this type of use case :)

Thanks oweng for your answer!
I have gone ahead and filed a defect in any case here - https://bugs.freedesktop.org/show_bug.cgi?id=70481
If you could confirm it, that would be great!

I have added some tests, confirmed the bug, and marked it as a regression. The results of my testing highlighted problems that vary between v3572 and v4122.

I have resolved fdo#70481 as a duplicate of fdo#67005.

The filters are indeed different. Exactly what those differences are will require a developer to clarify. This question deals with a similar issue (vector graphics in a StarWriter file not being handled in headless mode). You can see in the list of filters that there are distinct UI versions of each. While these files are not the actual parsing code, they do indicate a difference of some sort.