We’re doing some text analysis, and for that I’m using LibreOffice to convert documents from .docx → .txt.
The documents we’re analysing contain cross references. They look completely fine in Writer, but once I export to TXT, each cross reference appears twice.
The text might look like this in Writer:
"In Exhibit (A) we [..]"
But exported to TXT it becomes
"In Exhibit (A)(A) we [..]".
To reproduce, please download the attached .odt file, open it in Writer and choose Save As… → TXT. You will find the string “Anlage (A)(A)” instead of just “Anlage (A)”.
I’m not sure why this is happening. My original sources are contract docx files, and unfortunately this happens very regularly. I cannot reproduce the behaviour by creating a new Writer file and inserting a cross reference; it only happens when a docx is converted to txt. Examining the content.xml file, I can see the reference twice with different styling, but to be honest I don’t know much about the format or the internal document representation.
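In case it helps with looking at content.xml, here is a minimal sketch of how the cross-reference fields could be listed from the .odt. It assumes the references are stored as one of the ODF elements text:bookmark-ref, text:reference-ref or text:sequence-ref (I have not confirmed which one this document uses), and the function name list_reference_fields is just a made-up helper for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF "text" namespace; the element names below are an assumption about
# how the cross references are stored in this particular document.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
REF_TAGS = ("bookmark-ref", "reference-ref", "sequence-ref")


def list_reference_fields(odt_path):
    """Return (element, ref-name, displayed text) for each reference field."""
    # An .odt file is a zip archive; the document body lives in content.xml.
    with zipfile.ZipFile(odt_path) as odt:
        root = ET.fromstring(odt.read("content.xml"))

    found = []
    for elem in root.iter():
        for tag in REF_TAGS:
            if elem.tag == f"{{{TEXT_NS}}}{tag}":
                name = elem.get(f"{{{TEXT_NS}}}ref-name")
                # itertext() also collects text inside nested spans.
                found.append((tag, name, "".join(elem.itertext())))
    return found
```

Calling list_reference_fields("P24_D_double_reference.odt") should show whether the same ref-name really occurs in two separate fields, or whether one field sits next to a plain-text copy of its content.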
Any thoughts about this? Thanks for reading!
P24_D_double_reference.odt (23.5 KB)