How to fix a docx file SaxParseException?

Hi! I followed step 3 from this forum to fix a docx file that I cannot open because of a SAXParseException error. Could you please help me? I have spent many hours on the unzip-and-edit-XML process without success. Thanks in advance.

SCR relazione intermedia 2020 a cura degli OP.docx

If you upload your file, someone may be able to take a look at it. Without inspecting the file, it's almost impossible to give anything other than general advice. (Personally, I use Visual Studio Code with XML plugins to investigate such errors. These tools include XML validation and formatting checks, which generally point to the problematic tags in the XML code.)
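For readers without an editor plugin, the same kind of check can be sketched in Python. This is only a minimal illustration, not the tool used in this thread: the function name `find_xml_errors` is hypothetical, and it assumes that only the `.xml` and `.rels` parts of the archive need checking. It uses the standard-library expat parser, which reports the line and column of the first well-formedness error in each part.

```python
import io
import zipfile
import xml.parsers.expat


def find_xml_errors(docx):
    """Return (part_name, line, column, message) for each malformed XML part.

    `docx` may be a file path or a binary file-like object (hypothetical helper).
    """
    errors = []
    with zipfile.ZipFile(docx) as archive:
        for name in archive.namelist():
            # Assumption: only these parts of the package contain XML.
            if not name.endswith((".xml", ".rels")):
                continue
            parser = xml.parsers.expat.ParserCreate()
            try:
                parser.Parse(archive.read(name), True)
            except xml.parsers.expat.ExpatError as exc:
                message = xml.parsers.expat.ErrorString(exc.code)
                errors.append((name, exc.lineno, exc.offset, message))
    return errors
```

Calling `find_xml_errors("broken.docx")` (the file name is a placeholder) returns an empty list when every part parses, and otherwise tells you which part and which line to open in your editor.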

Hi! Thanks a lot for the fast answer! That would be great, because I have tried several solutions, but this is my first experience with XML. I’ll attach the file to my question.

I’ve attached the file just now, but I do not actually see it here in the chat. Do you?

No - please use: edit and the paper clip symbol.

You may try this tutorial: [Tutorial] How to fix SAXParse error in LibreOff .docx files (View topic) • Apache OpenOffice Community Forum

Thank you too! I’ll try solution 1 or 2 in the meantime. (I actually tried solution 3 before posting here the first time, but without success.)

Hello,

Please check whether the content of the following file matches what you expect:

SCR-relazione-intermedia-2020-a-cura-degli-OP-Fixed.docx

Hope that helps.

If the answer works for you, please consider clicking the check mark (✔) next to the answer. Thanks in advance …

It absolutely helps! Thank you. I wonder how you managed to fix it. To prevent this in the future, I’ll always save documents with other extensions.

I wonder how you managed to fix it:

  • unzipped the document

  • loaded file document.xml into Visual Studio Code

  • formatted the XML

  • the Visual Studio Code XML tool notified me about a format error

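  • found entries like the following:

     <w:rFonts w:cs="Calibri" w:cstheme="minorHAnsi" w:ascii="Calibri;sans-serif" w:hAnsi="Calibri;sans-serif" w:cstheme="minorHAnsi" />

    which contains the attribute w:cstheme="minorHAnsi" twice. XML does not allow the same attribute to appear more than once in one tag, which is what makes the parser fail.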

  • deleted one of the two w:cstheme="minorHAnsi" occurrences in every tag where it appeared twice

  • zipped the content into a new .docx file
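The manual steps above could also be scripted. The following Python code is only a rough sketch under stated assumptions, not the procedure actually used here: the names `drop_duplicate_attributes` and `repair_docx` are hypothetical, the part is assumed to be UTF-8 encoded, and the regular expression is a simplification that does not treat comments or CDATA sections specially. When an attribute is repeated, the first occurrence is kept, which loses nothing in this case because both copies have the same value.

```python
import re
import zipfile

# Simplified pattern for a start tag or empty-element tag with attributes.
TAG_RE = re.compile(
    r'<([A-Za-z_][\w:.-]*)((?:\s+[\w:.-]+\s*=\s*(?:"[^"]*"|\'[^\']*\'))+)\s*(/?)>'
)
ATTR_RE = re.compile(r'([\w:.-]+)\s*=\s*("[^"]*"|\'[^\']*\')')


def drop_duplicate_attributes(xml_text):
    """Keep only the first occurrence of each attribute name within a tag."""
    def fix_tag(match):
        name, attrs, slash = match.groups()
        seen = set()
        kept = []
        for attr_name, value in ATTR_RE.findall(attrs):
            if attr_name not in seen:
                seen.add(attr_name)
                kept.append(f"{attr_name}={value}")
        return f"<{name} {' '.join(kept)}{slash}>"
    return TAG_RE.sub(fix_tag, xml_text)


def repair_docx(src, dst, part="word/document.xml"):
    """Copy the archive `src` to `dst`, cleaning duplicate attributes in `part`."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == part:
                # Assumption: the part is UTF-8 encoded.
                text = data.decode("utf-8")
                data = drop_duplicate_attributes(text).encode("utf-8")
            zout.writestr(item, data)
```

A call such as `repair_docx("broken.docx", "fixed.docx")` (both names are placeholders) writes a new file and leaves the original untouched, so it is easy to compare the two afterwards.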