LibreOffice fails to open table prepared in LibreOffice and saved as a Word .docx file

I have been working on a table using LibreOffice but saved it as a .docx file. I saved the file several times as I worked on it and re-opened it several times also. This morning it would not open and I got the message:
File Error found at File format error found at
SAXParseException: '[word/document.xml line 2]: Attribute w:cstheme redefined
', Stream ‘word/document.xml’, Line 2, Column 50192(row,col).

How can I retrieve the file - or at least some of it? About 6 hours of work have gone into it.

I have just realised I logged a similar event last year with an earlier version of Writer. It’s had a lot of views but no response - so I am not optimistic about this either. I have reverted to using my Netbook computer, which has Microsoft Word installed. I cannot afford to waste time entering stuff in LibreOffice if I can’t retrieve it.

XML parsers are very unforgiving: when a file violates the rules of well-formed XML, the parser rejects it, and editors and programs that rely on that XML will refuse to open it, even for editing to correct the error. You can extract the file that holds the actual content and remove the offending duplicate attribute to see if it opens then. Alternatively, sign up at the Apache OpenOffice Community Forum and post in the discussion thread “[Hint] How did I fix my ODT file” - some of the gurus may be able to help you out.

After searching and reading through many posts, I found that some of the explanations pointed in the right direction, but did not explain what the real problem was. Here’s the problem description.

It doesn’t matter which XML element is causing the problem. The real issue is that an attribute is defined in the XML element, and then that same attribute is defined a second time within the same element. That is what OpenOffice/LibreOffice complain about - the second definition of the attribute. The problem is usually in the styles.xml file (after you’ve unzipped the DOCX file), but may be in the document.xml file too.
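To make the failure concrete, here is a minimal Python sketch. The XML fragment is a made-up example (not taken from the original poster’s file) in which w:cstheme appears twice in one element, and parse_error is just a helper name I picked. It only shows that a standard XML parser rejects such an element outright.

```python
import xml.etree.ElementTree as ET

NS = 'xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"'

# Hypothetical fragment: w:cstheme is defined twice in the same element.
BROKEN = '<w:rFonts ' + NS + ' w:cstheme="minorHAnsi" w:cstheme="minorHAnsi"/>'
# The same element with the second definition removed.
FIXED = '<w:rFonts ' + NS + ' w:cstheme="minorHAnsi"/>'


def parse_error(xml_text):
    """Return the parser's error message, or None if the text is well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


print("broken:", parse_error(BROKEN))
print("fixed: ", parse_error(FIXED))
```

The broken fragment yields an error message, while the fixed one parses cleanly, which is the same behaviour OO/LO shows at a larger scale.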

To fix this:

  1. Unzip the .docx file into a temporary directory
  2. Using an XML Editor open the problematic XML file - I used Netbeans since I have it on my computer;
  3. Look for the offending attribute appearing twice in the same XML element - the error message gives you a hint: in this thread, the attribute is w:cstheme, but it could be anything else (in my case it was wval)
  4. Remove the second attribute with that name from that line - you should then have only one attribute with that name in that XML element
  5. Save the XML file
  6. Zip up the contents of that temporary folder into a .docx file outside the temporary directory
  7. Now open the file again in OO/LO. If you see the error show up again with another attribute name, repeat this process for each offending attribute until it works.
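If there are many duplicated attributes, the steps above can be roughly automated. The following is only a sketch under some assumptions: the XML parts are UTF-8, the duplicates sit in ordinary start tags (not inside comments or CDATA), and keeping the first definition is what you want, as in step 4. The function names drop_duplicate_attributes and repair_docx are my own, not part of any tool.

```python
import re
import zipfile

# One attribute: whitespace, name, '=', quoted value.
ATTR = re.compile(r'''\s+([\w:.\-]+)\s*=\s*("[^"]*"|'[^']*')''')
# A start tag or empty-element tag with its attribute list.
START_TAG = re.compile(
    r'''<([A-Za-z_][\w:.\-]*)((?:\s+[\w:.\-]+\s*=\s*(?:"[^"]*"|'[^']*'))*)\s*(/?)>'''
)


def drop_duplicate_attributes(xml_text):
    """Keep the first definition of each attribute in every start tag."""
    def fix_tag(match):
        name, attrs, slash = match.groups()
        seen = set()
        kept = []
        for attr in ATTR.finditer(attrs):
            if attr.group(1) not in seen:
                seen.add(attr.group(1))
                kept.append(attr.group(0))
        return "<" + name + "".join(kept) + slash + ">"
    return START_TAG.sub(fix_tag, xml_text)


def repair_docx(src_path, dst_path):
    """Copy a DOCX archive, cleaning its XML parts along the way."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.endswith((".xml", ".rels")):
                # Assumption: XML parts are UTF-8 encoded.
                text = data.decode("utf-8")
                data = drop_duplicate_attributes(text).encode("utf-8")
            dst.writestr(info.filename, data)
```

A call such as repair_docx("broken.docx", "fixed.docx") (both file names are placeholders) writes a cleaned copy and leaves the original untouched. If the two definitions of an attribute carry different values, check by hand which one you actually want to keep.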

In the event your DOCX file is password-protected, you have another problem - while OO/LO will prompt you for the password, it will fail with the “File format error”. Trying to unzip the .docx file fails because the content is encrypted so you can’t solve it with the method above. To solve this:

  1. Attempt to open the password-protected DOCX file - it will prompt for the password - supply it
  2. When the “File format error” appears, do NOT click on OK
  3. Open up a Shell window - you ARE doing this on a Linux desktop, aren’t you???
  4. Change directory to the /tmp folder
  5. Execute the ls -ltr command
  6. The last entry will be a directory with a funky name like lu345434p0c.tmp
  7. Go into that directory and list the files
  8. You will find two files with similar-looking names - one is a zero-length file (which you can ignore), but the other is the DOCX file that has your unencrypted content
  9. Copy that file to a temporary folder outside the directory where you are
  10. Create another temporary directory and unzip the contents of the DOCX file you just copied - you are now at Step #1 of the process described above
  11. Fix your offending XML attribute problem and - voila - you have your DOCX file in OO/LO again.
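Steps 4 to 9 can also be sketched in Python instead of typing the shell commands. This assumes, as described above, that the most recently modified directory under /tmp is the one OO/LO just created; if something else wrote to /tmp in the meantime, you will need to pick the directory by hand. All function names here are my own.

```python
import os
import shutil


def newest_subdirectory(parent):
    """Return the most recently modified directory in parent (like ls -ltr), or None."""
    with os.scandir(parent) as entries:
        dirs = [e for e in entries if e.is_dir(follow_symlinks=False)]
    if not dirs:
        return None
    return max(dirs, key=lambda e: e.stat().st_mtime).path


def non_empty_files(directory):
    """List regular files in directory, skipping zero-length ones."""
    with os.scandir(directory) as entries:
        return [e.path for e in entries if e.is_file() and e.stat().st_size > 0]


def copy_decrypted_candidates(tmp_root, dest_dir):
    """Copy the non-empty files from the newest temp directory into dest_dir."""
    newest = newest_subdirectory(tmp_root)
    if newest is None:
        return []
    os.makedirs(dest_dir, exist_ok=True)
    return [shutil.copy2(path, dest_dir) for path in non_empty_files(newest)]
```

Run something like copy_decrypted_candidates("/tmp", "rescued") (the destination name is a placeholder) while the “File format error” dialog is still open, then continue with the unzip-and-fix procedure on the copied file.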

Moral of this story?

NEVER USE DOCX FILES!! Why would you do this when ODT just works??

Hope this helped.

Kidding aside about the Linux desktop, this will likely also work on an OS X desktop - it is based on BSD UNIX and it does have a shell tool anyway! It will probably work on Windows too, but I’m guessing you’ll have to figure out where OO/LO put their temporary files - c:\Temp maybe?