# SAXException: [word/document.xml line 2]: Attribute w:eastAsiaTheme redefined

I have a long document I have been working on for the last year. When I tried to open it on Monday it came up with the following error message:

An error occurred during opening the file. This may be caused by incorrect file contents. The error details are: SAXException: [word/document.xml line 2]: Attribute w:eastAsiaTheme redefined

Proceeding with import may cause data loss or corruption, and application may become unstable or crash. Do you want to ignore the error and attempt to continue loading the file?

When I click to say I don't want to continue, the following message comes up:

File format error found at SAXParseException: '[word/document.xml line 2]: Attribute w:eastAsiaTheme redefined', Stream 'word/document.xml', Line 2, Column 242204(row,col).

Looking on the forums I find I am not the only person faced with this problem. I've tried doing what's suggested, insofar as I can understand the advice, but have not solved the problem. Please help!

I woke in the middle of the night, thinking of other things I should have told you. The document was written using an earlier version of LibreOffice Writer. I’ve tried to find out which version, but haven’t been able to. As advised in one of the posts, I have now updated to 6.1. I have tried to Restore Previous Versions, but none are available; I only realised you needed to tick a box to activate this feature after the error message appeared. I have read other posts from individuals smitten with the same problem, including someone who suffered a similar disaster just before he was due to submit his thesis. I understand the problem is with XML, but that’s something I know nothing about.

Is it possible to submit my corrupted file to someone who could correct this? (When I ignore the warning and open the file, having first saved other copies in the hope of resolving this, I get only 13 pages of the 950+ pages I was hoping to see.) Is it a bug? If so, does the community know how to resolve it? In case it makes a difference, I’m working in Windows 7, and saving in .docx format.


Did you also check https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=80923&p=404588#p404588? If you are not familiar with zip, unzip and editing .xml files, you may need to find a trustworthy person to do that with your original file, since all solutions I've read about so far require a modification of the document.xml file stored in the zip (a docx is a zip file).

And btw, this question and issue is another reminder to only use the native odt format in LibreOffice Writer.

( 2019-06-06 19:03:21 +0200 )
( 2019-06-06 19:06:56 +0200 )

And please note that, unfortunately, the only useful part of that tutorial is my comment there. See the regex in that comment.

( 2019-06-07 07:46:49 +0200 )

Dear Mike, I don't understand your comment. Which comment is it you are referring to? And what is a regex?

( 2019-06-07 08:05:12 +0200 )

I’ve just found out how to save my document as a .zip file, which enables me to view all the XML coding. I’m concerned that if I do this with my 900+ page file, it will be so long that finding the problem-causing glitch will be extremely difficult. Can I do a word search with the .xml version? Does ‘[word/document.xml line 2]’ help narrow things down? Or Column 242204(row,col)? Or could I just search for ‘w:eastAsiaTheme’, and is it just a question of removing that line, or do I need to ‘redefine’ it, and if so, how?

( 2019-06-07 08:21:20 +0200 )

@Brazilnut: sorry for being unclear; I was referring to the tutorial mentioned by @Opaque just above. Your document can be fully restored, without any loss, only by manually editing the docx; otherwise some part of the document might be lost.

In the mentioned tutorial, I wrote a comment about a proper regular expression that allows this error to be fixed automatically.

( 2019-06-07 08:21:33 +0200 )

Sorry to be a pest, but this is all very new to me. What do you mean by proper regular expression? Can I also refer you to the second paragraph I've this morning added to my initial enquiry, which may or may not help, and also my comment about xml editing I've added most recently. Any help very gratefully received!

( 2019-06-07 08:28:12 +0200 )

I refer to this comment from the tutorial:

So please install Notepad++ to edit your word/document.xml unpacked from the archive; search for `(\<[^>]+)([\w]+:[\w]+="[^"]+")([^>]+)\2`, and replace with `$1$2$3`, making sure that regular expressions option is active in the replacement dialog. Then put the edited xml back into the archive, replacing old file there, and rename the archive to .docx; then open it.

Or you may provide your document here, I may do that for you.

( 2019-06-07 09:21:56 +0200 )

If you could that would be absolutely great: I'm wary of venturing too deeply into unknown waters. How would I go about getting a copy of my document to you? Preferably not making it viewable to everybody else?

( 2019-06-07 10:05:02 +0200 )

You may send it to mikekaganski@hotmail.com

( 2019-06-07 10:46:18 +0200 )

## 1 Answer

To have the question answered: The problematic file was emailed to me, and fixed by editing the XML source (word/document.xml), using regex `(\<[^>]+)([\w]+:[\w]+="[^"]+")([^>]+)\2` replaced by `$1$2$3`.

The problematic file was generated by LibreOffice 5.4.3.2 (WinX86_64). Resaving the fixed file using 6.2.4.2 (x64) does not reproduce the invalid XML, so the bug has likely been fixed in the meantime. I also advised always storing files in the native ODF format, and only exporting to DOCX when sending them to people unable to read ODF.
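
For anyone who would rather script the repair than do it by hand in Notepad++, here is a minimal Python sketch of the same idea. It is not the procedure actually used on the file above. The pattern is a Python spelling of the regex quoted in this thread (`$1$2$3` becomes `\1\2\3`). The function names `dedupe_attributes` and `fix_docx` and the file names in the usage note are hypothetical. The sketch assumes word/document.xml is UTF-8 encoded, and it only removes a repeated attribute whose value is identical to the first copy. The loop that re-runs the replacement until nothing changes is an addition of mine, for tags that contain more than one duplicate.

```python
import re
import zipfile

# Python spelling of the regex quoted above. The backreference \2 requires
# the repeated attribute to have the same name AND the same value.
DUPLICATE_ATTR = re.compile(r'(<[^>]+)(\w+:\w+="[^"]+")([^>]+)\2')


def dedupe_attributes(xml: str) -> str:
    """Remove repeated copies of an identical attribute inside a tag."""
    while True:
        # Each pass removes at most one duplicate per tag, so repeat
        # until a pass makes no substitution.
        fixed, count = DUPLICATE_ATTR.subn(r"\1\2\3", xml)
        if count == 0:
            return fixed
        xml = fixed


def fix_docx(src, dst, member="word/document.xml"):
    """Copy the archive src to dst, repairing only `member` on the way."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == member:
                # Assumption: the XML part is UTF-8 encoded.
                text = data.decode("utf-8")
                data = dedupe_attributes(text).encode("utf-8")
            zout.writestr(item, data)
```

A call would look like `fix_docx("broken.docx", "fixed.docx")`. Writing to a new file leaves the damaged original untouched, so nothing is lost if the result still does not open.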
