SAXException attribute w:themeShade redefined

Hi,
My mum was working on a document (docx) that had originally been created using Word. She has made significant changes to the document and is now unable to open the document because of the error in the title of this question.

I have read that it is possible to open the zip of the document and interrogate the contents of the document.xml. I have been able to extract this, but am unable to locate the issue as I am not too familiar with xml files.

It seems as though there will be multiple definitions of w:themeShade… is that correct? And is it possible to fix this?

Is the document beyond recovery?

Thanks.

The error is easy to fix when the document is at hand; see my comments on [Tutorial] Fixing .docx files with SAXParse error (only the regex for fixing duplicates there is relevant; please ignore the “tutorial” itself).

Thanks Mike. I appreciate your response. I followed through option 1 in your link and was successfully able to get AOO to open the file without errors. However, there are only 18 pages and my Mum is certain that there should be 63.

Oh. I explicitly asked you to ignore the “tutorial” itself, and only check the comment about regex… and you write about option 1. :frowning:

So you did. Sorry Mike, I didn’t scroll down far enough on the page that you linked. I have now used the regex solution, i.e.:

Replace:
(<[^>]+)([\w]+:[\w]+="[^"]+")([^>]+)\2
With:
$1$2$3

I did a search and replace all and it found and replaced 11 occurrences.

I then added the modified file back into the zip. The document now fails to open with a different error: w:themeColor redefined, instead of w:themeShade.

Should I just do a search and replace of all <[^>]+> to just extract the text?

Please repeat the replacement until there are no more occurrences. I cannot help you more without the document at hand - just the general suggestions.

Update:
I tried the search and replace again. But this time without the “Wrap around” option enabled. It again found 11 occurrences, but this time it has worked! Thank you so much Mike! My Mum thought everything she had been working on for the last few months had been lost.

How do I mark your original response as the answer to the question?

Well - just close the question as “right answer is accepted” or something like that. Glad it helped :slight_smile:

OK. Thanks I’ll do that. I think that it worked the second time as I hit “replace all” repeatedly until it said no more occurrences as you suggested. So thanks again. And sorry for not reading your original comment properly ;).

Answer was provided by Mike.

I needed to perform the following search and replace using Regex in Notepad++ (repeatedly until “no more occurrences” were found) on the extracted document.xml and then add the edited xml back into the zip container.

To remove the most-often encountered problem: redefined attributes - it’s better to replace regex

(<[^>]+)([\w]+:[\w]+="[^"]+")([^>]+)\2

with

$1$2$3

in document.xml (works with Notepad++). The modified file then needs to be added back to the OOXML (using a ZIP application like e.g. 7-zip), and then it will most probably open OK.
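
For anyone who would rather script this than click “Replace All” repeatedly, here is a minimal Python sketch of the same steps. It uses the regex from the thread unchanged and keeps replacing until nothing matches. The function names `remove_duplicate_attributes` and `repair_docx` are made up for illustration, and the sketch assumes the main part is stored as word/document.xml in UTF-8, which is the usual layout for a Word-created .docx but may not hold for every file. It writes a new file rather than touching the original, so keep the damaged document as a backup.

```python
import re
import zipfile

# Pattern copied from the thread: a tag that contains the identical
# prefix:name="value" attribute twice.
DUPLICATE_ATTR = re.compile(r'(<[^>]+)([\w]+:[\w]+="[^"]+")([^>]+)\2')


def remove_duplicate_attributes(xml: str) -> str:
    """Apply the replacement repeatedly until no occurrences remain.

    One pass removes at most one duplicate per tag, so a tag with two
    different duplicated attributes needs more than one pass.
    """
    while True:
        xml, count = DUPLICATE_ATTR.subn(r"\1\2\3", xml)
        if count == 0:
            return xml


def repair_docx(src_path, dst_path) -> None:
    """Copy the .docx at src_path to dst_path, cleaning document.xml."""
    with zipfile.ZipFile(src_path) as src, \
         zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            # Assumption: the main document part lives at this path
            # and is encoded as UTF-8.
            if item.filename == "word/document.xml":
                text = data.decode("utf-8")
                data = remove_duplicate_attributes(text).encode("utf-8")
            dst.writestr(item, data)
```

Usage would look like `repair_docx("broken.docx", "repaired.docx")`, with both file names being placeholders. Note that the regex only catches duplicates whose values are identical, just as it does in Notepad++.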