Recovering ODT (XML error upon re-opening)

Hi, I have a few .odt files that I saved with FocusWriter and cannot open again. I know the error probably originated with FocusWriter, but I still need to recover my work.

When I open one in FocusWriter I get an error saying “Premature end of document” and I can’t see any of the text.
When I open it in LibreOffice I get an error saying: “Read-Error.
Format error discovered in the file in sub-document content.xml at 46,713(row,col).”

The same problem with another document produces this: “Read-Error.
Format error discovered in the file in sub-document content.xml at 20,67(row,col).”
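The row and column in these messages point at the first place the XML parser gave up. Here is a minimal sketch, assuming Python 3 is available, that reproduces such a position with the standard expat parser and shows the text around it. The function name `locate_xml_error` is made up for illustration, and the column it reports may differ slightly from the one LibreOffice prints.

```python
import zipfile
from xml.parsers import expat


def locate_xml_error(odt_path, member="content.xml", context=40):
    """Return None if the member is well-formed XML,
    otherwise a tuple (line, column, snippet) for the first parse error."""
    # An .odt file is a zip archive; read the XML member straight out of it.
    with zipfile.ZipFile(odt_path) as archive:
        data = archive.read(member)

    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except expat.ExpatError as err:
        # expat reports a 1-based line number and a 0-based column offset.
        lines = data.split(b"\n")
        index = min(err.lineno - 1, len(lines) - 1)
        bad_line = lines[index]
        start = max(0, err.offset - context)
        snippet = bad_line[start:err.offset + context]
        return err.lineno, err.offset, snippet.decode("utf-8", errors="replace")
    return None
```

Calling `locate_xml_error("document.odt")` (with your own file name) returns `None` for a healthy file. An error reported at the very end of content.xml usually means the file was cut off while saving.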

I don’t know a ton about the .odt format, but I assume it works similarly to MS Word in that there’s basically a text file and an XML file which takes care of formatting. And for whatever reason, FocusWriter seems to be messing up its own XML files when I save.

I’m using Linux Mint 17 with KDE (so Kate is my default text editor). Shouldn’t there be a way to open the file in a text editor and at least grab my raw text?

I found this interesting thread online : [Hint] How did I fix my ODT file (View topic) • Apache OpenOffice Community Forum

Their method didn’t work for recovering the documents, but I was able to pull my text out of content.xml in Kate. I had to do one thing a little differently though: after copying the files into a new directory, I had to rename the file from document.odt to document.zip, because my terminal program wouldn’t run the unzip command otherwise.
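For anyone who would rather not pick the text out by hand, here is a rough sketch in Python 3 of the same idea: read content.xml from the archive and strip the tags with regular expressions instead of an XML parser, so it still works when the XML is cut off. The function name `rough_text_from_odt` is made up for illustration. It assumes the usual `text:p` / `text:h` paragraph tags, and all formatting is lost.

```python
import html
import re
import zipfile


def rough_text_from_odt(odt_path):
    """Best-effort plain text from content.xml, even if the XML is broken."""
    with zipfile.ZipFile(odt_path) as archive:
        raw = archive.read("content.xml").decode("utf-8", errors="replace")

    # Skip the style declarations if the body marker is present.
    start = raw.find("<office:body")
    if start != -1:
        raw = raw[start:]

    # Paragraph and heading ends become newlines; keep breaks and tabs.
    raw = re.sub(r"</text:(?:p|h)>", "\n", raw)
    raw = re.sub(r"<text:line-break\s*/>", "\n", raw)
    raw = re.sub(r"<text:tab\s*/>", "\t", raw)

    # Drop every remaining tag, including a truncated one at the very end.
    raw = re.sub(r"<[^>]*>?", "", raw)

    # Turn entities such as &amp; back into plain characters.
    return html.unescape(raw)
```

Something like `print(rough_text_from_odt("document.odt"))` dumps the text to the terminal. Python’s zipfile module does not care about the file extension, so no renaming to .zip is needed.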

If anyone has a proper solution I’d still really appreciate it, but here’s the thread I linked to, copied for posterity:

I had a problem with a 20-page ODT
file (text and pictures). The problem was:
when I tried to open it, I got the message
“Error reading file” under OO 2.3.1
(both Linux and Windows versions).

It took some hours to figure out how
to fix it, so I want to share my
solution with other OO users.

First, let’s call our ODT file that won’t open “bad.odt”.

1. Make a backup FIRST → “$ cp bad.odt bad_original.odt”
2. Make a new directory → “$ mkdir repair”
3. Copy bad.odt to the repair directory → “$ cp bad.odt repair”
4. Change into the repair directory → “$ cd repair”
5. Unzip bad.odt → “$ unzip bad.odt”
6. After unzipping you get a bunch of files and directories under repair. Find content.xml and open it with your favorite text editor → “$ kate content.xml”
7. Use the “find” function to check whether you have the XML tag “<office:automatic-styles>” (somewhere at the beginning of the document) and the XML tag “</office:automatic-styles>” (somewhere in the middle of the document). If you have them, delete them and all data between them. Be sure that you don’t delete more or less!
8. Save content.xml (keep the original name and place!)
9. Zip the extracted data back into one ODT document → “$ zip -r ./bad_repaired.odt ./*”
10. Try to open the repaired document → “$ ooffice ./bad_repaired.odt”

… and if you are lucky, then OO is able to open your document again.
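The procedure above can also be sketched in Python 3 for anyone who prefers not to edit content.xml by hand. This is only an illustration of the same idea, not a tested repair tool. The function name `strip_automatic_styles` is made up. It assumes the broken part really is inside the automatic-styles block, and it writes the “mimetype” entry first and uncompressed, which is the usual ODF packaging convention.

```python
import re
import zipfile


def strip_automatic_styles(src_path, dst_path):
    """Copy an ODT archive, removing the automatic-styles block from content.xml."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        names = src.namelist()
        # Stable sort that moves "mimetype" to the front and keeps the rest in order.
        names.sort(key=lambda name: name != "mimetype")
        for name in names:
            data = src.read(name)
            if name == "content.xml":
                # Delete the opening tag, the closing tag and everything between.
                data = re.sub(
                    rb"<office:automatic-styles\b.*?</office:automatic-styles>",
                    b"",
                    data,
                    count=1,
                    flags=re.DOTALL,
                )
            # Store "mimetype" uncompressed; compress everything else.
            if name == "mimetype":
                compression = zipfile.ZIP_STORED
            else:
                compression = zipfile.ZIP_DEFLATED
            dst.writestr(name, data, compress_type=compression)
```

For example, `strip_automatic_styles("bad.odt", "bad_repaired.odt")` leaves bad.odt untouched and writes the stripped copy next to it. As with the manual method, the styles are lost.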

Well, I got back my text and pictures,
but the price was no styles (font
size, bold, headings, etc.).

If your document does not open and
you get a message like “Format error
discovered in the file in sub-document
content.xml at …”, then you broke the
XML structure and must go back to
“STEP ONE” and try to be more careful
when deleting things.

PS1: If you get CRC errors when unzipping the ODT, then my solution probably can’t help you.
PS2: I also tried inserting “bad.odt” into a new document, but still got the “Error reading file” message.
PS3: The “META-INF/manifest.xml file” trick did not help either.

If someone wants to investigate my
broken ODT file, it can be
downloaded from →
http://adsl213.pointclark.net/Eksam.odt

Have you searched ask.libreoffice.org? For example, the question and answers at “I can not open my spreadsheet it comes up with this error: Read-Error. Format error discovered in the file in sub-document content.xml at 2,72157(row,col)” might help you. There are several more with helpful links.

What are you doing here? I don’t know a better place to get help with your problem than that forum, and especially the topic you link to. It’s one long list of people asking for help getting their damaged files fixed. Quite often the file is fixed in the way you describe. Sometimes part of the file got truncated, and in that case you can’t recover a lot. Without access to the file it’s impossible to say how to fix it; it all depends on its state.

See the detailed instructions given in [Tutorial] Fixing .docx files with SAXParse error