Writer file corrupt, UTF-16 converts to wrong symbols

Hi folks,

my .rtf file stopped opening in Writer, popup says: “General Input/Output error”

I opened it in WordPad — only 1 page out of 20 is left.

Still, the file weighs 400KB, so I opened it in Notepad only to find 120 pages of text like this:

\u1058’3f\u1072’3f\u1084’3f \u1085’3f\u1072’3f \u1075’3f\u1086’3f\u1088’3f\u1110’3f, \u1085’3f\u1072’3f \u1052’3f\u1072’3f\u1082’3f\u1110’3f\u1074’3f\u1094’3f\u1110’3f, \u1079’3f\u1072’3f\u1094’3f\u1074’3f\u1110’3f\u1083’3f\u1072’3f \u1082’3f\u1072’3f\u1083’3f\u1080’3f\u1085’3f\u1072’3f

Looks like UTF-16, so I tried converting it, but for some reason it converts to Myanmar letters. (The lost text is in Cyrillic.)

Do you have any ideas what else can be done?

Thank you.

The text you pasted is valid RTF markup. If you put it into the simplest possible RTF file:

{\rtf1
\u1058\'3f\u1072\'3f\u1084\'3f \u1085\'3f\u1072\'3f \u1075\'3f\u1086\'3f\u1088\'3f\u1110\'3f, \u1085\'3f\u1072\'3f \u1052\'3f\u1072\'3f\u1082\'3f\u1110\'3f\u1074\'3f\u1094\'3f\u1110\'3f, \u1079\'3f\u1072\'3f\u1094\'3f\u1074\'3f\u1110\'3f\u1083\'3f\u1072\'3f \u1082\'3f\u1072\'3f\u1083\'3f\u1080\'3f\u1085\'3f\u1072\'3f}

and open it in any RTF-capable editor, you will see

Там на горі, на Маківці, зацвіла калина

If you upload the file, I could take a look. By the way: which LO version do you use?

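Some background: in RTF, Unicode characters are encoded as \uN, where N is the decimal code point, not the hexadecimal one. So, when decoding \u1058, you need to look not for U+1058 (which is in the Myanmar block, hence the Myanmar letters), but for U+0422 (1058 decimal = 0x422, Cyrillic capital Т). The \'3f after each escape is just the fallback character "?" for readers that do not understand \uN.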
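
If the file cannot be repaired as RTF at all, the same decoding can be done by hand. Below is a minimal Python sketch, not a full RTF parser: the function name decode_rtf_unicode is made up for illustration, and it assumes the text consists only of \uN escapes, each optionally followed by a single \'hh fallback byte (it also tolerates the curly apostrophe the forum put into the paste above). Any other RTF control words would be left in the output untouched.

```python
import re

# One RTF Unicode escape: \uN, where N is a (possibly negative) DECIMAL number,
# optionally followed by a fallback byte such as \'3f ("?").
# The class ['’] also accepts the curly apostrophe produced by forum software,
# and the backslash before it is optional for the same reason.
ESCAPE = re.compile(r"\\u(-?\d+)(?:\\?['’][0-9a-fA-F]{2})?")


def decode_rtf_unicode(text: str) -> str:
    """Replace every \\uN escape (and its fallback byte) with the real character.

    Hypothetical helper for illustration only; it handles characters in the
    Basic Multilingual Plane and does not combine surrogate pairs.
    """

    def repl(match: re.Match) -> str:
        code = int(match.group(1))  # decimal, not hexadecimal
        if code < 0:
            # RTF stores N as a signed 16-bit value, so code points
            # above 32767 are written as negative numbers.
            code += 65536
        return chr(code)

    return ESCAPE.sub(repl, text)


sample = r"\u1058\'3f\u1072\'3f\u1084\'3f"
print(decode_rtf_unicode(sample))  # Там
```

To try it on the damaged file, read the file as text (the escapes themselves are plain ASCII), pass the contents to the function, and write the result out as UTF-8. Work on a copy of the file, not the original.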

I wonder how many times this needs to be said: NEVER work in foreign file formats. Only after you have finalized your work, export if necessary.

Same question posted on the AOO forum

Mike Kaganski: thank you very much, I was able to save almost the entire file!

gabix: I was helping a theatre director who is not into computers

robleyd: I was in crisis so I posted everywhere, hope I didn’t start a turf war

everyone: thank you