Messy character encoding in .odt moving from Ubuntu to Windows [closed]

asked 2013-05-28 14:04:06 +0100

buiestru

updated 2015-08-24 17:46:18 +0100

Alex Kemp

There's something peculiar that LibreOffice (Writer) does to .odt documents in Ubuntu (12.04, 13.10) which end up completely garbled when opened in Windows 7. Here's a sample:

19, n. 9: référence à Kant chez Husserl et Sartre (voir ch. 3)

Saving to .doc/.docx usually gives text that is legible and editable on both platforms. Going from Linux to Windows almost invariably messes up the encoding in .odt files, but the reverse is not true: Windows-authored .odt files fare well on both platforms. Setting the document language makes no difference: French and Romanian, which I use most often besides English, both come out garbled regardless of what Writer is told they are. I should mention that basic Latin characters in English render correctly. But I am trilingual and find this functionality quite basic and strangely lacking for an open-format, cross-platform office suite. Playing around in Notepad++, which can switch encodings, I realised that Ubuntu-authored .odt files are rendered in Windows as ANSI instead of UTF-8.

Most recently tested with LibreOffice on Ubuntu 13.10 (32-bit) and LibreOffice on Windows 7 Enterprise (64-bit).


  1. Is there any way to control character encoding in Writer?
  2. Why do things go well cross-wise with .docx? Is it an .odt thing?
  3. Is it Linux/Debian/Ubuntu specific, since it happens only one way (Linux to Windows) and not the other?

To summarize:

.odt files created in Ubuntu turn out illegible on Windows 7. Everything beyond basic Latin (e.g. Romanian, French) reads as nonsense, similar to UTF-8 rendered as ANSI. Strangely, it happens only one way, from Linux to Windows, and only with .odt.
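
For readers who want to see what "UTF-8 rendered as ANSI" looks like, here is a minimal Python sketch. It assumes that "ANSI" means Windows-1252 (`cp1252`), which is the usual case on a Western-European Windows install but may differ on other setups. The function names are made up for this illustration.

```python
def utf8_read_as_ansi(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Reverse the mistake: recover the original bytes, then decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    original = "référence à Kant"
    garbled = utf8_read_as_ansi(original)
    print(garbled)                      # accented letters become two-character sequences
    print(repair(garbled) == original)  # the damage is reversible in this case
```

Plain ASCII letters come through unchanged, which matches the observation that English text looks fine. A few byte values are undefined in `cp1252`, so some characters would raise an error here instead of producing garbage.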


link:original .odt

link:this is how it should have looked (Ubuntu)

link:this is how it looks (Windows)

Closed for the following reason: the question is answered, right answer was accepted by Alex Kemp
close date 2015-11-02 00:28:35.581233


@buiestru - you write: "Saving to .doc/.docx usually gives perfectly cross-platform legible and editable text." Actually, I get much better results using RTF from Word XP or 2010 to LO 3.6+. Worth trying? (This isn't an answer to your question, of course: just a comment.)

David ( 2013-05-28 20:50:34 +0100 )

Thanks, but that still defeats the purpose of .odt. Moreover, I don't use Microsoft Office; this is strictly a LibreOffice to LibreOffice cross-platform thing.

buiestru ( 2013-05-29 00:44:44 +0100 )

answered 2013-05-29 01:09:50 +0100

oweng

updated 2013-05-29 03:41:21 +0100

This is not an LO issue; it is a platform encoding issue. Under GNU/Linux, what is the output of locale -a and locale charmap in the terminal? The former lists the available locales and the latter shows the character encoding of the current locale. It sounds like your Ubuntu system is not using a UTF-8 encoding, and it should be. Further information here.

EDIT: To be clear, the OP has indicated that the situation in this question is the cause of the problem reported here.
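
As a rough cross-check of the locale question above, the same information can be read from Python. This is only a sketch: `is_utf8` is a hypothetical helper name, and `locale.getpreferredencoding(False)` reports what Python sees for the current process, which normally matches locale charmap but is not guaranteed to.

```python
import codecs
import locale


def is_utf8(encoding_name: str) -> bool:
    """Return True if the given encoding name is an alias of UTF-8."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        # Unknown encoding names are treated as "not UTF-8".
        return False


if __name__ == "__main__":
    current = locale.getpreferredencoding(False)
    print("Preferred encoding:", current)
    print("UTF-8 locale:", is_utf8(current))
```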

It's UTF-8 all the way. You have actually answered both of my .odt related questions with the observation that the sample .odt was in fact plain text. It was that easy. Learned my lesson. With properly created .odt files everything works as it should, both in formatting and across platforms. I'm quite embarrassed right now :) Thanks!

buiestru ( 2013-05-29 02:25:53 +0100 )
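
For anyone who wants to test whether a file with an .odt extension is a real OpenDocument package or just plain text, the following Python sketch may help. It relies on the fact that an .odt is a ZIP archive, and it assumes the archive contains a `mimetype` entry, as files written by LibreOffice do. `looks_like_odt` is a name invented for this example.

```python
import zipfile

ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def looks_like_odt(path: str) -> bool:
    """Return True if the file is a ZIP archive whose mimetype entry is ODT."""
    if not zipfile.is_zipfile(path):
        # A plain-text file renamed to .odt ends up here.
        return False
    with zipfile.ZipFile(path) as archive:
        try:
            declared = archive.read("mimetype")
        except KeyError:
            # A ZIP without a mimetype entry is not treated as ODT.
            return False
    return declared.decode("ascii", errors="replace").strip() == ODT_MIMETYPE
```

A plain-text file that fails this check is decoded by whatever opens it using the platform's default encoding, which would explain text that looks fine on Ubuntu and garbled on Windows.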

answered 2013-05-28 19:16:06 +0100

mahfiaz

Could you please make a simple example document on your Ubuntu machine and attach it here? I have never encountered such behaviour.

How old is your installation on Ubuntu? Maybe deleting the user profile (and thus all settings) helps? (Please don't delete the settings folder, just rename it so we can later find out what causes this.)

Do you have a customized default template file?

