What character set to use when saving an ods as a csv in English?

I tried to save a single-sheet ods file as a csv file on Windows 10, but when I opened the csv in LibreOffice Calc the contents were replaced with symbol characters in a single row. I thought perhaps the ods was corrupted, but the same thing happened when I created a new spreadsheet. Reinstalling LibreOffice with the latest version, 6.2.8, didn’t fix the problem. Do I need to change the character set to something other than Western Europe (Windows-1252/WinLatin 1)?

Calc does not do an extensive scan of the content to determine the data type, but trusts the filename extension to determine the storage format. This means that the filename extension must be consistent with the actual storage format used.

Did you select Text, CSV from the file type dropdown when you saved? Was the automatic extension box ticked, or did you otherwise ensure that extension was “.csv”?

It is not sufficient to alter the filename extension to “.csv” and assume that storage format follows extension, which some users do.
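To check whether a “.csv” file is really a renamed spreadsheet, something like the following may help. It is only a sketch: it relies on the fact that an .ods file is a ZIP container, and the helper name and the sample files are made up for illustration.

```python
import os
import tempfile
import zipfile


def looks_like_zip_container(path):
    """Hypothetical helper: True if the file is really a ZIP archive (as .ods files are)."""
    return zipfile.is_zipfile(path)


workdir = tempfile.mkdtemp()

# A genuine CSV: plain text, one record per line.
real_csv = os.path.join(workdir, "real.csv")
with open(real_csv, "w", encoding="utf-8", newline="") as f:
    f.write("a,b\n1,2\n")

# A stand-in for a spreadsheet that was only renamed to .csv
# (a minimal ZIP archive, not a complete .ods document).
renamed = os.path.join(workdir, "renamed.csv")
with zipfile.ZipFile(renamed, "w") as z:
    z.writestr("content.xml", "<placeholder/>")

print(looks_like_zip_container(real_csv))
print(looks_like_zip_container(renamed))
```

If the check reports a ZIP container for a file named “.csv”, the file was probably renamed rather than exported through Save As.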

If you saved a CSV file with Windows-1252 encoding and reading it back produces garbage, there are at least two possible causes:

  • the data contained characters beyond plain English Latin (ASCII) that either cannot be represented in Windows-1252, or were read back with a different encoding than the one used for saving
  • a multi-byte encoding such as UTF-16 was chosen when reading the file
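Both causes can be reproduced outside Calc. The following is a rough sketch in Python using made-up sample data; it only imitates the byte-level effect and says nothing about how Calc itself handles the import.

```python
# Sample data is made up for illustration.
text = "name,price\nwidget,10\n"
raw = text.encode("cp1252")  # the bytes as saved with Windows-1252

# Same encoding in and out: the text survives.
same = raw.decode("cp1252")

# Read back as UTF-16 (little-endian, no BOM): each pair of ASCII bytes is
# fused into one unrelated 16-bit character, so commas and newlines vanish.
garbled = raw.decode("utf-16-le", errors="replace")

# A non-ASCII character written as Windows-1252 but read as UTF-8.
accented = "café".encode("cp1252").decode("utf-8", errors="replace")

# A character that Windows-1252 cannot represent at all.
try:
    "Ω".encode("cp1252")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(same == text)
print(repr(garbled))
print(repr(accented))
print(encodable)
```

Because the UTF-16 reading leaves no commas or line breaks behind, it would be consistent with the whole file showing up as symbols in a single row.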

Thank you for your quick reply, keme. I chose “Text CSV” in the dialog box that appeared after pressing “Save As”. Following gabix’s answer, I found that setting UTF-8 when saving as a csv and then choosing UTF-8 when opening the csv file works (but opening it with UTF-16 gives the garbled symbols). However, UTF-16 out and UTF-16 in works. Western Europe (Windows-1252/WinLatin 1) out and Western Europe (Windows-1252/WinLatin 1) in also works.

Thank you, erAck, you’re absolutely right.

To be on the safe side, always use Unicode (UTF-8 or UTF-16).

UTF-8 is the best choice here, though: if really only English (ASCII) characters are used, the saved bytes are identical to plain ASCII (which is also the case with Windows-1252 encoding). UTF-16 uses at least two bytes for every character, which can lead to problems if the consumer is not set to read UTF-16.
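The difference in byte layout is easy to see in Python. This is a small illustration with made-up sample data, not a description of Calc’s export filter:

```python
ascii_text = "id,name\n1,Alice\n"  # made-up, ASCII-only sample

utf8 = ascii_text.encode("utf-8")
cp1252 = ascii_text.encode("cp1252")
utf16 = ascii_text.encode("utf-16")  # Python's "utf-16" codec prepends a 2-byte BOM

# For ASCII-only text, UTF-8 and Windows-1252 produce the same bytes.
print(utf8 == cp1252)

# UTF-16 needs two bytes per character here, plus the BOM.
print(len(utf8), len(utf16))

# A consumer that only understands ASCII can still read the UTF-8 bytes.
print(utf8.decode("ascii") == ascii_text)
```

So for English-only data a reader set to UTF-8, Windows-1252 or ASCII sees the same file, while a UTF-16 file is only readable by a consumer that expects UTF-16.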