Changing character set of UTF-16 CSV file when importing causes crash

Whenever I try to change the character set of a CSV file I am importing into LibreOffice Calc, LibreOffice begins using 100% of my CPU and then crashes. This occurs on two different operating systems (Ubuntu and Windows 10) and with LibreOffice versions 6.0.7.3 and 6.3 (the only two I’ve tested). It seems to occur only with UTF-16 CSV files. Here is what happens:

  1. I download a CSV file from a specific work website.
  2. I double-click the file to open it.
  3. LibreOffice Calc opens a pop-up form, prompting me to indicate the “Character set”, “Language”, “From row”, “Separator Options”, “Other Options”, and “Fields”. The default “Character set” is “UTF-16”.
  4. In the “Character set” field, I select “UTF-8”. Before I can click “OK” or do anything else, LibreOffice freezes; several minutes later it either crashes or I kill it.

LibreOffice crashes every time I try to change the character set of a CSV file that has a default “UTF-16” character set.

This issue does not seem to occur with other CSV files that are apparently not UTF-16. With those non-problematic files, the “Fields” section in Step 3 shows the first few rows of data garbled, or looking like Chinese characters. If I then change the “Character set” to “UTF-8” in the pop-up, the preview rows instantly render as English. It is as if the data in those files is meant to be read as UTF-8. With the problem CSV file, however, the “Fields” section in Step 3 displays the first few rows properly with the default UTF-16 character set. It is as if the problem CSV file is supposed to be read as UTF-16, but I need it to be displayed and saved as UTF-8.

So, the only workaround I know of to make the file display and save as UTF-8 is the following:

  1. Open the CSV file as “UTF-16”.
  2. Click “Save As”.
  3. Select the “Edit Filter” checkbox.
  4. Select “UTF-8”.
  5. Select “Ok/Save”.
  6. Close the file.
  7. Open the newly saved UTF-8 file.
  8. The pop-up displays, asking me to indicate the “Character Set”. It is already set to “UTF-8”, so I just click “OK” and all is well.

So, why does LibreOffice crash when I open the original CSV file and try to import it as anything except “UTF-16”?

Why am I able to import the file as “UTF-16” and then save it as “UTF-8”?

How can I change the character set in LibreOffice? Can it be done from the command-line? I read the help files and several internet posts related to this topic, but they did not seem to answer these questions.

Some additional info: If I run “file MyGoodUtf8File.csv”, I get “UTF-8 Unicode text, with very long lines”, but if I run “file MyBadUtf16File.csv”, I get “data”. According to this post, “data” means that the file command was unable to figure out anything about the file.
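One possible way to learn a bit more than “data” is to check whether the file starts with a byte order mark (BOM). The following is only a minimal sketch, not something LibreOffice or the file command does; the function name detect_bom is made up for illustration, and it assumes the export writes a BOM at all, which I have not confirmed for this file.

```python
import codecs

# Longer BOMs come first: the UTF-32 LE BOM begins with the same two
# bytes (FF FE) as the UTF-16 LE BOM, so order matters.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(path):
    """Return an encoding name guessed from the file's BOM, or None.

    Hypothetical helper for illustration. None only means "no BOM
    found"; the file may still be UTF-16 or UTF-8 without a BOM.
    """
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM checked here is 4 bytes
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None
```

Calling detect_bom("MyBadUtf16File.csv") would then return a name such as "utf-16-le" if a BOM is present, or None if there is none, in which case the byte order still has to be guessed some other way.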

UPDATE:

I found another workaround on this web page that is easier than the workaround I presented earlier:
iconv -f UTF-16LE -t UTF-8 <filename> -o <new-filename>
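For machines without iconv (my Windows 10 box, for example), roughly the same conversion can be sketched in Python. This is an illustration under assumptions, not a tested replacement: the function name convert_to_utf8 is made up, the source is assumed to be UTF-16LE as in the iconv command above, and the whole file is read into memory, so it suits small exports only.

```python
from pathlib import Path


def convert_to_utf8(src, dst, src_encoding="utf-16-le"):
    """Re-encode a text file as UTF-8 (hypothetical helper).

    src_encoding is an assumption: pass "utf-16-be" or another codec
    name if the file turns out to use a different encoding.
    """
    # Work on bytes so that line endings (e.g. CRLF) pass through unchanged.
    text = Path(src).read_bytes().decode(src_encoding)
    # The "utf-16-le" codec keeps a leading BOM as U+FEFF; drop it so
    # the UTF-8 output does not begin with a stray BOM.
    if text.startswith("\ufeff"):
        text = text[1:]
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, convert_to_utf8("MyBadUtf16File.csv", "MyFile-utf8.csv") writes a UTF-8 copy and leaves the original untouched. If the input is not valid in the given encoding, decode raises UnicodeDecodeError instead of writing a damaged file.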

Share a sample file.

Can it be done from the command-line?

Search this site; command-line conversion has been discussed many times. For example, here. However, if the file in question is problematic, command-line conversion won’t work.

A crash is a bug; please report it in the bug tracker with a short, precise description (no prose text) that gives the exact steps to reproduce, and attach a sample file that triggers the crash.

Thanks.