# Character set of an unknown CSV file

My CSV file is exported from an old Swedish DOS program. I want to open it in LO Calc, but I can't find a character set that displays all the characters correctly at once (e.g. Swedish characters such as åäö, and characters such as é).

When I select Västeuropa (ASCII/USA) (as it is labelled in the Swedish UI), é is correct but åäö are not; when I select Västeuropa (ISO-8859-15/EURO), åäö are correct but é is not.

I think I have tested most of the available options without finding the right one. Is there any way to solve this? (I can find and replace all the problem characters in LO, but I don't know how many different characters need replacing, and it would have to be done every time.)

Have you tried ISO-8859-1, the Western European standard, whose 256 characters are also the first 256 code points of Unicode? US-ASCII will not recognise the Swedish accented letters. Windows-1252 is the Windows variant of ISO-8859-1; the two differ only in the range 0x80-0x9F. When did you create this file using DOS, and which standard did you use when creating it? A CSV file does not record the encoding it was created with.

( 2019-02-08 16:12:06 +0200 )

Yes, thank you! I have tried Västeuropa (ISO-8859-1), Västeuropa (ISO-8859-14) and Västeuropa (ISO-8859-15/EURO). Characters such as åäö / ÅÄÖ are OK, but é is still a problem. (I can't see any difference between these three.)

( 2019-02-08 17:13:34 +0200 )

And Windows-1252? What year did you create the file? What language was the computer set to? Which keyboard? There are other Windows code pages depending on the language.

( 2019-02-08 18:42:42 +0200 )

Try cp437.

( 2019-02-08 21:04:49 +0200 )

Yes, I have also tried Västeuropa (Windows-1252/WinLatin 1): åäö are no problem, but é appears as a square. I can create the file today, but only from an old computer (WinXP). I think the program is originally an MS-DOS program (running under Windows). I have not tested running it under a newer OS. The program uses a very old database (Btrieve, later Pervasive). The language of the computer, program and keyboard is Swedish. The characters have, for example, hex C5 = Å, C4 = Ä, D6 = Ö and hex 90 = É.

( 2019-02-09 00:07:16 +0200 )
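
The byte values reported in the comment above can be checked against the code pages discussed in this thread. Below is a minimal Python sketch, assuming those four bytes are representative of the file; the names `EXPECTED`, `CANDIDATES` and `matches` are made up for this illustration, and the codec names are Python's, not the labels LO Calc shows.

```python
# Byte values reported in the comment above, with the characters
# they are supposed to display as.
EXPECTED = {0xC5: "Å", 0xC4: "Ä", 0xD6: "Ö", 0x90: "É"}

# Candidate encodings mentioned in this thread (Python codec names).
CANDIDATES = ["cp437", "cp850", "latin-1", "iso8859-15", "cp1252"]


def matches(codec: str) -> dict:
    """For each reported byte, tell whether `codec` decodes it as expected."""
    result = {}
    for byte, char in EXPECTED.items():
        # errors="replace" turns undefined bytes into U+FFFD instead of raising.
        decoded = bytes([byte]).decode(codec, errors="replace")
        result[byte] = decoded == char
    return result


if __name__ == "__main__":
    for codec in CANDIDATES:
        flags = "  ".join(
            f"0x{byte:02X}:{'ok' if ok else '--'}"
            for byte, ok in matches(codec).items()
        )
        print(f"{codec:12s} {flags}")
```

A code page is only a real candidate if all four bytes come out as `ok` for it.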

0x90 is undefined in Windows-1252 and will show as a blank square; in ISO-8859-1 it is a non-printing control character. Have you tried CP-437, which maps 0x80 to Ç and 0x90 to É?

( 2019-02-09 13:27:24 +0200 )

CP-437, though widely used for West/North European DOS, has different assignments:

• 0x8F = Å
• 0x8E = Ä
• 0x99 = Ö
• 0x90 = É

Of these, only 0x90 = É matches the values in your file.

CP-850 does not match either; it has the same assignments as CP-437 for these characters.

Good luck.

( 2019-02-09 20:21:47 +0200 )

The closest choice I have in LO Calc is Västeuropa (DOS/OS2-437/USA): É is correct, but the Swedish åäö are not. I have also tried Västeuropa (DOS/OS2-850/internationell), with the same result: É is correct but åäö are not.

( 2019-02-09 23:51:54 +0200 )
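
If no single character set in the import dialog fits, one possible workaround is to convert the file to UTF-8 before opening it. The sketch below is only an illustration and rests on a loud assumption: that the file is ISO-8859-1 throughout except for byte 0x90, which stands for É (the only deviation reported in this thread). The names `OVERRIDES`, `decode_mixed` and `convert_file` are hypothetical, and other bytes may need adding to the table.

```python
# ASSUMPTION: the file is ISO-8859-1 except for the bytes listed here.
# Only 0x90 = É has been reported in this thread; extend as needed.
OVERRIDES = {0x90: "É"}


def decode_mixed(raw: bytes) -> str:
    """Decode as ISO-8859-1, then patch the bytes listed in OVERRIDES."""
    # ISO-8859-1 maps every byte value N to the code point U+00NN,
    # so the byte values in OVERRIDES can be used directly as
    # code points in the translation table.
    return raw.decode("latin-1").translate(OVERRIDES)


def convert_file(src: str, dst: str) -> None:
    """Write a UTF-8 copy of `src` to `dst`, leaving line endings untouched."""
    with open(src, "rb") as f:
        raw = f.read()
    # newline="" disables newline translation on write.
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(decode_mixed(raw))
```

The converted copy can then be opened in LO Calc with Unicode (UTF-8) selected as the character set. The script would have to be rerun for each new export, but that replaces the manual find-and-replace step.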

Share the file.

( 2019-02-10 10:01:55 +0200 )

How do I do that?

( 2019-02-10 13:08:49 +0200 )

Use any file sharing site such as Dropbox or Yandex.Disk.

( 2019-02-10 13:48:45 +0200 )

I hope this works for the two files: link text. I did not dare to process the files to make them smaller and easier to handle, because that could destroy the character set in them.

File 1: FDT_ART.TXT (the file my question is about). Some examples of the characters Å, Ö and É: the row containing 001589-045-51 should contain the word LJUSBLÅ, the row containing "001589-045-58" should contain the word MÖRKTURKOS, and the row containing 101410 should contain the words ORKIDÉVAS and HÖJD.

File 2: EXPOART.TXT. This is another file with a different character set (maybe it is easier to identify the right one here?). It is generated by another program, and some of the information is the same in both files, so the same searches can be done in this file as in the first one, with the same results.

( 2019-02-10 23:40:20 +0200 )
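
The known rows described above give a direct test for each file: decode it with every candidate encoding and check whether the row with a given article number contains the expected word. A minimal sketch, assuming the article number and the word sit on the same line; `find_encodings` and `CANDIDATES` are names invented for this illustration.

```python
# Candidate encodings mentioned in this thread (Python codec names).
CANDIDATES = ["cp437", "cp850", "latin-1", "iso8859-15", "cp1252"]


def find_encodings(raw: bytes, key: str, word: str, candidates=CANDIDATES) -> list:
    """Return the candidates under which a line containing `key` also contains `word`."""
    hits = []
    for codec in candidates:
        # Undefined bytes become U+FFFD rather than aborting the scan.
        text = raw.decode(codec, errors="replace")
        # Split on "\n" only, so stray high bytes are not treated as line breaks.
        for line in text.split("\n"):
            if key in line and word in line:
                hits.append(codec)
                break
    return hits


if __name__ == "__main__":
    # File name taken from the comment above; adjust the path as needed.
    with open("EXPOART.TXT", "rb") as f:
        data = f.read()
    print(find_encodings(data, "101410", "ORKIDÉVAS"))
    print(find_encodings(data, "001589-045-51", "LJUSBLÅ"))
```

An encoding that appears in the result for every known row is a plausible match for that file; an empty result for all rows suggests none of the candidates fits on its own.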

Wade through the individual pages at https://en.wikipedia.org/wiki/Categor... and see which code pages assign the Å character (which is not that commonly used) to 0xC5, the value you mentioned in one of your comments; then check whether the other characters match as well.
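
The same lookup can be approximated programmatically by scanning the codecs that ship with Python instead of reading each code page table by hand. This is only a sketch: Python's codec list is not identical to the Wikipedia category or to the character sets LO Calc offers, and `codecs_matching` is a name invented for this example.

```python
import encodings.aliases


def codecs_matching(expected: dict) -> list:
    """Return the Python codecs that decode every byte in `expected` to its character."""
    names = sorted(set(encodings.aliases.aliases.values()))
    hits = []
    for name in names:
        try:
            if all(bytes([b]).decode(name) == ch for b, ch in expected.items()):
                hits.append(name)
        except Exception:
            # Non-text codecs, multi-byte codecs and undefined bytes all
            # raise here; such codecs are simply not a match.
            continue
    return hits


if __name__ == "__main__":
    # Step 1: which codecs put Å at 0xC5?
    print(codecs_matching({0xC5: "Å"}))
    # Step 2: which of them also put É at 0x90?
    print(codecs_matching({0xC5: "Å", 0x90: "É"}))
```

If the second list comes back empty, no single codec known to Python fits all the reported bytes, which would point to a custom or mixed character set in the exporting program.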
