
character set on unknown CSV-file

asked 2019-02-08 14:36:32 +0200

Albireo

My CSV file is exported from an old Swedish DOS program. I want to open it in LibreOffice Calc, but I can't find a character set that displays all characters correctly at once (for example the Swedish characters åäö as well as characters such as é).

When I select Västeuropa (ASCII/USA) (the Swedish UI label), é is correct but åäö are not; when I select Västeuropa (ISO-8859-15/EURO), åäö are correct but é is not.

I think I have tested most of the available options without finding the right one. Is there any way to solve this? (I could find and replace all the problem characters in LibreOffice, but there may be an unknown number of characters to replace, and it would have to be done every time.)


Comments

Have you tried ISO-8859-1, the Western European standard, whose 256 characters coincide with the first 256 Unicode code points? US-ASCII does not contain any of the accented Swedish letters. Windows-1252 is the Windows variant of ISO-8859-1; it differs in the range 0x80 to 0x9F, where it assigns printable characters instead of control codes. When did you create this file under DOS, and which encoding did you use when creating it? A CSV file does not record the encoding used to create it.

petermau ( 2019-02-08 16:12:06 +0200 )

Yes, thank you! I have tried Västeuropa (ISO-8859-1), Västeuropa (ISO-8859-14) and Västeuropa (ISO-8859-15/EURO). Characters such as åäö / ÅÄÖ are fine, but é is still a problem. (I can't see any difference between the three.)

Albireo ( 2019-02-08 17:13:34 +0200 )
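
A small Python sketch can show why these choices look the same for Swedish text. It compares how ISO-8859-1, ISO-8859-15 and Windows-1252 decode each of the 256 byte values, using Python's built-in codec names latin_1, iso8859_15 and cp1252. The function name differing_bytes is only illustrative, and undefined bytes are mapped to the replacement character rather than raising an error.

```python
def differing_bytes(codec_a, codec_b):
    """Return the byte values that the two codecs decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" turns undefined bytes into U+FFFD instead of raising.
        a = raw.decode(codec_a, errors="replace")
        b = raw.decode(codec_b, errors="replace")
        if a != b:
            diffs.append(value)
    return diffs


for other in ("cp1252", "iso8859_15"):
    diffs = differing_bytes("latin_1", other)
    print(other, "differs from latin_1 at:", [f"0x{v:02X}" for v in diffs])

# The Swedish letters and é/É get the same byte in all three encodings.
swedish_same = all(
    ch.encode("latin_1") == ch.encode("cp1252") == ch.encode("iso8859_15")
    for ch in "åäöÅÄÖéÉ"
)
print("åäöÅÄÖéÉ encoded identically:", swedish_same)
```

None of the bytes that carry åäö, ÅÄÖ or é in these encodings is among the differing values, which is consistent with the three choices looking identical for this file.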

And Windows-1252? What year did you create the file? What language was the computer you were using set to? What keyboard? There are other Windows settings depending on the language.

petermau ( 2019-02-08 18:42:42 +0200 )

Try cp437.

gabix ( 2019-02-08 21:04:49 +0200 )

Yes, I have also tried Västeuropa (Windows-1252/WinLatin 1): åäö are fine, but é appears as a square. I can create the file today, but only from an old computer (WinXP). I think the program was originally an MS-DOS program (running under Windows); I have not tested running it under a newer OS. The program uses a very old database (Btrieve, later Pervasive). The language of the computer, program and keyboard is Swedish. The characters have, for example, these hex values: C5 = Å, C4 = Ä, D6 = Ö and 90 = É.

Albireo ( 2019-02-09 00:07:16 +0200 )
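
Hex values like these can be collected for a whole file rather than read off one by one. The following is a minimal Python sketch that counts every byte at or above 0x80, since only those bytes depend on the character set. The sample bytes are an assumption built from the hex values reported above and are not taken from the real file; high_byte_histogram is an illustrative name.

```python
from collections import Counter


def high_byte_histogram(data: bytes) -> Counter:
    """Count each non-ASCII byte value (0x80 and above) in the data."""
    return Counter(b for b in data if b >= 0x80)


# Assumed sample: LJUSBLÅ, ORKIDÉVAS and HÖJD written with the reported
# byte values (Å = 0xC5, É = 0x90, Ö = 0xD6). Not real file content.
sample = b"LJUSBL\xc5;ORKID\x90VAS;H\xd6JD"

for value, count in sorted(high_byte_histogram(sample).items()):
    print(f"0x{value:02X}: {count}")
```

For a real file, the bytes could be read with pathlib.Path("FDT_ART.TXT").read_bytes() and passed to the same function. A short list of distinct high bytes makes it easier to compare the file against code page tables.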

0x90 is undefined in Windows-1252 and will display as a blank square; in ISO-8859-1 it is a non-printable control code, with the same result. Have you tried CP-437, which maps 0x80 to Ç?

petermau ( 2019-02-09 13:27:24 +0200 )

CP-437, though widely used for West/North-European DOS, has different assignments, namely:

  • 0x8F = Å
  • 0x8E = Ä
  • 0x99 = Ö
  • 0x90 = É

of which only 0x90 = É matches.

CP-850 does not match either; it has the same assignments for these characters.

Good luck.

erAck ( 2019-02-09 20:21:47 +0200 )
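
The comparison above can be reproduced with a short Python sketch. It decodes the byte values reported for the file, plus the CP-437 positions of Å, Ä and Ö, under four of Python's built-in codecs (cp437, cp850, latin_1, cp1252). The helper names are illustrative, and a byte that a codec leaves undefined is shown as None.

```python
REPORTED_BYTES = [0xC5, 0xC4, 0xD6, 0x90]   # values the asker found in the file
CP437_POSITIONS = [0x8F, 0x8E, 0x99]        # where CP-437 places Å, Ä, Ö
CODECS = ["cp437", "cp850", "latin_1", "cp1252"]


def decode_byte(value, codec):
    """Decode one byte; return None if the codec leaves it undefined."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


def decode_table(values, codecs):
    """Map codec name -> {byte value -> decoded character or None}."""
    return {c: {v: decode_byte(v, c) for v in values} for c in codecs}


table = decode_table(REPORTED_BYTES + CP437_POSITIONS, CODECS)
for codec, row in table.items():
    print(codec, {f"0x{v:02X}": ch for v, ch in row.items()})
```

Under these codecs, 0x90 decodes to É only in the DOS code pages, while 0xC5, 0xC4 and 0xD6 decode to Å, Ä and Ö only in the ISO/Windows ones.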

The closest choice I have in LibreOffice Calc is Västeuropa (DOS/OS2-437/USA): É is correct, but the Swedish åäö are not. I have also tried Västeuropa (DOS/OS2-850/internationell), with the same result: É is correct but åäö are not.

Albireo ( 2019-02-09 23:51:54 +0200 )
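
If the reported hex values are accurate, the file seems to mix ISO-8859-1 positions for ÅÄÖ with the DOS position 0x90 for É, so no single import choice would fit. One possible workaround, sketched here in Python, is to convert the file to UTF-8 once before opening it in Calc. The override table contains only the one mapping reported in this thread and would probably need more entries (for example for lowercase é, whose byte value is not given here). The names decode_mixed and convert_file and any file paths are hypothetical.

```python
# Bytes that should NOT be read as ISO-8859-1. Only 0x90 = É is known
# from the thread; further entries are assumptions to be added after
# inspecting the real file.
OVERRIDES = {0x90: "É"}


def decode_mixed(data: bytes, overrides=OVERRIDES) -> str:
    """Decode as ISO-8859-1, then patch the bytes listed in overrides."""
    # latin_1 maps every byte to the code point with the same number,
    # so byte 0x90 becomes U+0090 and can be translated afterwards.
    text = data.decode("latin_1")
    return text.translate(overrides)


def convert_file(src, dst, overrides=OVERRIDES):
    """Rewrite src as UTF-8 in dst, keeping line endings unchanged."""
    with open(src, "rb") as f:
        raw = f.read()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(decode_mixed(raw, overrides))


# Assumed sample bytes, built from the reported hex values.
print(decode_mixed(b"ORKID\x90VAS;H\xd6JD;LJUSBL\xc5"))
```

A call such as convert_file("FDT_ART.TXT", "FDT_ART_utf8.csv") would then produce a file that can be imported with the Unicode (UTF-8) character set, without replacing characters by hand each time.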

Share the file.

gabix ( 2019-02-10 10:01:55 +0200 )

How do I do that?

Albireo ( 2019-02-10 13:08:49 +0200 )

Use any file sharing site such as Dropbox or Yandex.Disk.

gabix ( 2019-02-10 13:48:45 +0200 )

I hope this works for the two files: link text. I did not dare to process the files to make them smaller and easier to handle, because that could have destroyed their character set.

File 1: FDT_ART.TXT (the file my question is about). Some examples of the characters Å, Ö and É: the row containing 001589-045-51 should contain the word LJUSBLÅ. The row containing "001589-045-58" should contain the word MÖRKTURKOS. The row containing 101410 should contain the words ORKIDÉVAS and HÖJD.

File 2: EXPOART.TXT. This is another file with a different character set (maybe it is easier to identify?). It is generated by another program, and some of the information is the same in both files, so the same searches can be made in this file as in the first one, with the same results.

Albireo ( 2019-02-10 23:40:20 +0200 )

1 Answer


answered 2019-02-09 20:30:14 +0200

erAck

Wade through the individual pages from https://en.wikipedia.org/wiki/Categor... and see which code is assigned to the Å character (as it is not that commonly used); if it is 0xC5, as you mentioned in one of your comments, then check whether the other characters match.
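
The same lookup can be sketched in Python instead of reading the tables by hand. The script below checks a candidate list of Python's built-in codecs for those that place Å at 0xC5 and then tests the other reported characters against them. The candidate list is an assumption and far from complete, and the helper names are illustrative.

```python
# Character -> byte value as reported for the file in the comments.
REPORTED = {"Å": 0xC5, "Ä": 0xC4, "Ö": 0xD6, "É": 0x90}

# Assumed shortlist of Western single-byte codecs known to Python.
CANDIDATES = [
    "cp437", "cp850", "cp858", "cp863", "cp865",
    "cp1252", "latin_1", "iso8859_14", "iso8859_15", "mac_roman",
]


def byte_for(char, codec):
    """Return the single byte the codec uses for char, or None."""
    try:
        encoded = char.encode(codec)
    except UnicodeEncodeError:
        return None  # the codec does not contain this character
    return encoded[0] if len(encoded) == 1 else None


def check_codecs(reported, candidates):
    """For codecs with Å at the reported byte, test every character."""
    report = {}
    for codec in candidates:
        if byte_for("Å", codec) != reported["Å"]:
            continue  # Å is elsewhere (or missing), so skip this codec
        report[codec] = {
            char: byte_for(char, codec) == value
            for char, value in reported.items()
        }
    return report


report = check_codecs(REPORTED, CANDIDATES)
for codec, matches in report.items():
    print(codec, matches)
```

A codec that matches every character would show True for all four entries. Among the ISO-8859-1 family and Windows-1252, É sits at 0xC9 rather than 0x90, so those codecs cannot match all four reported values.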


Comments

Thanks! Maybe I can find the right character set, but is it available in LibreOffice? (I haven't found the right character set in LO.)

Albireo ( 2019-02-10 00:20:59 +0200 )