Split fields in Calc with the delimiter | , when loading a text file - problem

I have a problem when I want to read a delimited textfile to Calc.
The text file looks like this .:

Date .:|2022-06-23|Time .:|11:23:43|Row 1 info .:|,,,,"","34,25"|From program 1 .:|Test - FileAppend - Calc.ahk
Date .:|2022-06-23|Time .:|11:23:43|Row 2 info .:|"","","","","",3|From program 2 .:|Test - FileAppend - Calc.ahk
Date .:|2022-06-23|Time .:|11:23:43|Row 3 info .:|"","2,5","3","4",,|From program 3 .:|Test - FileAppend - Calc.ahk

My wish is to be able to open this text file with LibreOffice Calc.
The fields (columns) are divided by | as a field delimiter.
When the above content is opened with LO Calc v7.2.5.2
The first line is divided as desired (correct)
The second line is divided correctly to “Row 2 info .:”
Then the following (the rest) ends up in cell F2

",",",",",3|From program 2 .:|Test - FileAppend - Calc.ahk
Date .:|2022-06-23|Time .:|09:41:07|Row 3 info .:|","2,5","3","4",,|From program 3 .:|Test - FileAppend - Calc.ahk

(I have tested an old Excel and an old OpenOffice - no problem with split the fields correctly)
What to do? (change the delimiter or…?)

this inputfile has no problem i LO Calc

Date .:|2022-06-23|Time .:|12:03:51|Row 1 info .:|"info 1","info 2","info 3","info 4","info 5"|From program 1 .:|Test - FileAppend - Calc.ahk
Date .:|2022-06-23|Time .:|12:03:51|Row 2 info .:|"info A","info B","info C","info D","info E"|From program 2 .:|Test - FileAppend - Calc.ahk
Date .:|2022-06-23|Time .:|12:03:51|Row 3 info .:|"1","2","3","4","5"|From program 3 .:|Test - FileAppend - Calc.ahk
Date .:|2022-06-23|Time .:|12:03:51|Row 4 info .:|"1,5","2,5","3,5","4,5","5,5"|From program 4 .:|Test - FileAppend - Calc.ahk

Your data has paired quotes, which have special meaning in CSV (as a general structured text format). So to avoid treating the pairs as escaped quotes, remove the " from String delimiter in the import dialog:

1 Like

Try ticking the Tab, Comma and Other…


Windows 10 home, LO 7.3.4.2

I understand, but why should LO Calc care about the quote characters in each fields?(when I only use the specified field divider | )

I have tested other field dividers such as TAB (comma is not appropriate) but with the same result (with LO Calc )

Are you replying to @gregors15?

The Format quoted field as text is unrelated, as it only tells how to format the resulting cell when it’s quoted, not how to process the quoting in the context of the content of the field. When you have characters set as string delimiters (actually, “escaping symbols”), they are processed as the CSV standard prescribes, and that includes treating the paired symbols as self-escaping.

It’s even a little more complicated… in

|"","","","","",3|From ...

with only | field separator and " “string delimiter” (confusing term there, it’s the quote character) the first " at field start opens a quoted field, the following ", then is accepted as literal data because so many CSV data files are generated broken and do not follow the escape embedded quotes by doubling them rule (the result in this case though is the same as if all "" were treated as escaped double quotes); and the then subsequent "", are taken as escaped doubled quote and comma, resulting in ", data, and all following including the embedded line feed is still part of that field’s data because there is no closing quote that would be followed by either a | field separator or line feed.

Hence, if you tell in the dialog that no (or choosing one completely unused) quote character shall be used as “string delimiter”, all double quote characters become part of data.

2 Likes

I made the following choices

  1. ) **UTF-8** (correct)
    
  2. ) **|** as a field delimiter (correct)
    
  3. ) **Format quoted field as text** (it must be correct)
    

bild
Since I did not select Format quoted field as text (Formatera citeratfält som text) point3, I did not think that field delimiters would play a role.
How can I select “no String delimiter”?

(If I select ’ as string delimiter it works in this example - but but I dont want to use a string delimiter)
bild

Just hit Delete with the character selected, or Backspace to remove it.

2 Likes

See also: tdf#147858

2 Likes