Import csv file, ignore string delimiter while importing

Hi,
I am trying to import a csv file
but string delimiter in import options is causing me issues

if a cell value is starting with character used as string delimiter is will merge all remaining cells into one

Here is an example


col0;col1;col3

test0;"test1;test2

test3;test4;test5


Values test2, test3, test4, test5 will be added to test2 cell.
But if I write t"est1 than it is ok. Why?

Seems like a basic thing, but dont know how to fix it.

Any help is appriciated

Thank you

I am using LibraOfficeCalc Version: 6.1.5.2 on OS: Linux 4.19;

on OS: Linux 4.19

Nitpicking: Linux 4.19 isn’t an OS but a kernel release number. Kernel 4.19 + LibreOffice 6.1.5.2 makes me guess, that you are on OS Debian 10 (Buster).

tdf#125110

Yep, I am on debian system :). I copied the info from the help window

… which also says Build ID: 1:6.1.5-3+deb10u6 :slight_smile:

:slight_smile: :smiley: you got me

Hello,

on opening the csv file you can set the string delimiter to anything or nothing (leave it empty)
inside the “Text Import” window.

Hope that helps.

To show the community your question has been answered, click the ✓ next to the correct answer, and “upvote” by clicking on the ^ arrow of any helpful answers. These are the mechanisms for communicating the quality of the Q&A on this site. Thanks!

2 Likes

Yes that is true - tried that

but i need something more generic.
How to completely ignore separator, doesn’t matter if it is “” or '.
And just use delimiter (terminator), in this case ;

i think you can just clear the String delimiter field … leave it empty that should do what you want.

I’ll updaete my answer to reflect that info.

Hope that helps.

To show the community your question has been answered, click the ✓ next to the correct answer, and “upvote” by clicking on the ^ arrow of any helpful answers. These are the mechanisms for communicating the quality of the Q&A on this site. Thanks!

OMG, i didnt see that you could delete it
saw dropdown menu, and figured only predefined values are available
Thank you

@djulb: Glad to help out. Have an awesome day.

The work around is great. Thank you.
I think this behaviour should be marked as a bug. I am importing pipe delimited data with several blocks of text. If a double quote appears as the first character anywhere in the file, the import fails from that point on.

This works
Comments with "quoted phrase", followed by rows of pipe delimited values
a|b|c
d|e|f

This fails
"Quoted phrase is in the first position". This line and following rows will be imported into one cell
a|b|c
d|e|f

In the example you give that fails the " is not set as a string delimiter because it doesn’t enclose the field. There is either another character as string delimiter, in which case set that. Or it is not needed because the pipe appears only as a field separator and nowhere else, in that case don’t set a string delimiter as the answer above says

Hi EarnestAI
I see.
I have simplified the example somewhat. My file is about 2MB with various operating system and SQL interspersed. I should probably go back to the source data extract and clean it up. The fail line has matching quotes. Interestingly these two lines import cleanly allowing the rest of the document to process ok.

"Quoted phrase is in the first position". and has a trailing quote"
a|b|c
d|e|f

"Quoted phrase is in the first position". and has a trailing quote"|
a|b|c
d|e|f

The quotes inside the field must be escaped in the CSV. Your example is invalid CSV syntax. See 2.7 in RFC 4180.

See also tdf#144842 and tdf#125110.

Hi Mike,
Agreed. This is not a CSV file. It is a bunch of pipe delimited data dumps concatenated from several databases with some text between each section. The quote characters are spurious, appearing in between each dump. and in my case should be removed. I can modify the original script or use grep/awk to tidy it up.
The fix offered by Igortius suits me, I can clear the string delimiter field on import. The popup dialog does not make it an obvious choice for newbies but that is a minor matter.
The behaviour I am describing occurs for Clipbard Paste(Text Import), File Open and for Data, Text To Columns.
It seems inconsistent to me having worked with MSExcel (also with gotchas) for many years.

Thanks for for your comments.
Paul

:slight_smile: I used the “CSV” term not in a strict sense (and I realize that I was inconsistent, meaning a broad “structured text format” as LibreOffice knows it, but referencing CSV standard at the same time).

tdf#147858

As you may see in tdf#125110 (comments 7 and 13), there are so many ways the structured data may be created incorrectly, and so the logic to tolerate some of the most often encountered brokenness would necessarily conflict with other broken patterns. In your case, the double quotes are not used as string delimiters - so removing string delimiters is the proper way of dealing (not a workaround). The quotes may be removed later from Calc cells using Find & Replace (or preprocessed, as you say).