Error importing csv file

I have been importing csv files provided by major bank computers into Excel for years. When I import them into Calc, Version: 5.3.4.2 (x64), Build ID: f82d347ccc0be322489bf7da61d7e4ad13fe2ff3, there are parsing errors. Here are some lines from a csv file.

211,04/06/17,"PAYMENT THANK YOU
",857.06 ,(468.59),"5,884.58 ","1,106.77 ",

212,04/07/17,"BJ WHOLESALE #0053 WAPPINGERS FA NY
",52.85 ,(521.44),1,"5,937.43 ","1,106.77 ",

213,04/11/17,"WAL-MART #1810 FISHKILL NY
",96.54 ,(617.98),1,"6,033.97 ","1,106.77 ",

214,04/12/17,TU *TRANSUNION 800-493-3292 CA,16.17 ,(634.15),"6,033.97 ","1,106.77 ",

The problem is in the third field, which is an alphabetic description field. The last line, where the third field is not quoted, imports correctly. The first three lines show the third field on two lines, doubling the height of those three rows. Excel does not do this!

Inspection of those lines with an editor which displays hexadecimal codes shows that the bank’s computer sticks an Hex0A or ASCII 10 line feed character immediately before the closing double quote. Your import algorithm incorrectly interprets that as a line return. Within a quoted field line feed characters should probably be ignored.

Paul_L

Hello @Paul_L,

“Within a quoted field line feed
characters should probably be
ignored.”

As a workaround you could try :

  1. Make a copy of your csv-file,
  2. Find inside your csv-file all occurrences of the linefeed character directly followed by a double quote mark,
  3. Replace all found occurrences by a temporary unique string ending in a double quote mark, ( e.g. My_Unique_LF_string" ),
  4. Import the edited csv-file in Calc,
  5. If the import succeeds, then you can Find all occurrences of the temporary unique string in your Sheet, and Replace them by a linefeed character again.

Thanks. That’s about what I did do yesterday. You should re-evaluate your algorithm’s handling of the control characters in a quoted string.

You’re welcome @Paul_L,

btw the algorithm is not mine in any way, i’m just a casual user just like yourself ( as most of us here are ).

Import is not wrong. Within quoted fields line breaks are part of the data of the field, that’s how CSV is defined. See RFC 4180 and also its Informative References if interested in details.

If Excel does not import the line break then it might be that it ignores a trailing line break, assuming that the creator of the CSV was too stupid, but would fail if the trailing line break was intentional and relevant data.