Why does LibreOffice hang when trying to open a sorted CSV file?

Hello,

I’ve been having a couple of recurring issues trying to open CSV files with LibreOffice on Mac OS X.

The biggest issue I have is that LibreOffice will hang when trying to open CSV files after I’ve sorted them (although this seems to happen only with certain files). What is odd about this, however, is that LibreOffice was able to open them fine prior to sorting and removing duplicates.

I am sorting and removing duplicates from the CSV file using the following command:

sort -u -t, -k1,1 file.csv > file-unique.csv

This command sorts by the first field, which contains the ID number of each record in my file, and removes any duplicate records. LibreOffice has no problems opening file.csv (the unsorted file), but it cannot open the sorted version (file-unique.csv), which is dramatically smaller in file size and contains only unique records. When I attempt to open it with LibreOffice, the application just hangs until I force quit it.

The other issue, which isn’t as bad but is still a nuisance, is that my Finder windows stop responding about 75% of the time when I save documents in LibreOffice. As soon as I save the document, Finder stops responding, forcing me to relaunch it. I have been able to reproduce this behavior repeatedly, and the Finder window stops responding only when I save documents in LibreOffice.

Does anyone know why this is? Is there a way to get LibreOffice to stop hanging and open the CSV file? I’m really stumped as to why it had no problem opening the very large version of the file with duplicate records, but cannot open a much smaller, sorted and de-duplicated version containing records that were also present in the original. I also don’t understand why it makes my Finder window hang when I save documents.

Below is my LibreOffice version info:

Version 4.0.1.2 (Build ID: 84102822e3d61eb989ddd325abf1ac077904985)
TinderBox: MacOSX TDF Release, Branch:libreoffice-4-0, Time: 2013-02-27_17:15:43

I am running Mac OS X 10.6.8.

I appreciate any help I can get on this. Thanks for taking the time to read and respond to this question.

I seem to have discovered the issue. For some odd reason, there were some unclosed double quotes around some of the field values. I don’t know why these appeared in the sorted version of the file and not the original, but removing the double quotes from the file fixed the problem.
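For anyone wanting to locate such lines rather than strip every quote, here is a minimal sketch in Python. It assumes that a physical line with an odd number of double-quote characters is suspicious. The function name `lines_with_odd_quotes` is made up for illustration and is not part of LibreOffice or any standard tool.

```python
def lines_with_odd_quotes(lines):
    """Return (line_number, text) for each line holding an odd number of '"'.

    `lines` is any iterable of strings, e.g. an open text file.
    An escaped quote inside a field ("") adds two characters, so it
    does not change whether the count is odd or even.
    """
    suspects = []
    for number, line in enumerate(lines, start=1):
        if line.count('"') % 2 == 1:
            suspects.append((number, line.rstrip("\r\n")))
    return suspects


# Hypothetical sample: record 2 has a quoted field spanning two lines.
sample = ['1,"Smith, John",NY\n', '2,"multi-line note\n', 'continues here",CA\n']
print(lines_with_odd_quotes(sample))
# → [(2, '2,"multi-line note'), (3, 'continues here",CA')]
```

To check a real file, pass an open file object instead of a list, for example `with open("file-unique.csv", newline="") as f: print(lines_with_odd_quotes(f))`. In an unsorted file, two consecutive flagged lines may simply be a legitimate multi-line field. In a file that has been through a line-based sort, flagged lines are likely to be halves of a field that was split apart.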

@a564 Could you file a bug for this and attach the .csv file that makes Calc hang? A small test case with an unbalanced quote didn’t reveal anything. A quoted field may contain embedded line feeds, in which case the field continues on the following lines until the matching quote is encountered. So if you sort the lines you may break such an arrangement, and depending on the data and the number of lines that follow, Calc might try to stuff everything into one cell and either spend a long time on that or fail somehow.
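If embedded line feeds are indeed the cause, one way around it is to sort and de-duplicate whole records rather than physical lines. The sketch below uses Python's standard `csv` module. It assumes there is no header row, that the record ID is the first field, and that IDs are compared as plain strings, so the order may differ from what `sort` produces under your locale. The function name `sort_unique_records` is hypothetical.

```python
import csv
import io


def sort_unique_records(text):
    """Sort CSV records by their first field, keeping the first record per ID.

    Quoted fields containing line feeds are parsed as part of one record,
    so they are never split the way a line-based sort would split them.
    """
    reader = csv.reader(io.StringIO(text, newline=""))
    first_by_id = {}
    for record in reader:
        if not record:  # skip completely blank lines
            continue
        first_by_id.setdefault(record[0], record)

    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for record_id in sorted(first_by_id):
        writer.writerow(first_by_id[record_id])
    return out.getvalue()


# Hypothetical data: record 2 has a field spanning two physical lines.
data = '2,"line one\nline two"\n1,plain\n2,duplicate\n'
print(sort_unique_records(data))
```

For a real file, the text could be read with `open("file.csv", newline="").read()` and the result written to `file-unique.csv`. The whole file is held in memory, which should be fine for moderately sized files but is an assumption to revisit for very large ones.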
