Trouble converting large excel spreadsheet

I have a 440MB excel file with 1,040,000 rows and 90 columns. I have tried LibreOffice 7.5.2 and 7.4.6.2. I can open it in calc but have not been able to convert it to an .ODS file.

  • I have tried saving it from Excel to ODS and that fails.
  • I tried the document converter wizard and that fails (the log just says that it failed).
  • I tried opening a CSV file with the same data in Calc and it also failed when trying to save as ODS.
  • I even tried copying rows from Excel to Calc and it only copies 65k rows when I try this.

I looked and it doesn’t seem like I am exceeding the limits of an ODS file or Calc.
Any ideas on what I should try or look at?

tdf#127675

The rest needs the sample file to repro. Consider filing a bug report.

I’m sorry. Are you asking me to provide my Excel file, or to try out that sample excel file from Bug 127675?

The bug report confirms your findings with the clipboard, so no need for you to test that file.
.
Access to your file is necessary to test, if the same problem occurs on other computers, maybe with other OS or different version of LO.
.
As you wrote “fails”, is there any error message, or does the system saves “forever” and never returns. I remember when testing limits some years ago, I sometimes left my laptop active for some hours (preventing sleep-modes) and LibreOffice finished jobs “sometime”.
.
Excel not being able to save sounds interesting, too. I would not expect shared code here…
.

Not to solve your problem… You could try, if Calc can handle first and second half of your data, or smaller amounts, but this wont help on the full file. I’d suggest to load your csv in a real database, either file-based like sqlite or using MariaDB, PostgreSQL etc. Then you may preprocess data for usage in Calc with the database by queries.

I tried to upload the CSV file since I only need to get the data in Calc and don’t need it to be an Excel file, but this forum doesn’t seem to allow text or CSV files. So here is a link to the CSV file instead.
https://drive.google.com/file/d/1-DA5UeY0vQbHrIoTHyXNzorLiPEbyEnp/view?usp=share_link

I can open the file CSV in Calc without issue. When I try to save it as an ODS file Calc I get the " Error saving the document ContosoSales CSV: General Error. General input/output error."

When I try to save as an ODS file from Excel I get a warning that some features in the CSV file may not be supported in the ODS file. I try to save anyway and I get the “Errors were detected while saving .” It suggests I can make repairs but this is simply the CVS file I opened in Excel and then tried to save as ODS file.
I saved it as an .XLSX file and then try to save that as an ODS file and I get the same warning and the then the same error message.

Geez… a 694MB .csv file… Make sure your temporary disk space is large enough, because saving the file as .ods creates a HUGE interim XML stream before being packed. I don’t know how much, my almost 8GB free was not enough. Maybe 16, 32, …

Most sites have 2 restrictions: allowed types and size. As usually only extensions are checked, we can cheat a bit, for example renaming test.oxt to test.oxt.zip
For Size: compressing the file to .zip reduces it to a size wich can be uploaded…

I got it the following way:
1 - Open the cvs in LibreOffice (size A1:CJ1045001)
2- Save in xls and close the file
3 - I opened the xls in LibreOffice and saved it in ods.


(I use Win.10 and LibO.7.5.2.2)

That didn’t quite work. When you save it as .xls it drops most of the rows and only keeps 65536 rows.

Ok, sorry, I didn’t check the result.

CSV opens in about 3’ 15’’, with
Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: f22f7159ec1bd28d19f4a8b431893436419ecd64
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: en-US Calc: CL threaded Jumbo
i5-6500 16 GB of memory

Saves as ods, using about 1.6 GB of memory, in 4’ 26’’.

1 Like

Well, yes, but how much temporary disk space did it use? i.e. in /tmp/ or $TMP or Windows %TEMP% (or whatever is used there).

While saving as xlsx, I can find it in
C:\Users\xxxx\AppData\Local\Temp\lu156889fzs8x.tmp
lu15688{059DD2D3-6A6B-4F7C-AC48-B56DD5188A03}.tmp
lu15688{903E389B-AA84-495F-9D03-744427530043}.tmp
lu15688{E5F5D701-DF86-48B4-AD75-BF3B3C04E989}.tmp
C:\Users\xxx\AppData\Local\Temp\lu167563pjnou.tmp
lu16756{A69C6CCF-53FA-467A-8E90-88B64E4DF5F4}.tmp
but they show 0 size, also opening the file properties.

Saving as xlsx has a much higher use of memory up to 12 GB

1 Like

Calc loads “ContosoSales Csv.csv” within acceptable time as 1045000 database records with 88 consistent fields.
Char set: Western Europe (Win 1252)
Language: Any English
Column delimiter: comma
Char delimiter: double-quote
Detect special numbers: Yes (as always)
Columns R and S (LoadDate and UpdateDate) consist of 40179 which you may want to format as date 2010-01-01.

Importing this flavour of csv into some database is easy.

My system is:
Win11
32GB RAM (task manager shows 20GB free when saving to ODS)
I have 300GB free space on my C drive
I have 800GB free space on the drive where the files are strored and being saved to.

Is there something else I should look at?

Nothing likely. I assume you use the 64bit-version of LibreOffice? And it is not likely to find Win11 on an old FAT-Filesystem, wich would limit files/folders to 2 GB. There may be quotas (max. size per user) but I’m not using/testing Win11 at this moment.
.
One thing to try is safe-mode, as others could load your file.
.
While it is interesting to find out, why the file will not save on your system. What do you wish to do afterwards? Wouldn’t it be better to put this in a database, maybe something “simple” as sqlite. (Don’t try embedded databases.) and present Calc only the result of queries.

I am doing some load testing of different tasks in Excel and wanted to try the same tasks in LibreOffice. I wanted to use the same data in both to have the best comparisons.

Performance has always been top priority to the Excel developers. LO Calc can’t really handle this amount of data. However, with Base you could build a customized user interface on top of a database. The development effort would pay off if you had to work with these data structures in the future.

I’m not certain it is specific to Calc. I am able to open the .xlsx spreadsheet that has the same data in it with Calc with no errors. If I try to save to ODS from Excel or Calc it errors out. So it seems like it is an issue with the ODS file type and my data.

Rather, a combination of ODS and Windows, given that @mariosv succeeded on Linux with 16 GB RAM.

ODF uses much longer element names than OOXML. The latter uses single-char element names (c, s, t…) for most often used elements; while in ODF, they are like table:table-cell, office:value-type, text:p, …

The unpacked XML in a XLSX saved from your data in Excel is around 3GB. I guess that the unpacked XML of ODS should be larger then 4GB; and I suspect some difference in the number types in the compilers used to build on Windows vs. on Linux (e.g., long is 32 bit on Windows, even Win64, while it’s 64-bit on Linux64).