Is spreadsheet corrupt or not?

I created a Java app which uses the apache-poi library to create an SXSSFWorkbook.
I use LibreOffice Calc as my ‘viewer’ and everything has worked well for 2 or 3 years.
At some point in the past 2 or 3 months I updated LibreOffice Calc to the latest version, and it now always warns me that my SXSSFWorkbooks are corrupt when I open them.
It offers to try a repair; I agree, and it successfully repairs the SXSSFWorkbooks.
This is on Windows 11 with the latest apache-poi library (5.3.0) and the latest LibreOffice Calc (24.8.2).

If I open my SXSSFWorkbooks on a Debian computer with LibreOffice Calc 7.4.7.2, the workbooks open with no error.
I just installed an older version (24.2.0.1) of LibreOffice Calc on my Windows 11 computer, and the workbooks open with no error there either.

My questions:
Why do newer versions of LibreOffice Calc report that my workbooks are corrupt?
Can I find out why LibreOffice Calc thinks my workbooks are corrupt?

I tried to open a SXSSFWorkbooks-generated sample that you kindly provided, and couldn’t see a warning.

Hmmm, I didn’t upload a sample workbook, so I have no idea what you’re referring to.
But I will upload a sample workbook with this post:
no_usb_2024-11-06.xlsx (3.6 KB)
On Windows 11, LibreOffice Calc 24.8.2 (and most other recent versions) shows a message that the spreadsheet is corrupted and offers to try to repair it:
[Screenshot: corrupt_spreadsheet]
The repair seems successful and the spreadsheet is opened.

On Debian Bookworm with LibreOffice Calc 7.4.7.2, and on Windows 11 with LibreOffice Calc 24.2.0.1, I can open the spreadsheet with no problems - neither of these versions complains that the spreadsheet is corrupted.

great you figured it! :stuck_out_tongue_winking_eye:
The change happened in this commit, which adds more sanity checks for ZIP packages: https://gerrit.libreoffice.org/c/core/+/170571. It detects that your “Zip file has holes! It will leak!” - whatever that may mean (I suspect that your ZIP has unused areas, not referenced by any data).

You may want to file a bug report, but IMO the bug is in the library creating the package. Having “unused” blocks in a ZIP is spooky: anything put there could go unnoticed.
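
To make the idea of “holes” concrete, here is a rough Java sketch that looks for byte ranges in a ZIP package that no entry accounts for. It is not the check LibreOffice performs; `ZipGapFinder` and `findGaps` are hypothetical names, and the sketch assumes a plain (non-ZIP64) archive whose data descriptors, where present, are the 16-byte form with a signature. It refuses to guess when it meets ZIP64 placeholder values.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of LibreOffice or Apache POI.
class ZipGapFinder {

    // One central-directory record: where the entry starts and how big its data is.
    private static final class Entry {
        final String name; final long offset; final long compressedSize; final boolean hasDescriptor;
        Entry(String name, long offset, long compressedSize, boolean hasDescriptor) {
            this.name = name; this.offset = offset;
            this.compressedSize = compressedSize; this.hasDescriptor = hasDescriptor;
        }
    }

    // Returns one message per byte range that no entry accounts for.
    public static List<String> findGaps(byte[] zip) {
        ByteBuffer buf = ByteBuffer.wrap(zip).order(ByteOrder.LITTLE_ENDIAN);

        // Find the end-of-central-directory record by scanning backwards for its signature.
        int eocd = -1;
        for (int i = zip.length - 22; i >= 0; i--) {
            if (buf.getInt(i) == 0x06054b50) { eocd = i; break; }
        }
        if (eocd < 0) throw new IllegalArgumentException("no end-of-central-directory record");
        int count = buf.getShort(eocd + 10) & 0xFFFF;
        long cdOffset = buf.getInt(eocd + 16) & 0xFFFFFFFFL;

        // Walk the central directory and collect offset, size and flags of each entry.
        List<Entry> entries = new ArrayList<>();
        int p = (int) cdOffset;
        for (int i = 0; i < count; i++) {
            if (buf.getInt(p) != 0x02014b50) throw new IllegalArgumentException("bad central header at " + p);
            int flags = buf.getShort(p + 8) & 0xFFFF;
            long compressedSize = buf.getInt(p + 20) & 0xFFFFFFFFL;
            int nameLen = buf.getShort(p + 28) & 0xFFFF;
            int extraLen = buf.getShort(p + 30) & 0xFFFF;
            int commentLen = buf.getShort(p + 32) & 0xFFFF;
            long offset = buf.getInt(p + 42) & 0xFFFFFFFFL;
            if (compressedSize == 0xFFFFFFFFL || offset == 0xFFFFFFFFL) {
                throw new UnsupportedOperationException("ZIP64 entry; this sketch does not handle it");
            }
            String name = new String(zip, p + 46, nameLen, StandardCharsets.UTF_8);
            entries.add(new Entry(name, offset, compressedSize, (flags & 0x08) != 0));
            p += 46 + nameLen + extraLen + commentLen;
        }

        // Entries should sit back to back, with the central directory right after the last one.
        entries.sort(Comparator.comparingLong((Entry e) -> e.offset));
        List<String> gaps = new ArrayList<>();
        long expected = 0;
        for (Entry e : entries) {
            if (e.offset > expected) {
                gaps.add(String.format("%d unaccounted bytes at offset %d (before %s)",
                        e.offset - expected, expected, e.name));
            }
            int lo = (int) e.offset;
            int localNameLen = buf.getShort(lo + 26) & 0xFFFF;
            int localExtraLen = buf.getShort(lo + 28) & 0xFFFF;
            expected = e.offset + 30 + localNameLen + localExtraLen + e.compressedSize;
            if (e.hasDescriptor) expected += 16; // ASSUMPTION: 16-byte descriptor with signature
        }
        if (cdOffset > expected) {
            gaps.add(String.format("%d unaccounted bytes at offset %d (before central directory)",
                    cdOffset - expected, expected));
        }
        return gaps;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) { System.out.println("usage: ZipGapFinder <file.xlsx>"); return; }
        List<String> gaps = findGaps(Files.readAllBytes(Paths.get(args[0])));
        if (gaps.isEmpty()) System.out.println("no gaps found under this sketch's assumptions");
        for (String gap : gaps) System.out.println(gap);
    }
}
```

Because the sketch hard-codes a 16-byte data descriptor, an archive whose writer used the 24-byte ZIP64 descriptor form would show up here as 8 unaccounted bytes per entry. Whether that is what LibreOffice objects to in this particular file is something the linked bug reports would have to confirm.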

Wow and thanks a lot for all that useful info. :+1:

I agree that, as my ‘dodgy’ spreadsheet was produced by the apache-poi library, any fix must be made in the apache-poi library.
I shall head over to the apache-poi mailing list for help.

tdf#163384
https://bz.apache.org/bugzilla/show_bug.cgi?id=69433
https://lists.apache.org/thread/fr1ddq8j4047jc9c73y531g6o8oqlj4x

For anyone looking for just a solution without having to read all the related bug threads…

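In apache-poi, when you create an SXSSFWorkbook, set the workbook ‘Zip64Mode’ to ‘Zip64Mode.AlwaysWithCompatibility’ before writing it out (Zip64Mode is the Apache Commons Compress enum org.apache.commons.compress.archivers.zip.Zip64Mode):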

mySXSSFWorkbook.setZip64Mode(Zip64Mode.AlwaysWithCompatibility);

Seeing this comment in the Apache Bugzilla:

This is not really a bug, as PJ Fanning said …

I would say that you are just jumping to conclusions too fast. It is indeed a bug. That some readers are more forgiving than others can make one think that anything is good enough, but it’s not. The proper thing for POI is to fix their generator to please all readers or, rather, to follow the standard correctly. Closing your bug that way is plain wrong and bad.