Created attachment 196421 [details]
Example file

Opening certain files with the 'xlsx' extension reports them as corrupted.

Name         : libreoffice-calc
Epoch        : 1
Version      : 24.8.0.3
Release      : 1bdk_mga9
Architecture : x86_64
Install Date : Fri 23 Aug 2024 12:20:31
Group        : Office/Spreadsheet
Size         : 40604549
License      : MPL-2.0 and Apache-2.0 and LGPL-3.0-only and LGPL-3.0-or-later and CC0-1.0 and BSD-3-Clause and (LGPL-2.1-only or SISSL) and (MPL-2.0 or LGPL-3.0-or-later) and (MPL-2.0 or LGPL-2.1-or-later) and (MPL-1.1 or GPL-2.0-only or LGPL-2.1-only)
Signature    : DSA/SHA1, Fri 23 Aug 2024 03:53:19, Key ID d1e9294d2d9835d8
Source RPM   : libreoffice-24.8.0.3-1bdk_mga9.src.rpm
Build Date   : Fri 23 Aug 2024 02:43:17
Build Host   : GamerRyzen7
Packager     : katnatek
Vendor       : BDK-packagers
URL          : https://www.libreoffice.org/
Summary      : LibreOffice Spreadsheet Application
Description  : The LibreOffice Spreadsheet application.
Created attachment 196422 [details] screenshot
The same file does not appear corrupted on version 7.6.7.2
Regression introduced by:

commit efae4fc42d5fe3c0a69757226f38efc10d101194
author Michael Stahl <michael.stahl@allotropia.de> Tue Jul 16 12:12:09 2024 +0200
committer Michael Stahl <michael.stahl@allotropia.de> Tue Jul 16 15:57:43 2024 +0200
tree 5e7fe7051a76f04b1b8b2ab9c46c271e3f8ff666
parent 2f81046033bb4082f888edfa94685d2dcc2689aa

package: add additional consistency checks for local file header

Bisected with: bibisect-linux64-25.2
I tried to open the document with Excel 2016 and it opens without any complaint.
(In reply to Xisco Faulí from comment #4)
> I tried to open the document with Excel 2016 and it opens it without any
> complaint

Yes, the problem occurs only with LibreOffice.
hmm ... apparently this was produced by "Apache POI"? the problem is we detect an 8 byte gap following the data descriptor of every zip entry... it looks like the data descriptor uses 64-bit sizes, but there is no Zip64 extra field on the local header, the extension length is 0... there does not appear to be a Zip64 extra field anywhere in the file, nor is there a Zip64 end of central directory record ... how is one supposed to know these sizes are 64-bit?
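To make the 8 byte gap concrete, here is a minimal sketch of the arithmetic, assuming the data descriptor layout from the ZIP APPNOTE (optional signature, CRC-32, then two size fields of 4 or 8 bytes each). The class and method names are hypothetical and are not taken from LibreOffice or Commons-Compress. It also shows how one could look for the Zip64 extra field (header ID 0x0001) in a raw extra field block.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative only: data descriptor sizes and a Zip64 extra field lookup. */
final class DataDescriptorLayout {
    static final int SIGNATURE_BYTES = 4;          // optional marker 0x08074b50
    static final int CRC_BYTES = 4;                // CRC-32 of the entry data
    static final int ZIP64_EXTRA_HEADER_ID = 0x0001;

    /** Length of a data descriptor that includes the signature. */
    static int descriptorLength(boolean zip64) {
        int sizeFieldBytes = zip64 ? 8 : 4;        // compressed and uncompressed size
        return SIGNATURE_BYTES + CRC_BYTES + 2 * sizeFieldBytes;
    }

    /**
     * Scans a raw extra field block, which is a sequence of
     * (2-byte header ID, 2-byte data length, data) records in little-endian order,
     * and reports whether any record has the Zip64 header ID.
     */
    static boolean containsZip64ExtraField(byte[] extra) {
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 4) {
            int id = buf.getShort() & 0xFFFF;
            int len = buf.getShort() & 0xFFFF;
            if (id == ZIP64_EXTRA_HEADER_ID) {
                return true;
            }
            if (len > buf.remaining()) {
                return false;                      // truncated record, stop scanning
            }
            buf.position(buf.position() + len);    // skip this record's data
        }
        return false;
    }
}
```

A reader that assumes the 32-bit layout consumes 16 bytes, while a writer that used the 64-bit layout emitted 24, which leaves exactly 8 unexplained bytes before the next record. With an extension length of 0, as in the attached file, `containsZip64ExtraField` has nothing to scan and returns false.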
the file does look invalid to me, 64-bit data descriptor but no zip64 extra field:

    4.3.9.2 When compressing files, compressed and uncompressed sizes
    SHOULD be stored in ZIP64 format (as 8 byte values) when a
    file's size exceeds 0xFFFFFFFF. However ZIP64 format MAY be
    used regardless of the size of a file. When extracting, if
    the zip64 extended information extra field is present for
    the file the compressed and uncompressed sizes will be 8
    byte values.

and in any case, the file is opened by LO in "Repair" mode, so i think that's good enough, resolving NOTABUG for now.

(the Repair mode appears to "guess" if it's zip64 based on a following signature)

POI would be using Apache Commons-Compress; the code to write the data descriptor is in https://github.com/apache/commons-compress/blob/master/src/main/java/org/apache/commons/compress/archivers/zip/ZipArchiveOutputStream.java

    protected void writeDataDescriptor(final ZipArchiveEntry ze) throws IOException {
        if (!usesDataDescriptor(ze.getMethod(), false)) {
            return;
        }
        writeCounted(DD_SIG);
        writeCounted(ZipLong.getBytes(ze.getCrc()));
        if (!hasZip64Extra(ze)) {
            writeCounted(ZipLong.getBytes(ze.getCompressedSize()));
            writeCounted(ZipLong.getBytes(ze.getSize()));
        } else {
            writeCounted(ZipEightByteInteger.getBytes(ze.getCompressedSize()));
            writeCounted(ZipEightByteInteger.getBytes(ze.getSize()));
        }
    }

contains the obvious check that there is a Zip64 extra field - which the attached file doesn't have.

this has been substantially changed since 2011 when Zip64 support was introduced: https://issues.apache.org/jira/browse/COMPRESS-150

really not clear how this file was produced...
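For illustration, a "guess based on a following signature" could look roughly like the sketch below. This is not LibreOffice's Repair code; the class, enum, and method names are hypothetical. It assumes the descriptor starts with the optional signature 0x08074b50 and that the next record is either a local file header (0x04034b50) or a central directory header (0x02014b50).

```java
/** Illustrative only: guess the data descriptor width from what follows it. */
final class DescriptorWidthGuess {
    static final int DESCRIPTOR_SIG = 0x08074b50;
    static final int LOCAL_HEADER_SIG = 0x04034b50;
    static final int CENTRAL_HEADER_SIG = 0x02014b50;

    enum Width { BITS_32, BITS_64, UNKNOWN }

    /** True if a little-endian 32-bit value equal to sig sits at offset. */
    static boolean matches(byte[] data, int offset, int sig) {
        if (offset < 0 || offset + 4 > data.length) {
            return false;
        }
        int value = (data[offset] & 0xFF)
                | (data[offset + 1] & 0xFF) << 8
                | (data[offset + 2] & 0xFF) << 16
                | (data[offset + 3] & 0xFF) << 24;
        return value == sig;
    }

    static boolean isRecordStart(byte[] data, int offset) {
        return matches(data, offset, LOCAL_HEADER_SIG)
                || matches(data, offset, CENTRAL_HEADER_SIG);
    }

    /**
     * A 32-bit descriptor with signature is 16 bytes long, a 64-bit one is 24.
     * Whichever length lands on a known record signature wins; if both or
     * neither do, the width cannot be decided this way.
     */
    static Width guess(byte[] data, int descriptorOffset) {
        if (!matches(data, descriptorOffset, DESCRIPTOR_SIG)) {
            return Width.UNKNOWN;
        }
        boolean recordAfter32 = isRecordStart(data, descriptorOffset + 16);
        boolean recordAfter64 = isRecordStart(data, descriptorOffset + 24);
        if (recordAfter32 == recordAfter64) {
            return Width.UNKNOWN;
        }
        return recordAfter64 ? Width.BITS_64 : Width.BITS_32;
    }
}
```

Such a guess is only a heuristic: size bytes can coincidentally look like a signature, which is presumably why the strict (non-Repair) path relies on the Zip64 extra field instead.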
Thanks for the detail. The file was indeed generated by Apache POI. I solved the issue by calling org.apache.poi.xssf.streaming.SXSSFWorkbook#setZip64Mode with Zip64Mode.Never, which disables the Zip64 extensions. LO does not complain any more.