Created attachment 105137 [details] File in zip64 format which causes load error. Problem description: When attempting to open an XLSX file compressed using the ZIP64 format, LO gives an error stating the file is damaged. After unzipping the file and re-compressing it, LO is able to open the file. The ZIP version in the header of the original file is 0x2D, while the files produced by LO contain 0x14 as the version. Steps to reproduce: 1. Attempt to load an XLSX file in ZIP64 format. Current behavior: Error stating that file is damaged. Expected behavior: Loading the file. Operating System: Linux (Other) Version: 4.3.0.4 release
Created attachment 105138 [details] Modified (rezipped) file that opens correctly.
(In reply to comment #0) > Current behavior: > Error stating that file is damaged. Confirmed under GNU/Linux using: - v4.3.0.4 Build ID: 62ad5818884a2fc2e5780dd45466868d41009ec0 - v4.4.0.0.alpha0+ Build ID: e379401618268ed7f7f5885a36b90e1f4f6cd4af TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:master, Time: 2014-08-18_05:51:03 Status set to NEW.
Confirmed that this also affects files created with https://xlsxwriter.readthedocs.org/ if use_zip64 option is used (which is mandatory for large files).
This bug is also confirmed on version: Version: 5.0.0.5 Build ID: 00m0(Build:5) running on Linux Mint 17.2. Is zip going to be updated anytime soon? I'm running into a lot of files that have been compressed with zip64 and unzipping and rezipping is not really an end-user kind of thing. This is one of those "little" things that keep some local governments (I work for a municipality) from switching completely from a Microsoft-laden environment.
** Please read this message in its entirety before responding ** To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year. There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present. If you have time, please do the following: Test to see if the bug is still present on a currently supported version of LibreOffice (5.1.5 or 5.2.1): https://www.libreoffice.org/download/ If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior. If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System. Please DO NOT: update the version field; reply via email (please reply directly on the bug tracker); set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case). If you want to do more to help you can test to see if your issue is a REGRESSION. To do so: 1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) http://downloadarchive.documentfoundation.org/libreoffice/old/ 2. Test your bug 3. Leave a comment with your results. 4a. If the bug was present with 3.3 - set version to "inherited from OOo"; 4b. If the bug was not present in 3.3 - add "regression" to keyword Feel free to come ask questions or to say hello in our QA chat: http://webchat.freenode.net/?channels=libreoffice-qa Thank you for helping us make LibreOffice even better for everyone! Warm Regards, QA Team MassPing-UntouchedBug-20160920
As requested, I've confirmed that this bug still exists and its behaviour is the same as originally reported. I verified using LO 5.2.1.2 running on a Kubuntu 16.04 64-bit system (Linux kernel version is 4.4.0-38-generic).
Created attachment 147159 [details] XLSX without ZIP64 This file can be read in LibreOffice and Gnumeric.
Created attachment 147160 [details] XLSX with ZIP64 This file can be read in Gnumeric, but fails in LibreOffice.
This bug is still valid in LibreOffice Version: 6.1.3.2 (CPU threads: 16; OS: Linux 4.19; UI render: default; VCL: gtk3; Locale: en-US (en_US.UTF-8); Calc: threaded). I have created 2 files with the same content using Apache POI. One file is ZIP64 and can be read by Gnumeric, but not LibreOffice. The other file has been written without ZIP64 and can be read by LibreOffice 6.1.3.2
Looks like LibreOffice (Version: 6.0.7.3, Build ID: 1:6.0.7-0ubuntu0.18.04.2) has a hard requirement on just one zip version field: the central directory's "version needed to extract" (see: https://en.wikipedia.org/wiki/Zip_(file_format)#Central_directory_file_header). It looks like this version needs to be less than or equal to 30. The other version fields can be 45: the local file header's version and "version made by" in the central directory. For an Excel- and LibreOffice-compatible zip64 compressor implementation, see: https://github.com/rzymek/opczip/blob/master/src/main/java/com/github/rzymek/opczip/Zip64Impl.java
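To check which of the version fields a given file actually carries, something like the following Python sketch may help. It relies only on the standard zipfile and struct modules; the function names and the threshold constant are made up for illustration, and the value 30 is the observation from the comment above, not something taken from the LibreOffice source.

```python
import io
import struct
import zipfile

# Threshold taken from the observation above (assumption, not from LibreOffice's code).
OBSERVED_MAX_VERSION_NEEDED = 30


def version_fields(data: bytes):
    """Return (name, local_version, made_by_version, central_version_needed) per entry."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            # "Version needed to extract" sits 4 bytes into the local file header.
            (local_version,) = struct.unpack_from("<H", data, info.header_offset + 4)
            # zipfile fills create_version / extract_version from the central directory.
            rows.append((info.filename, local_version,
                         info.create_version, info.extract_version))
    return rows


def entries_over_threshold(data: bytes):
    """Names of entries whose central-directory 'version needed' exceeds the threshold."""
    return [name for name, _local, _made_by, needed in version_fields(data)
            if needed > OBSERVED_MAX_VERSION_NEEDED]


# Small in-memory archive so the sketch runs without any input file.
_buf = io.BytesIO()
with zipfile.ZipFile(_buf, "w") as _zf:
    _zf.writestr("[Content_Types].xml", "<Types/>")
sample = _buf.getvalue()

for row in version_fields(sample):
    print(row)
```

For a real file, pass the result of reading the whole xlsx into memory; that is fine for triage but not ideal for very large archives.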
still repro in Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community Build ID: 7a0e0a84a02f505200331c19b28d45e898cd5a12 CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win Locale: ru-RU (ru_RU); UI: ru-RU Calc: threaded Jumbo
*** Bug 98836 has been marked as a duplicate of this bug. ***
When will this BUG be solved?
*** Bug 143958 has been marked as a duplicate of this bug. ***
For bugs that should be marked as a duplicate of this bug, debugging should show the exception being thrown in: https://opengrok.libreoffice.org/xref/core/package/source/zipapi/ZipFile.cxx?r=d0a8d4a9#946
For ZIP64 specs, see https://users.cs.jmu.edu/buchhofp/forensics/formats/pkzip.html and https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
The easiest way to identify whether an xlsx (zip) file is in ZIP64 format, under Linux, seems to be:
$ xxd ./test.xlsx
00000000: 504b 0304 2d00 0000 0800 4155 6e48 3c5c  PK..-.....AUnH<\
00000010: b548 ffff ffff ffff ffff 1300 1400 5b43  .H............[C
......
Each column is 2 bytes. According to https://users.cs.jmu.edu/buchhofp/forensics/formats/pkzip.html:
1. The first 4 bytes, "504b 0304", indicate that it is a zip file.
2. Bytes 19-22 denote the "Compressed size". If the archive is in ZIP64 format, then this is "ffff ffff".
3. Bytes 23-26 denote the "Uncompressed size". If the archive is in ZIP64 format, then this is also "ffff ffff".
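The same check can be scripted. The Python sketch below decodes the fixed 30-byte local file header and reports whether both size fields hold the 0xffffffff marker. The function name and the returned keys are made up for illustration; the field layout follows the pkzip.html page linked above. It only looks at the first entry, so an archive that uses ZIP64 only for later entries would not be caught.

```python
import struct

# Fixed part of a local file header: signature, version, flags, method,
# mod time, mod date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length (30 bytes, little-endian).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")
ZIP64_MARKER = 0xFFFFFFFF


def inspect_first_header(data: bytes) -> dict:
    """Decode the first local file header and flag the ZIP64 size markers."""
    if len(data) < LOCAL_HEADER.size:
        raise ValueError("need at least 30 bytes")
    (sig, version, _flags, _method, _mtime, _mdate, _crc,
     compressed, uncompressed, _name_len, extra_len) = LOCAL_HEADER.unpack_from(data)
    if sig != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    return {
        "version_needed": version,
        "sizes_deferred_to_zip64": (compressed == ZIP64_MARKER
                                    and uncompressed == ZIP64_MARKER),
        "extra_len": extra_len,
    }


# The first 30 bytes from the xxd output above.
dump = bytes.fromhex("504b03042d000000080041556e483c5cb548ffffffffffffffff13001400")
print(inspect_first_header(dump))
```

Note that the byte positions 19-22 and 23-26 in the list above count from 1; struct offsets count from 0, so the same fields start at offsets 18 and 22.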
(In reply to Kevin Suo from comment #18) > According to https://users.cs.jmu.edu/buchhofp/forensics/formats/pkzip.html: > > 1. The first 4 bytes, "504b 0304" indicates that it is a zip file. > 2. Bytes 19-22 denotes to "Compressed size". If archive is in ZIP64 format, > then this is "ffff ffff". > 3. Bytes 23-26 denotes to "Uncompressed size". If archive is in ZIP 64 > format, then this is also "ffff ffff". I don't think reading the zip file headers by hand is necessary, as they already seem to be read in https://opengrok.libreoffice.org/xref/core/package/source/zipapi/ZipFile.cxx?r=d0a8d4a9#938 where nCompressedSize and nSize should correspond to the "Compressed size" and "Uncompressed size" above. (In reply to Kevin Suo from comment #16) > File bugs which should be marked as a duplicate of this bug, in debugging > the exception should be in: > https://opengrok.libreoffice.org/xref/core/package/source/zipapi/ZipFile.cxx?r=d0a8d4a9#946 And here it tests whether nCompressedSize or nSize is "0xffffffff" to see if Zip64 is needed, throwing an exception if so.
(In reply to Ming Hua from comment #19) I pointed this out to help QA determine whether a given bug of this kind is due to the use of a backslash as the file name separator (bug 76115), the use of ZIP64 (this bug), or other reasons, and mark it as a duplicate of the correct bug accordingly. Yes, you are right: currently, if the file is zipped using ZIP64, it throws an exception in https://opengrok.libreoffice.org/xref/core/package/source/zipapi/ZipFile.cxx?r=d0a8d4a9#950. That is a FIXME and should be implemented.
(In reply to Kevin Suo from comment #20) > I pointed this out to help QA to determine whether a certain such bug is due > to the using of backslash as file name separator (bug 76115), the use of > ZIP64 (this bug), or other reasons, and mark as duplicate to the correct bug > accordingly. For QA purposes I think there are easier ways to identify files using the zip64 format. On Linux, zipinfo has a "-v" option which outputs a line like A subfield with ID 0x0001 (PKWARE 64-bit sizes) and 16 data bytes: d3 02 00 00 00 00 00 00 ef 00 00 00 00 00 00 00. for each file that is compressed with the zip64 format. On Windows, the 7-zip UI for archive content has a column indicating whether the file is using the zip64 format.
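For reference, the 16 data bytes that zipinfo prints can be decoded as two little-endian 64-bit integers. The Python sketch below walks an extra field, finds the subfield with ID 0x0001 and returns the sizes. The function names are made up for illustration, and the field order (uncompressed size first, then compressed size) is taken from the ZIP64 section of APPNOTE.TXT. It assumes both sizes are present, as in a local file header; in the central directory the subfield may carry only the values that overflowed.

```python
import struct

ZIP64_EXTRA_ID = 0x0001  # "PKWARE 64-bit sizes" in zipinfo's wording


def iter_extra_subfields(extra: bytes):
    """Yield (id, payload) for each subfield of a ZIP extra field."""
    pos = 0
    while pos + 4 <= len(extra):
        field_id, size = struct.unpack_from("<HH", extra, pos)
        yield field_id, extra[pos + 4:pos + 4 + size]
        pos += 4 + size


def zip64_sizes(extra: bytes):
    """Return (uncompressed, compressed) from the ZIP64 subfield, or None."""
    for field_id, payload in iter_extra_subfields(extra):
        if field_id == ZIP64_EXTRA_ID and len(payload) >= 16:
            uncompressed, compressed = struct.unpack_from("<QQ", payload)
            return uncompressed, compressed
    return None


# The 16 data bytes quoted from the zipinfo output above, wrapped in a subfield header.
payload = bytes.fromhex("d302000000000000ef00000000000000")
extra = struct.pack("<HH", ZIP64_EXTRA_ID, len(payload)) + payload
print(zip64_sizes(extra))
```

This is roughly the information a reader has to pick up when the 32-bit size fields hold 0xffffffff.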
From a person migrating from MSO to LibreOffice, I understand that this 7-year-old bug is a key issue stopping them, because the ERP software they use, Kingdee, uses the zip64 format when generating xlsx files, and they simply cannot open any xlsx files exported from their ERP using LibreOffice. A workaround is to unzip the xlsx and then zip it again. The following code comment: // FIXME64: need to read the 64bit header instead in https://opengrok.libreoffice.org/xref/core/package/source/zipapi/ZipFile.cxx?r=d0a8d4a9#946 indicates that this is a core feature not yet implemented.
Attila Szűcs committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/abda72eeac19b18c22f57d5443c3955a463605d7 tdf#82984 tdf#94915 zip64 support (import + export) It will be available in 7.6.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
The issue seems to be solved in 7.6 for comment 0's Test-Original.xlsx. However, I still could not open comment 9's test62872_Zip64Mode.Always.xlsx.
(In reply to Justin L from comment #24) > However, I still could not open comment 9's test62872_Zip64Mode.Always.xlsx. Can that file really be opened by Gnumeric, as claimed in comment 9?
(In reply to ady from comment #25) > Can that file really be opened by Gnumeric, as claimed in comment 9? It is problematic, but recoverable in MS Excel 2010, while the non-zip64 version from comment 8 (test62872_Zip64Mode.AsNeeded.xlsx) opens without any warning. Given that test62872_Zip64Mode.Always.xlsx is not a valid file, I think this issue can be closed as fixed.