Bug 165694 - FILEOPEN ODT Specific document claimed corrupt (exception: Zip file has holes)
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 24.2.0.3 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:25.8.0
Keywords: bibisected, bisected
Depends on:
Blocks:
 
Reported: 2025-03-11 19:24 UTC by Aron Budea
Modified: 2025-03-22 17:29 UTC

Description Aron Budea 2025-03-11 19:24:11 UTC
Open attachment 118748 from bug 94247.
Since the below commit in 24.8, it shows a "The file is corrupt and therefore cannot be opened" warning.

https://cgit.freedesktop.org/libreoffice/core/commit/?id=efae4fc42d5fe3c0a69757226f38efc10d101194
https://git.libreoffice.org/core/commit/efae4fc42d5fe3c0a69757226f38efc10d101194
author		Michael Stahl <michael.stahl@allotropia.de>	2024-07-16 12:12:09 +0200
committer	Michael Stahl <michael.stahl@allotropia.de>	2024-07-16 15:57:43 +0200

"package: add additional consistency checks for local file header"

Still occurs in LO Version: 25.8.0.0.alpha0+ (36a6bf8bd65e2ece92a4354e9f6d8e47e0f03d84) / Windows.

Opening the file triggers the following exception:
throw ZipException(u"Zip file has holes! It will leak!"_ustr);
https://opengrok.libreoffice.org/xref/core/package/source/zipapi/ZipFile.cxx?r=22f46bd89f6afc95438acf09fb5ff948f4ca8e02#1530

Testing zip with 'unzip -t' shows no issues.
Tentatively marking as regression, until there's a claim it's the archive that's broken.
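
For readers who want to reproduce the 'unzip -t' style check without the command-line tool, here is a minimal Python sketch. It only reads each member listed in the central directory and verifies its CRC-32, so a pass says nothing about stray or unreferenced bytes between entries. The function name passes_crc_test is made up for this illustration.

```python
import io
import zipfile


def passes_crc_test(data):
    """Rough analogue of `unzip -t`: read every member and verify its CRC-32.

    `data` is the whole archive as bytes. Only members listed in the
    central directory are visited.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # testzip() returns the name of the first bad member, or None.
        return zf.testzip() is None
```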
Comment 1 Aron Budea 2025-03-11 19:36:51 UTC
To add a few more examples that fail in the same place:
- https://bz.apache.org/ooo/attachment.cgi?id=22418 from https://bz.apache.org/ooo/show_bug.cgi?id=42371
- attachment 90668 from bug 72643 (this is an externally generated file)
- attachment 161921 from bug 133930.
Comment 2 Michael Stahl (allotropia) 2025-03-12 17:49:08 UTC
tdf94247-1.odt:

this one has a data descriptor on the local file header but the flag bit isn't set:

00000000: 504b 0304 0a00 0000 0000 514d 2f47 5ec6  PK........QM/G^.
                         ^ data descriptor bit is not set in 2 flag bytes
00000010: 320c 2700 0000 2700 0000 0800 0000 6d69  2.'...'.......mi
00000020: 6d65 7479 7065 6170 706c 6963 6174 696f  metypeapplicatio
00000030: 6e2f 766e 642e 6f61 7369 732e 6f70 656e  n/vnd.oasis.open
00000040: 646f 6375 6d65 6e74 2e74 6578 745e c632  document.text^.2
                                          ^ DD start
00000050: 0c27 0000 0027 0000 00                   .'...'...
                                ^ DD end

i don't think this is valid.

the meta.xml doesn't contain a generator.
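
The condition in the dump above can be checked mechanically. The following is a minimal Python sketch, not the LibreOffice code: it parses the 30-byte local file header, and if general purpose flag bit 3 is clear it looks at the bytes right after the member data to see whether they repeat the header's CRC-32 and sizes, with or without the optional PK\x07\x08 signature. The function name stray_data_descriptor is hypothetical; SAMPLE is the 89 bytes of the hex dump above.

```python
import struct

LFH = struct.Struct("<IHHHHHIIIHH")  # 30-byte ZIP local file header
LFH_SIG = 0x04034B50                 # "PK\x03\x04"
DD_SIG = 0x08074B50                  # "PK\x07\x08", optional descriptor signature
DD_FLAG = 0x0008                     # general purpose bit 3

# Bytes 0x00..0x58 of tdf94247-1.odt as shown in the dump.
SAMPLE = bytes.fromhex(
    "504b03040a0000000000514d2f475ec6"
    "320c2700000027000000080000006d69"
    "6d65747970656170706c69636174696f"
    "6e2f766e642e6f617369732e6f70656e"
    "646f63756d656e742e746578745ec632"
    "0c2700000027000000"
)


def stray_data_descriptor(buf, offset=0):
    """True if a data descriptor follows the entry although bit 3 is clear."""
    (sig, _ver, flags, _method, _time, _date,
     crc, csize, usize, nlen, elen) = LFH.unpack_from(buf, offset)
    if sig != LFH_SIG:
        raise ValueError("not a local file header")
    if flags & DD_FLAG:
        return False  # descriptor is announced, so it is not unexpected
    # With bit 3 clear the header sizes are valid, so the data end is known.
    end = offset + LFH.size + nlen + elen + csize
    expected = struct.pack("<III", crc, csize, usize)
    tail = buf[end:end + 16]
    if tail[:4] == struct.pack("<I", DD_SIG):
        return tail[4:16] == expected
    return tail[:12] == expected


print(stray_data_descriptor(SAMPLE))
```

For this sample the header says 8 name bytes and 0x27 data bytes, so the data ends at 0x4d, which is where the dump marks "DD start".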
Comment 3 Michael Stahl (allotropia) 2025-03-12 18:01:51 UTC
https://bz.apache.org/ooo/attachment.cgi?id=22418

00000ad0:                                 50 4b03  ...e.........PK.
00000ae0: 0414 0000 0008 006d 694a 3281 ae9f 5a66  .......miJ2...Zf
                 ^^ ^^ flags with data descriptor bit not set
00000af0: 0200 00c8 0800 000b 0000 0063 6f6e 7465  ...........conte
00000b00: 6e74 2e78 6d6c a556 4d8f da30 10bd f757  nt.xml
...
00000d60: b641 1c74 a11b eaf8 1f32 fb0f 504b 0708  .A.t.....2..PK..
                                        ^ DD signature
00000d70: 81ae 9f5a 6602 0000 c808 0000 504b 0304  ...Zf.......PK..
                                       ^ DD end   

same problem, this one is "OpenOffice.org/1.9.77$Linux OpenOffice.org_project/680m77$Build-8871" - unreleased version.
Comment 4 Michael Stahl (allotropia) 2025-03-12 18:13:00 UTC
tdf72643-1.odt, same problem:

00000000: 504b 0304 0a00 0000 0000 257d 8c43 5ec6  PK........%}.C^.
                         ^^^^ DD bit not set in flags
00000010: 320c 2700 0000 2700 0000 0800 0000 6d69  2.'...'.......mi
00000020: 6d65 7479 7065 6170 706c 6963 6174 696f  metypeapplicatio
00000030: 6e2f 766e 642e 6f61 7369 732e 6f70 656e  n/vnd.oasis.open
00000040: 646f 6375 6d65 6e74 2e74 6578 745e c632  document.text^.2
                                          ^ DD start
00000050: 0c27 0000 0027 0000 00                   .'...'...
                                ^ DD end

no generator. the bug says "generated by DokuWiki's ODT export plugin".
Comment 5 Michael Stahl (allotropia) 2025-03-12 18:40:11 UTC
tdf133930-1.odt, now this is a different problem...

$1 = std::__debug::vector of length 1, capacity 2 = {{
    first = 0x7782,
    second = 0x8827
  }}

00007780:      504b 0304 1400 0000 0800 8f78 cc50    PK.........x.P
00007790: 0000 0000 0000 0000 0000 0000 0a00 0000  ................
000077a0: 7374 796c 6573 2e78 6d6c ed5d db72 dbc8  styles.xml.].r..

central directory entry:

0000c840:           504b 0102 3403 1400 0000 0800      PK..4.......
0000c850: 8f78 cc50 334c d484 c310 0000 d1b0 0000  .x.P3L..........
0000c860: 0a00 0000 0000 0000 0100 0000 a481 b1b5  ................
                                             ^^^^ offset
0000c870: 0000 7374 796c 6573 2e78 6d6c            ..styles.xml
          ^^^^ offset

so at position 0xb5b1 we find:

0000b5b0:   50 4b03 0414 0000 0008 008f 78cc 5033   PK.........x.P3
0000b5c0: 4cd4 84c3 1000 00d1 b000 000a 0000 0073  L..............s
0000b5d0: 7479 6c65 732e 786d 6ced 5d6d 939b 3812  tyles.xml.]m..8.

another local file header for a "styles.xml"!

well having 2 of those is certainly a problem.

meta.xml has a generator: "LibreOffice/6.0.1.1$MacOSX_X86_64 LibreOffice_project/60bfb1526849283ce2491346ed2aa51c465abfe6"

possibly the user used some zip tool on the file? a bit concerning if it was actually produced by LO...

oh forgot to mention: all of the files can be "repaired" so i think the package import is working fine for now...
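
To illustrate what a "hole" means here, the following Python sketch lists byte ranges that lie before the last entry but are not covered by any entry in the central directory. It is an assumption that the pair printed in the debugger output above (0x7782, 0x8827) is such a range; the sketch is not how ZipFile.cxx implements the check, and find_holes is a made-up name.

```python
import io
import struct
import zipfile

LFH = struct.Struct("<IHHHHHIIIHH")  # 30-byte ZIP local file header


def find_holes(data):
    """Return (start, end) ranges not covered by any central directory entry.

    `data` is the whole archive as bytes. Only gaps up to the end of the
    last referenced entry are reported.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        entries = zf.infolist()

    spans = []
    for info in entries:
        # Name and extra lengths can differ between the local header and
        # the central directory, so read them from the local header.
        fields = LFH.unpack_from(data, info.header_offset)
        nlen, elen = fields[9], fields[10]
        end = info.header_offset + LFH.size + nlen + elen + info.compress_size
        if info.flag_bits & 0x0008:
            # Announced data descriptor: 12 bytes, or 16 with signature.
            end += 16 if data[end:end + 4] == b"PK\x07\x08" else 12
        spans.append((info.header_offset, end))

    holes, pos = [], 0
    for start, end in sorted(spans):
        if start > pos:
            holes.append((pos, start))
        pos = max(pos, end)
    return holes
```

An orphaned second local file header for "styles.xml", as in tdf133930-1.odt, would show up as such a gap because the central directory only points at one of the two headers.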
Comment 6 Aron Budea 2025-03-12 19:11:59 UTC
Thanks for checking the samples and the explanation, Michael!

(In reply to Michael Stahl (allotropia) from comment #2)
> tdf94247-1.odt:
> 
> this one has a data descriptor on the local file header but the flag bit
> isn't set:

(In reply to Michael Stahl (allotropia) from comment #5)
> tdf133930-1.odt, now this is a different problem...
> [...]
> another local file header for a "styles.xml"!
> 
> well having 2 of those is certainly a problem.

Are these issues something that could use further consistency checks? Or does the exception about holes occur first and hide further failures?
Also, would it be possible to refine the message "Zip file has holes! It will leak!" or add a comment in the code explaining what that could indicate?
Comment 7 Commit Notification 2025-03-18 13:38:18 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/59a18d3c3ca4f4543e0de616eaf7e5bdc4488383

tdf#165694 package: log specific message for missing data descriptor bit

It will be available in 25.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 8 Michael Stahl (allotropia) 2025-03-18 13:39:23 UTC
okay have added one more specific error message to speed up future investigations
Comment 9 Aron Budea 2025-03-22 15:23:16 UTC
Thanks for that!
Comment 10 Aron Budea 2025-03-22 17:29:33 UTC
(In reply to Aron Budea from comment #0)
> Since the below commit in 24.8, it shows a "The file is corrupt and
> therefore cannot be opened" warning.
And a small correction: the original commit was for 25.2, and was backported to 24.8 and 24.2.