Bug 162682 - FILEOPEN Writer claims a certain DOCX is corrupted
Summary: FILEOPEN Writer claims a certain DOCX is corrupted
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 24.8.0.3 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Depends on:
Blocks: DOCX-SAXParse Regressions-ooxml-handle-first-hdr-ftr
Reported: 2024-08-29 10:15 UTC by Salih
Modified: 2024-09-16 11:23 UTC

See Also:
Crash report or crash signature:


Attachments
damage word file (296.46 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2024-09-05 06:25 UTC, Salih

Description Salih 2024-08-29 10:15:08 UTC
Description:
When I convert a Word file to PDF, soffice.exe reports a 'damaged Word file' error in the new version (24.8.0.3). With the old version (7.6.7.2), the same conversion does not produce any error.

Actual Results:
soffice.exe reports that the Word file is damaged. I can share the Word file that triggers the error.

Expected Results:
The Word file should be converted to PDF without any error.


Reproducible: Always


User Profile Reset: No

Additional Info:
I shouldn't get any error
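
For anyone trying to reproduce this outside the GUI, the following is a minimal sketch of a headless DOCX-to-PDF conversion driven from Python. The report does not say how the conversion was invoked, so the command line here is an assumption: it uses the standard `soffice` options `--headless`, `--convert-to` and `--outdir`. The file name `sample.docx`, the helper names, and the timeout value are hypothetical. The timeout is there because a later comment notes that some builds hang on this file instead of reporting an error.

```python
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Build the soffice command line for a headless PDF conversion."""
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def expected_pdf_path(docx_path, out_dir):
    """soffice writes <input stem>.pdf into the output directory."""
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")


def convert_to_pdf(docx_path, out_dir, soffice="soffice", timeout=120):
    """Return True if the conversion finished and produced a PDF.

    A hang (timeout) or a missing output file is treated as a failure.
    """
    cmd = build_convert_command(docx_path, out_dir, soffice)
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0 and expected_pdf_path(docx_path, out_dir).exists()


if __name__ == "__main__":
    # Hypothetical input file; replace with the attachment from this bug.
    ok = convert_to_pdf("sample.docx", "out")
    print("converted" if ok else "conversion failed or timed out")
```

Running the same script against two installations (by passing a different `soffice` path) is one way to compare an affected build with an unaffected one.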
Comment 1 m_a_riosv 2024-08-29 13:38:08 UTC
Please attach a sample file, reduced in size as much as possible and without private information, and paste the information from Menu/Help/About LibreOffice (there is a copy icon).
Comment 2 Salih 2024-09-05 06:25:55 UTC
Created attachment 196242
damage word file
Comment 3 m_a_riosv 2024-09-05 21:42:08 UTC
Reproducible with
Version: 24.2.0.0.alpha1 (X86_64) / LibreOffice Community
Build ID: 06946980c858649160c634007e5fac9a5aa81f38
CPU threads: 16; OS: Windows 10.0 Build 22631; UI render: Skia/Vulkan; VCL: win
Locale: es-ES (es_ES); UI: es-ES
Calc: CL threaded

up to
Version: 24.8.1.1 (X86_64) / LibreOffice Community
Build ID: ef51c4a0cd35185debf25ad9d0db6a1c14bed5a0
CPU threads: 16; OS: Windows 11 X86_64 (10.0 build 22631); UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: en-US
Calc: CL threaded

But not with
Version: 7.6.7.2 (X86_64) / LibreOffice Community
Build ID: dd47e4b30cb7dab30588d6c79c651f218165e3c5
CPU threads: 16; OS: Windows 10.0 Build 22631; UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: en-US
Calc: CL threaded

nor with master
Version: 25.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 1b61a0737e3600aadf42f28a15c70aface9ab61e
CPU threads: 16; OS: Windows 11 X86_64 (10.0 build 22631); UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: en-US
Calc: CL threaded

It seems something in master has fixed the issue, but the fix has not been backported.
Still reproducible with
Version: 24.8.2.0.0+ (X86_64) / LibreOffice Community
Build ID: dde9a8b6c6841e89bec958dcfa1bb36d8728598f
CPU threads: 16; OS: Windows 11 X86_64 (10.0 build 22631); UI render: Skia/Vulkan; VCL: win
Locale: es-ES (es_ES); UI: es-ES
Calc: CL threaded
Comment 4 Aron Budea 2024-09-15 05:33:43 UTC
I assume the problem isn't related to PDF export, as the file can't be opened in Writer either. There are three relevant commits here.
First, opening the file started to hang with the following commit in 24.2:

https://git.libreoffice.org/core/commit/7d08767b890e723cd502b1c61d250924f695eb98
Author:     László Németh <nemeth@numbertext.org>
AuthorDate: Mon Oct 16 19:39:30 2023 +0200
Commit:     László Németh <nemeth@numbertext.org>
CommitDate: Tue Oct 17 10:37:23 2023 +0200

    "tdf#130088 tdf#119908 smart justify: fix DOCX line count + compat opt."

Then, after the following commit, also in 24.2, the file can't be opened anymore, Writer claims it's corrupted:
https://git.libreoffice.org/core/commit/4b0fa253a4540f5461397815d290586f9ddabe61
Author:     Tomaž Vajngerl <tomaz.vajngerl@collabora.co.uk>
AuthorDate: Tue Nov 28 13:46:21 2023 +0900
Commit:     Tomaž Vajngerl <quikee@gmail.com>
CommitDate: Fri Dec 1 08:26:38 2023 +0100

    "tdf#136472 adjust ooxml import to handle first header/footer"

And finally, in current master branch, towards 25.2, the following commit fixes the issue when opening the file:
https://git.libreoffice.org/core/commit/14599c131345cc8847cdb72cfd6fe9e8239d5d1f
Author:     Noel Grandin <noel.grandin@collabora.co.uk>
AuthorDate: Mon Aug 19 14:23:16 2024 +0200
Commit:     Noel Grandin <noel.grandin@collabora.co.uk>
CommitDate: Mon Aug 19 18:10:09 2024 +0200

    "tdf#158556 only set header flags once"

Noel, do you think backporting this fix to 24.8/24.2 is safe?
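
To check whether the fix has reached a given release branch, one option is to search that branch's history for the commit subject, since a backport is normally a cherry-pick with a different hash. The sketch below assumes a local clone of the core repository and uses only standard `git log` options (`--oneline`, `--fixed-strings`, `--grep`). The branch name `libreoffice-24-8`, the clone path, and the helper names are assumptions for illustration.

```python
import subprocess

# Subject line of the fixing commit quoted above.
FIX_SUBJECT = "tdf#158556 only set header flags once"


def build_log_command(branch, subject=FIX_SUBJECT):
    """Build a git log command that lists commits on `branch` whose
    message contains `subject` as a literal string."""
    return [
        "git", "log", "--oneline", "--fixed-strings",
        f"--grep={subject}", branch,
    ]


def branch_has_fix(repo_dir, branch, subject=FIX_SUBJECT):
    """Return True if any commit on `branch` mentions `subject`."""
    result = subprocess.run(
        build_log_command(branch, subject),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())


if __name__ == "__main__":
    # Hypothetical clone location and assumed branch names.
    for branch in ("master", "libreoffice-24-8"):
        print(branch, branch_has_fix("core", branch))
```

A subject match only shows that a commit with that message exists on the branch; it does not say anything about whether the backport is safe.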
Comment 5 Noel Grandin 2024-09-16 09:49:35 UTC
(In reply to Aron Budea from comment #4)
> And finally, in current master branch, towards 25.2, the following commit
> fixes the issue when opening the file:
> https://git.libreoffice.org/core/commit/14599c131345cc8847cdb72cfd6fe9e8239d5d1f
> Author:     Noel Grandin <noel.grandin@collabora.co.uk>
> AuthorDate: Mon Aug 19 14:23:16 2024 +0200
> Commit:     Noel Grandin <noel.grandin@collabora.co.uk>
> CommitDate: Mon Aug 19 18:10:09 2024 +0200
> 
>     "tdf#158556 only set header flags once"
> 
> Noel, do you think backporting this fix to 24.8/24.2 is safe?

No, according to Justin Luth, header/footer import for DOCX is a morass, and that commit might cause other regressions.
Comment 6 Justin L 2024-09-16 11:23:02 UTC
(In reply to Noel Grandin from comment #5)
> No, according to Justin Luth, header/footer import for DOCX is a morass, and
> that commit might cause other regressions.
And that header/footer opinion is based on the meta bug 161381 that was created just for that one commit.