Bug 155682 - DOCX with big pictures causes endless loop
Summary: DOCX with big pictures causes endless loop
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 5.0.0.5 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Miklos Vajna
URL:
Whiteboard: target:24.2.0 target:7.6.3
Keywords: bibisected, bisected, regression
Depends on:
Blocks: CPU-AT-100% DOCX-Floatingtable
Reported: 2023-06-05 08:33 UTC by Andrey Pivovarov
Modified: 2023-11-08 00:15 UTC

See Also:
Crash report or crash signature:


Attachments
The file causing the bug to occur. (14.85 MB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-06-05 08:34 UTC, Andrey Pivovarov
Minimized example file (30.16 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-07-03 22:23 UTC, Gabor Kelemen (allotropia)
The minimal example in Word 2016 and fresh master + process monitor (117.86 KB, image/png)
2023-07-03 22:27 UTC, Gabor Kelemen (allotropia)

Description Andrey Pivovarov 2023-06-05 08:33:30 UTC
Description:
When I open the attached DOCX file, new pages start being appended to the end of the document, and this process never seems to end. The DOCX was created in Word 2007 and contains many big pictures that each occupy a whole page; some of them extend slightly beyond the page boundaries (Word allows that). The same applies to the tables used in the footer of each page.

Steps to Reproduce:
1. Open the attached file
2. The document is successfully loaded
3. Start scrolling it down

Actual Results:
Blank pages endlessly added to the end of the document.

Expected Results:
No new pages are added, and the big pictures are located on their corresponding pages. Ideally, the footer table, the side table and the page border would not be skewed either.


Reproducible: Always


User Profile Reset: No

Additional Info:
At a minimum, do not add new pages to the end, so that the document is at least minimally editable.
Comment 1 Andrey Pivovarov 2023-06-05 08:34:59 UTC
Created attachment 187724
The file causing the bug to occur.
Comment 2 Dieter 2023-06-26 16:11:33 UTC
I confirm the problem, but you should try to narrow it down. If you think it is caused by the big pictures, please delete, for example, all headers and footers. I tried to do that, but I couldn't edit the document.

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 069c7dc4e9706b40ca12d83d83f90f41cec948f8
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-GB
Calc: CL threaded
Comment 3 Gabor Kelemen (allotropia) 2023-07-03 22:23:52 UTC
Created attachment 188194
Minimized example file

Managed to reduce it to 2 pages of as-char images, with a frame containing page numbering in the header.
Both images and the header frame are needed for the endless loop, which also increases memory use by about 1 MB every 2 seconds.
Comment 4 Gabor Kelemen (allotropia) 2023-07-03 22:27:03 UTC
Created attachment 188195
The minimal example in Word 2016 and fresh master + process monitor

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: e4e5fb4b2935e395c7e4b3a794d544a6f44709ce
CPU threads: 15; OS: Windows 10.0 Build 19045; UI render: default; VCL: win
Locale: de-DE (hu_HU); UI: de-DE
Calc: threaded
Comment 5 Gabor Kelemen (allotropia) 2023-07-03 22:44:58 UTC
This started to loop in 5.0 with commit:

https://cgit.freedesktop.org/libreoffice/core/commit/?id=81ef96a2417c7843dfed0558c920ad3064e58921

author	Miklos Vajna <vmiklos@collabora.co.uk>	2015-06-01 09:03:05 +0200
committer	Miklos Vajna <vmiklos@collabora.co.uk>	2015-06-01 09:14:12 +0200
commit 81ef96a2417c7843dfed0558c920ad3064e58921 (patch)

tdf#79639 DOCX import: don't delay text frame conversion of in-header tables

Before this the layout was a bit different than in Word. Also the page number in the header is inside a floating table.

Adding CC to: Miklos Vajna
Comment 6 ysui2022 2023-07-05 19:48:48 UTC
I confirm the problem. When I opened this file, it started adding new pages to the end rapidly, and then LibreOffice crashed.
Version: 7.5.4.2 (X86_64) / LibreOffice Community
Build ID: 36ccfdc35048b057fd9854c757a8b67ec53977b6
CPU threads: 8; OS: Windows 10.0 Build 22621; UI render: Skia/Vulkan; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: CL threaded
Comment 7 Commit Notification 2023-10-25 07:34:40 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/9704f61982360ce35983a61cca3fd00bbdf51ab6

tdf#155682 sw floattable: fix DOCX with big pictures causes endless loop

It will be available in 24.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 8 Commit Notification 2023-10-25 09:01:49 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-7-6":

https://git.libreoffice.org/core/commit/e2076cf7a92694bc94bdc9f3173c2bddbe881a89

tdf#155682 sw floattable: fix DOCX with big pictures causes endless loop

It will be available in 7.6.3.

Comment 9 Dieter 2023-11-05 13:48:11 UTC
(In reply to Gabor Kelemen (allotropia) from comment #3)
> Created attachment 188194
> Minimized example file

The endless loop is still there.

The status bar shows 1111 pages at first, then 9955, 16775, 16907, 18854, 19030, ...

Tested with
Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: c0c8cffd3541e3cd616c96791b04e7ebf2b2ed03
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-GB
Calc: CL threaded

=> REOPENED
Comment 10 Miklos Vajna 2023-11-06 08:41:10 UTC
Hm, attachment 188194 still shows up here as 2 pages. Can you try in safe mode? Does 'soffice --convert-to pdf minimal.docx' also hang for you?

Possibly some additional trick is needed to still trigger the problem. In that case it's best to have a new, follow-up bug for that. Thanks.
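
For anyone who wants to script the conversion check, here is a minimal sketch that runs the command with a time limit, so a layout loop shows up as a timeout instead of blocking the shell. The helper names, the added --headless switch and the 120-second limit are assumptions for illustration, not part of the fix or of any LibreOffice tooling. It also assumes soffice is on the PATH.

```python
import subprocess


def build_convert_command(docx_path, soffice="soffice"):
    """Return the PDF conversion command for docx_path (hypothetical helper)."""
    # --headless is an addition here so no window is opened;
    # '--convert-to pdf' is the command quoted in the comment above.
    return [soffice, "--headless", "--convert-to", "pdf", docx_path]


def finishes_in_time(command, timeout_seconds):
    """Run command; return True if it exits before the timeout, else False."""
    try:
        subprocess.run(
            command,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child it started before raising this.
        return False
    return True
```

Calling finishes_in_time(build_convert_command("minimal.docx"), 120) returns False if the conversion is still running after two minutes, which would suggest the loop is still reproducible. Note that the soffice launcher may start a separate process that outlives the kill on some platforms, so a leftover process might need to be ended by hand.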
Comment 11 Miklos Vajna 2023-11-07 08:30:40 UTC
Aron says it also works for him on Linux, so this is OK in general. If we find that this still happens on Windows (or something like that), let's have a follow-up bug for it, because then that will need a separate fix. Thanks.
Comment 12 Aron Budea 2023-11-08 00:15:17 UTC
I tested with yesterday's daily build, and both files open quickly, no hang.
I adjusted the settings to match the ones in comment 9.

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 0bf60e32c0ac2bf79fad6c29c39c6f6a3f9ce8e8
CPU threads: 16; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: de-DE (hu_HU); UI: en-GB
Calc: CL threaded