Bug 148440 - DOCX or ODT with Different First Page header/footer convert-to PDF dies with 100% CPU in Linux
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 4.4.7.2 release (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, perf
Depends on:
Blocks: Performance CPU-AT-100%
Reported: 2022-04-07 10:17 UTC by max.kirmair
Modified: 2024-04-25 06:39 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Buggy word file that fails with convert-to pdf (229.30 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2022-04-07 10:17 UTC, max.kirmair

Description max.kirmair 2022-04-07 10:17:06 UTC
Created attachment 179370
Buggy word file that fails with convert-to pdf

Hi,
I have been using LibreOffice on my CentOS 7 server for a long time to convert documents to PDF.
However, I now have a problem with one Word file. As soon as I try to convert it to PDF, the process hangs. When I use “top” to view my processes, I see that soffice.bin is at 100% CPU usage. This does not change even after a longer wait, and I have to kill it manually.

My LibreOffice version: LibreOffice 5.3.6.1 30(Build:1)
This version is installed automatically on CentOS 7 with
yum install libreoffice-core libreoffice-writer libreoffice-calc libreoffice-impress

What I have tried so far:
- LibreOffice version 7.3.2
- LibreOffice version 7.2.6
- Call with --backtrace
- backtrace with debuginfo-install libreoffice-core-5.3.6.1-25.el7_9.x86_64
All installed versions show the same behavior with this Word file.
There is nothing helpful in the gdbtrace.log.
Once the process hangs, it no longer writes anything to the log file.

Here is the gdbtrace.log:
warning: Currently logging to gdbtrace.log.  Turn the logging off and on to make the new setting effective.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffdfffc700 (LWP 2769)]
[Detaching after fork from child process 2770]
warning: Corrupted shared library list: 0x67c6b0 != 0x7ffff7fb8000
[New Thread 0x7fffd84f3700 (LWP 2771)]
[New Thread 0x7fffd7cf2700 (LWP 2772)]
[Detaching after fork from child process 2773]
[New Thread 0x7fffc63c7700 (LWP 2775)]
[Thread 0x7fffc63c7700 (LWP 2775) exited]
[New Thread 0x7fffc63c7700 (LWP 2776)]
[Thread 0x7fffd84f3700 (LWP 2771) exited]


My call looks like this:
/usr/bin/soffice --headless --backtrace --convert-to pdf '/test/input/fail.docx' --outdir '/test'
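
Until the hang itself is fixed, a batch job could wrap a call like the one above in a timeout so that a stuck soffice.bin does not have to be killed by hand. The following is only a minimal sketch, assuming Python 3 is available on the server; `run_with_timeout` and the 120-second limit are hypothetical and not part of LibreOffice. It starts the command in its own process group because soffice appears to fork child processes (see the "Detaching after fork" lines in the log), and kills the whole group on timeout.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it had to be killed.

    Hypothetical helper: the command gets its own session/process group
    so that forked children are killed together with it.
    """
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The child is the leader of its new group, so its pid is the group id.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the killed process
        return None


# Example call (paths taken from the report, 120 s is an arbitrary limit):
# run_with_timeout(["/usr/bin/soffice", "--headless", "--convert-to", "pdf",
#                   "/test/input/fail.docx", "--outdir", "/test"], 120)
```

A return value of None means the conversion was aborted, so the caller can log the file name and continue with the next document.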

The attached document is the one that makes convert-to pdf hang.

I'm not sure if this is a bug or if I'm just stupid. But I hope in both cases that someone can help me.
Comment 1 Julien Nabet 2022-04-07 11:35:57 UTC
On a Debian x86-64 PC with master sources updated today, I tried the PDF export and it hangs.
I noticed these logs repeated:
warn:legacy.osl:676743:676743:sw/source/core/layout/layact.cxx:574: LoopControl_1 in SwLayAction::InternalAction
warn:sw.core:676743:676743:sw/source/core/view/vdraw.cxx:246: Trying to move anchor from invalid page - fix layouting!

I also tried converting the DOCX to ODT, which works.
Then I tried converting the ODT to PDF: it hangs too.
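
To confirm that the same warnings really repeat rather than just appear once, the captured output can be tallied. This is a rough sketch, assuming the log lines have the shape of the two quoted above (level, area, two numeric ids, source file, line number, message); `count_repeats` and the regular expression are hypothetical and only modeled on those two lines.

```python
import re
from collections import Counter

# Assumed shape, taken from the two quoted lines:
# level:area:number:number:source-file:line: message
LOG_RE = re.compile(r"(warn|info):([\w.]+):\d+:\d+:([^:\s]+):(\d+): (.*)")


def count_repeats(lines):
    """Count how often each (area, file:line, message) triple occurs."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)  # search() tolerates stray text before "warn:"
        if match:
            _level, area, path, lineno, message = match.groups()
            counts[(area, f"{path}:{lineno}", message)] += 1
    return counts


# Example: print the most frequent warnings from a saved log file
# (the file name is a placeholder):
# with open("soffice-stderr.log") as fh:
#     for key, n in count_repeats(fh).most_common(5):
#         print(n, *key)
```

A LoopControl_1 count that keeps growing while the process sits at 100% CPU would be consistent with a layout loop.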
Comment 2 Timur 2022-04-08 11:01:59 UTC
Not just headless, also GUI export.
4.4 regression commit 628a60f40283924a2709c308ce276600353605a1
    source-hash-8cf681c5049970573020d8b808c990441b9cf828
    previous source-hash-6e580f3f53ae2de086a08c8ba1958b67874eb9c5
    Author:     Miklos Vajna <vmiklos@collabora.co.uk>
        DOCX import: fix FooterBodyDistance for first pages

CC Miklos, please see.
Comment 3 Miklos Vajna 2022-05-24 06:33:46 UTC
Thanks for the bisect!

This is a situation where the layout loops. The layout was already unable to lay out this document model; the DOCX import simply did not create this model before. You could create an ODT document that builds the same doc model, load it into Writer with an old version from before the above commit, and it would also loop.

So I agree that this is a bug and that it's useful to fix it, but the above commit just makes it more visible; it's not a new problem. Adjusting keywords accordingly.
Comment 4 Timur 2022-05-24 08:38:34 UTC
(In reply to Timur from comment #2)
> Not just headless, also GUI export.
Yes, but Linux only. Windows export works. It also works if Different First Page header/footer is turned off in MSO. Adjusting the title.