Bug 151990 - Memory leak in file conversion
Summary: Memory leak in file conversion
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 6.2.8.2 release
Hardware: All Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Memory
Reported: 2022-11-10 08:21 UTC by Guntars
Modified: 2024-11-14 09:09 UTC

See Also:
Crash report or crash signature:


Attachments
Examples of file to reproduce leak and leak example (78.18 KB, application/x-zip-compressed)
2022-11-10 08:21 UTC, Guntars

Description Guntars 2022-11-10 08:21:23 UTC
Created attachment 183515
Examples of file to reproduce leak and leak example

Running a file conversion with the libreoffice (or plain soffice) executable on Linux leaks RAM quickly and saturates a single CPU thread.
Memory grows by roughly 100 MB per second, CPU usage for that thread is high from the start, and the conversion never finishes.

PC:
Ubuntu 16.04.6 LTS
LibreOffice 6.2.8.2 20(Build:2)

To reproduce, run this command (conversion from DOCX to DOC):

'libreoffice' '--convert-to' 'doc' '/tmp/d-194ceec5-f443-48d9-9d1d-f825dc87d7f1/result/gened.docx' '--outdir' '/tmp/d-194ceec5-f443-48d9-9d1d-f825dc87d7f1/result'
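
Since the conversion never finishes and memory keeps growing, it may be safer to reproduce it under a timeout and an address-space cap. The following is only a sketch for Linux: `run_guarded`, the 60 second timeout, and the 2 GiB cap are illustrative assumptions, not part of LibreOffice. Pass it the same argument list as the command above.

```python
import os
import resource
import signal
import subprocess

def run_guarded(cmd, timeout_s=60, max_bytes=2 * 1024**3):
    """Run cmd with a wall-clock timeout and an address-space cap (Linux).

    Returns the exit code, or None if the process group had to be killed
    after the timeout. The limits are arbitrary illustrative defaults.
    """
    def limit_memory():
        # Runs in the child before exec; the limit is inherited by any
        # processes the launcher starts. Never exceed the existing hard limit.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        limit = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

    # start_new_session puts the child in its own process group, so a
    # launcher and the processes it spawns can be killed together.
    proc = subprocess.Popen(cmd, preexec_fn=limit_memory, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()
        return None
```

A return value of None would correspond to the hang described in this report; with the cap in place, the process may instead abort once it hits the memory limit.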

The cause:

The problem seems to occur when word/document.xml contains a footnote and text inside the same <w:r> element. More specifically, it only occurs when the text element is longer than some limit (see leak.png and noleak.png in the attachment).
There seems to be no leak when footnotes and text are separated.

Microsoft Word seems to separate footnotes from text elements.
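
To check whether a given docx has this structure before converting it, something like the sketch below could be used. It assumes the footnote marker is a <w:footnoteReference> element sitting next to <w:t> inside one <w:r>; the attached files may differ in detail, and the function names are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def find_mixed_runs(document_xml):
    """Return the text of every <w:r> that directly contains both
    <w:t> and <w:footnoteReference> children."""
    root = ET.fromstring(document_xml)
    mixed = []
    for run in root.iter(f"{{{W_NS}}}r"):
        texts = run.findall(f"{{{W_NS}}}t")
        refs = run.findall(f"{{{W_NS}}}footnoteReference")
        if texts and refs:
            mixed.append("".join(t.text or "" for t in texts))
    return mixed

def find_mixed_runs_in_docx(path):
    """Read word/document.xml from a docx (a zip archive) and scan it."""
    with zipfile.ZipFile(path) as zf:
        return find_mixed_runs(zf.read("word/document.xml"))
```

The length of each returned string can then be compared between the leaking and non-leaking files, since the report suggests the hang depends on how long the text is.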

What would be expected:
If the docx has invalid formatting (assuming this leak is caused by invalid formatting), there could be two outcomes:
1. The process raises an error and exits
2. LibreOffice tries to fix the formatting and then converts the file
Comment 1 Julien Nabet 2022-11-10 09:46:38 UTC
LO 6.2 is old; please try a recent LO version, 7.3.7 or 7.4.2 (see https://launchpad.net/~libreoffice/+archive/ubuntu/ppa).
Also, I think you should upgrade Ubuntu; if you want to stick to LTS, there is 22.04.
Comment 2 Guntars 2022-11-10 10:00:50 UTC
Good news: this leak can be reproduced 1:1 on:

PC:
Ubuntu 22.04.1 LTS
LibreOffice 7.4.2.3 40(Build:3)

P.S. Sometimes it's not easy to upgrade to newer versions :)
Comment 3 Julien Nabet 2022-11-10 10:27:46 UTC
Thank you for the feedback.

On pc Debian x86-64 with master sources updated today, I could reproduce this.

Remark: with the noleak doc, there's no freeze, but the result doesn't correspond to the docx. Indeed, the first indice link is just "t" instead of the whole indice text "text text text".
Comment 4 QA Administrators 2024-11-11 03:13:13 UTC Comment hidden (obsolete)
Comment 5 Noel Grandin 2024-11-14 09:09:31 UTC
This does not seem to be an issue with current master; not sure when it would have been fixed.