Bug 131651 - FILEOPEN DOCX: LO steadily increases memory usage until system freezes
Status: RESOLVED DUPLICATE of bug 76260
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 5.4 all versions
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, haveBacktrace, regression
Depends on:
Blocks: DOCX-Opening Memory
Reported: 2020-03-28 19:47 UTC by Bjorn Wastvedt
Modified: 2021-02-15 19:23 UTC (History)
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
first document (44.90 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-04-04 12:57 UTC, Bjorn Wastvedt
second document (40.13 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-04-04 12:58 UTC, Bjorn Wastvedt
Perf flamegraph (503.75 KB, image/svg+xml)
2020-05-04 15:47 UTC, Buovjaga

Description Bjorn Wastvedt 2020-03-28 19:47:27 UTC
Description:
My work requires me to open many Writer files simultaneously: say 5-10 documents of about 100 KB each (between 20 and 200 pages each). In addition to these, I have two Calc files at about 500 KB apiece. So, they're not huge documents, though I suppose they're pretty complicated (cross-linked spreadsheets and multi-language word processing documents with lots of formatting).

LO (newly reinstalled 6.4.1.2 on Debian 10/buster) has no problem opening all of these. I organize them in several workspaces, alongside a bunch of pdfs and some gedit documents and finally get to work. Ideally, I don't close these windows or shut down the system, partly because it takes so long to get everything organized for optimal productivity.

But after a day or so of saving, editing, etc., LO uses all of my memory and Debian is no longer responsive. If I'm lucky, I can switch to a virtual terminal and kill the process from there. Even then I risk losing unsaved work.

I've spent some time looking at previous, similar bug reports with no luck. How can I get LibreOffice to continue functioning without using all 8GB of available memory?


Steps to Reproduce:
1. Open lots of LO files. (Could it be a problem with one of the specific files I am using?)
2. Without closing, continue normal word processing editing for several hours.
3. Watch steadily more memory being consumed.

Actual Results:
I'm including the output of ps -eo pid,cmd,%mem,%cpu --sort=-%mem | head -4 run every fifteen minutes over a period of several hours. You can see that while I'm using some other applications (notably Firefox), the memory usage for those applications rises and falls, while the memory usage for LO steadily increases. If I were to wait a couple more hours, the system would become unresponsive.


11:08:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 14.3  4.0
17501 /usr/lib/firefox-esr/firefo  6.0  1.0
17409 /usr/lib/firefox-esr/firefo  5.5 10.5

11:23:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 17.0  5.1
17501 /usr/lib/firefox-esr/firefo  6.1  0.9
17409 /usr/lib/firefox-esr/firefo  5.2  9.0

11:38:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 19.8  6.3
17409 /usr/lib/firefox-esr/firefo  5.7  8.6
 1273 cinnamon --replace           3.8  0.9

11:53:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 22.4  7.4
17409 /usr/lib/firefox-esr/firefo  7.3  9.7
17501 /usr/lib/firefox-esr/firefo  4.0  1.0

12:08:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 25.2  8.5
17409 /usr/lib/firefox-esr/firefo  6.9 10.0
17501 /usr/lib/firefox-esr/firefo  4.8  1.0

12:23:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 28.3  9.5
17409 /usr/lib/firefox-esr/firefo  7.5  9.0
 1273 cinnamon --replace           3.8  0.9

12:38:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 31.5 10.6
17409 /usr/lib/firefox-esr/firefo  9.0  8.2
 1273 cinnamon --replace           3.8  0.9

12:53:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 34.5 11.6
17409 /usr/lib/firefox-esr/firefo  7.8  8.1
29241 /usr/lib/firefox-esr/firefo  4.7 13.0

13:08:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 37.0 12.6
17409 /usr/lib/firefox-esr/firefo  7.1  8.9
17501 /usr/lib/firefox-esr/firefo  4.9  0.9

13:23:09
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 36.6 13.5
25059 /usr/lib/firefox-esr/firefo 10.6  3.7
17409 /usr/lib/firefox-esr/firefo  7.5 10.3

13:38:10
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 40.1 14.4
17409 /usr/lib/firefox-esr/firefo  6.4  9.9
17501 /usr/lib/firefox-esr/firefo  5.1  1.0

13:53:10
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 43.0 15.3
17409 /usr/lib/firefox-esr/firefo  6.4  9.4
17501 /usr/lib/firefox-esr/firefo  5.1  0.9

14:08:10
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 45.7 16.2
17409 /usr/lib/firefox-esr/firefo  6.3  9.3
 1273 cinnamon --replace           3.5  1.0

14:23:10
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 48.7 17.0
17409 /usr/lib/firefox-esr/firefo  6.4  9.0
17501 /usr/lib/firefox-esr/firefo  4.4  0.9

14:38:10
  PID CMD                         %MEM %CPU
10506 /usr/lib/libreoffice/progra 51.4 17.9
17409 /usr/lib/firefox-esr/firefo  7.5  9.1
 4431 /usr/lib/firefox-esr/firefo  5.0  7.3
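
The growth rate in the samples above can be estimated with a short calculation. This is only a rough sketch: the %MEM values are copied from the listing for PID 10506, the total RAM figure is taken from the "Memory: 7.68 GiB" line of the inxi output in this report, and the function name is made up for illustration. It uses only the first and last sample, so it ignores the small dip at 13:23.

```python
from datetime import datetime

# (time, %MEM) for PID 10506, copied from the ps samples in this report.
SAMPLES = [
    ("11:08:09", 14.3), ("11:23:09", 17.0), ("11:38:09", 19.8),
    ("11:53:09", 22.4), ("12:08:09", 25.2), ("12:23:09", 28.3),
    ("12:38:09", 31.5), ("12:53:09", 34.5), ("13:08:09", 37.0),
    ("13:23:09", 36.6), ("13:38:10", 40.1), ("13:53:10", 43.0),
    ("14:08:10", 45.7), ("14:23:10", 48.7), ("14:38:10", 51.4),
]

# Total memory as reported by inxi ("Memory: 7.68 GiB").
TOTAL_RAM_GIB = 7.68


def growth_per_15_min(samples):
    """Average %MEM increase per 15 minutes between first and last sample."""
    fmt = "%H:%M:%S"
    start = datetime.strptime(samples[0][0], fmt)
    end = datetime.strptime(samples[-1][0], fmt)
    minutes = (end - start).total_seconds() / 60
    return (samples[-1][1] - samples[0][1]) / minutes * 15


rate = growth_per_15_min(SAMPLES)
mib = rate / 100 * TOTAL_RAM_GIB * 1024
print(f"{rate:.2f} percentage points per 15 min (about {mib:.0f} MiB)")
```

On these numbers the increase comes to roughly 2.65 percentage points every fifteen minutes, which on this machine is on the order of 200 MiB per sample interval.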

Expected Results:
Well, I'd expect LO to just continue running without using an increasing amount of memory.


Reproducible: Always


User Profile Reset: No



Additional Info:
OpenGL is not enabled. I'll try resetting my user profile; didn't know that was a potential fix.

In one thread, someone recommended sharing inxi -Fz:

System:    Host: bjorn Kernel: 4.19.0-8-amd64 x86_64 bits: 64 Desktop: Cinnamon 3.8.8 Distro: Debian GNU/Linux 10 (buster)
Machine:   Type: Laptop System: LENOVO product: 4291ZL2 v: ThinkPad X220 serial: <filter>
           Mobo: LENOVO model: 4291ZL2 serial: <filter> UEFI [Legacy]: LENOVO v: 8DET61WW (1.31 ) date: 04/25/2012
Battery:   ID-1: BAT0 charge: 33.5 Wh condition: 33.5/62.2 Wh (54%)
CPU:       Topology: Dual Core model: Intel Core i5-2520M bits: 64 type: MT MCP L2 cache: 3072 KiB
           Speed: 1684 MHz min/max: 800/3200 MHz Core speeds (MHz): 1: 2995 2: 3095 3: 2990 4: 2990
Graphics:  Device-1: Intel 2nd Generation Core Processor Family Integrated Graphics driver: i915 v: kernel
           Display: x11 server: X.Org 1.20.4 driver: modesetting unloaded: fbdev,vesa resolution: 1366x768~60Hz
           OpenGL: renderer: Mesa DRI Intel Sandybridge Mobile v: 3.3 Mesa 18.3.6
Audio:     Device-1: Intel 6 Series/C200 Series Family High Definition Audio driver: snd_hda_intel
           Sound Server: ALSA v: k4.19.0-8-amd64
Network:   Device-1: Intel 82579LM Gigabit Network driver: e1000e
           IF: enp0s25 state: down mac: <filter>
           Device-2: Intel Centrino Wireless-N 1000 [Condor Peak] driver: iwlwifi
           IF: wlp3s0 state: up mac: <filter>
Drives:    Local Storage: total: 689.33 GiB used: 85.64 GiB (12.4%)
           ID-1: /dev/sda vendor: Seagate model: ST500LM021-1KJ152 size: 465.76 GiB
           ID-2: /dev/sdb model: DOGFISH SSD 240GB size: 223.57 GiB
Partition: ID-1: / size: 27.37 GiB used: 10.71 GiB (39.1%) fs: ext4 dev: /dev/sdb1
           ID-2: /home size: 183.80 GiB used: 74.05 GiB (40.3%) fs: ext4 dev: /dev/sdb6
           ID-3: swap-1 size: 7.88 GiB used: 893.6 MiB (11.1%) fs: swap dev: /dev/sdb5
Sensors:   System Temperatures: cpu: 82.0 C mobo: N/A
           Fan Speeds (RPM): cpu: 3785
Info:      Processes: 247 Uptime: 4d 2h 38m Memory: 7.68 GiB used: 6.95 GiB (90.5%) Shell: bash inxi: 3.0.32
Comment 1 Bjorn Wastvedt 2020-03-28 21:03:12 UTC
Update: appears to be happening in safe mode too (i.e., is not a problem with the user profile). After closing everything and restarting LO in safe mode, I opened everything back up and continued checking:

15:23:10
  PID CMD                         %MEM %CPU
17409 /usr/lib/firefox-esr/firefo  5.2  9.0
17501 /usr/lib/firefox-esr/firefo  3.8  0.9
 7842 /usr/lib/firefox-esr/firefo  3.6  2.6

15:38:10
  PID CMD                         %MEM %CPU
 9696 /usr/lib/libreoffice/progra  7.9 57.3
17409 /usr/lib/firefox-esr/firefo  6.9  9.3
 7842 /usr/lib/firefox-esr/firefo  4.6  2.0

15:53:10
  PID CMD                         %MEM %CPU
 9696 /usr/lib/libreoffice/progra 10.4 77.5
17409 /usr/lib/firefox-esr/firefo  8.5  9.7
17501 /usr/lib/firefox-esr/firefo  5.2  1.0

15:59:29
  PID CMD                         %MEM %CPU
 9696 /usr/lib/libreoffice/progra 11.5 80.8
17409 /usr/lib/firefox-esr/firefo  6.6  9.9
17501 /usr/lib/firefox-esr/firefo  6.6  1.0
Comment 2 Julien Nabet 2020-03-29 10:21:59 UTC Comment hidden (obsolete)
Comment 3 Bjorn Wastvedt 2020-03-30 00:59:38 UTC
(In reply to Julien Nabet from comment #2)
> Debian testing proposes 6.4.2
> 
> Do you use MS and ODF formats or only ODF formats?
> 
> About all your files, are there some which are for reading only?
> If yes, perhaps you may convert them into pdfs so they may consume less
> memory.
> Of course, check that PDF conversion is ok.

Only ODF formats (ODS, ODT).

Hmm, well I already converted some to pdfs. Evince takes so little memory, that would definitely help. But to do that with a bunch of the files, I'd need a better PDF editor, so I could highlight text, make comments, etc. The need for such functionality is why I've kept them in libreoffice.

But really, this increasing memory usage isn't normal, right!?
Comment 4 Julien Nabet 2020-03-30 11:16:55 UTC Comment hidden (obsolete)
Comment 5 Bjorn Wastvedt 2020-03-30 19:02:28 UTC
(In reply to Julien Nabet from comment #4)
> To investigate, we'd need an example of case which increases memory so we'd
> have a minimal step by step process to reproduce this.
> Indeed, as it is, the bugtracker won't help much since it just indicates
> there are memory leaks in LO, nothing else. Devs already know it but need
> more information to tackle each leak.

Hi, Julien, and thanks for the reply. I'm making some progress on this:

I monitored the memory usage while closing a couple of the files at a time. I *think* that I've found the file that must be the problem. It'd be best to be able to attach it here, but I can't, since it's a text I'm using by permission from someone who will end up publishing it later.

But, here are some specifics about the suspect file:

**Written in .doc format, saved by me into libreoffice and annotated after that. (Maybe I should have said this in response to your first comment?)

**Almost entirely polytonic Greek, with some English and other languages mixed in.

**32 pages, 90000 characters, 500+ footnotes

I searched for problems involving special characters and memory usage in libreoffice and found this: https://ask.libreoffice.org/en/question/157664/slow-typing-and-high-cpu-usage-after-a-certain-point/, which suggests that using shift-enter for a newline can cause an issue. Well, it looks like in my document everything is paragraphed correctly; at least, when I turn on view->formatting marks, I can see paragraph symbols after each paragraph.

Let me know if you need the file itself to comment further. I could probably obfuscate the contents somehow.
Comment 6 QA Administrators 2020-03-31 03:30:50 UTC Comment hidden (obsolete)
Comment 7 Timur 2020-03-31 07:14:29 UTC
Content can be obfuscated by replacing each char with X.
But we can only analyze a source .ODT when saving it to .DOC or .DOCX causes the issue.
If the first save was to .DOC, that is no help.
If an MS .DOC was saved to .ODT in LO, I guess that qualifies as an acceptable report.
Footnotes may be suspicious, as in bug 76260.
 
Note: if responding to the whole previous post, no need to quote.
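
For illustration, the "replace each char with X" idea looks like this on plain text. This is a minimal sketch with a hypothetical helper name; it only handles a string, not a .DOC/.DOCX/.ODT file, so applying it to a real document would still need to be done in the word processor or with a document library (not shown here). Whitespace is kept so that line and paragraph breaks survive.

```python
def obfuscate(text):
    """Replace every non-whitespace character with 'X'.

    Whitespace (spaces, tabs, newlines) is preserved so the line and
    paragraph structure of the text stays the same.
    """
    return "".join(ch if ch.isspace() else "X" for ch in text)


# Works the same for Greek text (escaped here to avoid encoding issues).
print(obfuscate("Some \u03bb\u03cc\u03b3\u03bf\u03c2 here.\nNext line"))
```

Replacing punctuation as well keeps the rule simple; the character count per line is unchanged, which matters if the problem depends on text length or layout.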
Comment 8 Bjorn Wastvedt 2020-03-31 19:31:53 UTC
(In reply to Timur from comment #7)

Hi,

Indeed: an MS .DOC was saved to .ODT in LO.

Today, I found a workaround, by selecting all and copying into a new file. I think this avoids the problem. Should I close the bug, or should we pursue it (for the good of LO)?
Comment 9 QA Administrators 2020-04-01 03:36:06 UTC Comment hidden (obsolete)
Comment 10 Timur 2020-04-01 06:41:29 UTC
The only way to make this bug useful is to have the source MS DOC (with content obfuscated in MSO) and the steps to get the ODT (like: copy all and paste).
If you can prepare it and attach it, please set the status to Unconfirmed.
Otherwise, I will close it.
Comment 11 Bjorn Wastvedt 2020-04-04 12:57:32 UTC
Created attachment 159319
first document
Comment 12 Bjorn Wastvedt 2020-04-04 12:58:45 UTC
Created attachment 159320
second document
Comment 13 Bjorn Wastvedt 2020-04-04 13:01:16 UTC
(In reply to Timur from comment #10)

OK, I think I've set it up to be reproducible:

1. open two attached documents in libreoffice (both .docx)
2. save as .odt
3. wait and watch memory being taken up (creeps up steadily on my x220 with 8GB ram)

To fix the problem, I just copied all of the text from each file into a new .odt (instead of saving it as an .odt straight from .docx format). So for me, the problem is solved. But it might be helpful for other users who are starting with MSO and moving to ODT to know how to solve this problem.

Let me know if you need other information, and thanks.
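
For step 3, one way to "watch memory" without running ps by hand is to sample the resident set size of the LibreOffice process. The following is a sketch under assumptions: it is Linux-only (it reads /proc/<pid>/status), the PID has to be looked up separately (for example from the ps output), and the function names and default intervals are made up for illustration.

```python
import re
import time


def rss_kib(status_text):
    """Extract the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def watch(pid, interval_s=60.0, samples=10):
    """Print and return `samples` RSS readings taken `interval_s` seconds apart."""
    readings = []
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as fh:
            readings.append(rss_kib(fh.read()))
        print(time.strftime("%H:%M:%S"), readings[-1], "KiB")
        time.sleep(interval_s)
    return readings


# Example (PID is a placeholder): watch(10506, interval_s=900, samples=8)
```

A steadily rising series of readings while the documents sit idle would match the behaviour described in this report.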
Comment 14 Telesto 2020-04-04 15:45:53 UTC
(In reply to Bjorn Wastvedt from comment #12)

I can reproduce the issue with the second file. Steps to reproduce:
1. Open attachment 159320 (and let it run: CPU at 25%, memory usage increasing)


Version: 7.0.0.0.alpha0+ (x64)
Build ID: 4501a0ba623ad61c5a4e0b807da2e96f0e4ce82c
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: default; VCL: win; 
Locale: nl-NL (nl_NL); UI-Language: en-US
Calc: CL
Comment 15 Telesto 2020-04-04 15:50:54 UTC
Repro with
Version: 5.4.0.2
Build ID: 2b906d450a44f2bbe506dcd22c51b3fa11dc65fd
CPU threads: 4; OS: Windows 6.2; UI render: default; 
Locale: nl-NL (nl_NL); Calc: CL

No repro with
Version: 4.4.7.2
Build ID: f3153a8b245191196a4b6b9abd1d0da16eead600
Locale: nl_NL
Comment 16 Telesto 2020-04-04 15:59:21 UTC
File opening is also too slow, but probably already tracked in bug 76260
Comment 17 Buovjaga 2020-05-03 19:16:22 UTC
Bibisected with linux 6.5 repo to https://git.libreoffice.org/core/+/2e0a32b51681fb356699b4a722f461f55a46b890%5E!/

weld FontNameBox

The used memory increases rapidly upon opening.

On Windows, the memory increase is not as rapid and, what is stranger, it is also observed with the oldest commit of the 6.5 bibisect repo :(

On Linux I bibisected the document mentioned in this dev list post to the same commit: https://lists.freedesktop.org/archives/libreoffice/2020-May/085014.html

I am unable to repro the problem in any older version or bibisect repo I tried. I only see it in the 6.5/7.0 line.

As the result is a bit strange, I do not want to add Caolán to Cc.

Same result is observed in safe mode.

Arch Linux 64-bit
Version: 7.0.0.0.alpha0+
Build ID: 454a3c945fdc02d706b0a5ad49ca13e0443fa8e5
CPU threads: 8; OS: Linux 5.6; UI render: default; VCL: kf5; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 3 May 2020
Comment 18 Buovjaga 2020-05-04 11:25:22 UTC
Ok, I bibisected with win 6.4 repo and this time the commit seems more reasonable: https://git.libreoffice.org/core/+/a991ad93dcd6807d0eacd11a50c2ae43a2cfb882%5E!/

tdf#121441 improve DOCX footnote import

Adding Cc: to Jan-Marek Glogowski

To repro, simply open attachment 159320 and monitor the memory use. No need to save it or anything.
Comment 19 Jan-Marek Glogowski 2020-05-04 15:16:30 UTC
Just gave it a quick test, with my master, bibisect-6-4 and bibisect-6-5 master checkouts and I can't reproduce. I used a fresh profile and both documents using kf5.

so -env:UserInstallation=file:///tmp/mem /tmp/doc*.docx

LO starts with 100% CPU but completely settles after a few seconds with constant memory and no CPU load. I'm also on Debian Buster, FWIW.

Best bet would be to get some callgrind / perf log. See performance debugging in https://wiki.documentfoundation.org/Development/How_to_debug

While callgrind is much slower, its logs are also much smaller. So if you can't or don't want to debug this yourself, you probably want to run callgrind - instead of perf - for a while (on that HW probably 30m+).

My first guess was "just" another Writer idle layout problem, but that should normally utilize a full core, which doesn't seem to happen for you, and the memory gets leaked very slowly (3% in 15m of 7.5GB), so I'm quite probably wrong. If there was a parsing problem with the new footnote code, it would probably exhaust the memory really fast.

Still in the end this would mean blind fixing for me without a reproducer, so I doubt I can help.
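
The fresh-profile invocation from this comment can be scripted so that every test run starts from a clean user profile. This is a hedged sketch: the `-env:UserInstallation=` argument is the one shown above, but the helper name is hypothetical and the binary name `soffice` is an assumption (the comment uses `so`, presumably a local alias).

```python
from pathlib import Path


def fresh_profile_command(profile_dir, documents, binary="soffice"):
    """Build an argument list that starts LibreOffice with a separate profile.

    `binary` is assumed to be on PATH; adjust it for a local build or alias.
    """
    profile_uri = Path(profile_dir).absolute().as_uri()
    return [binary, f"-env:UserInstallation={profile_uri}", *map(str, documents)]


cmd = fresh_profile_command("/tmp/mem", ["/tmp/doc1.docx", "/tmp/doc2.docx"])
print(" ".join(cmd))
# To actually launch it: import subprocess; subprocess.Popen(cmd)
```

Using a throwaway directory such as /tmp/mem keeps the regular profile out of the picture, which is the same thing safe mode was used to rule out earlier in this report.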
Comment 20 Buovjaga 2020-05-04 15:47:57 UTC
Created attachment 160346
Perf flamegraph

I let the memory use increase to 1GB

Arch Linux 64-bit
Version: 7.0.0.0.alpha0+
Build ID: 454a3c945fdc02d706b0a5ad49ca13e0443fa8e5
CPU threads: 8; OS: Linux 5.6; UI render: default; VCL: kf5; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 3 May 2020
Comment 21 Telesto 2020-05-08 18:45:03 UTC
(In reply to Buovjaga from comment #17)

Found another bug, bug 132536 (same commit). It surely leaks, but it can't be the root cause: following my observations in comment 15, the regression must lie between 4.4.7.2 and 5.4.
Comment 22 Buovjaga 2020-05-10 12:05:11 UTC
Oh, actually, the memory leak is not infinite. Something like 1.6 GB is taken at the end. It seems this affects nearly all documents.
Comment 23 Gabor Kelemen (allotropia) 2021-02-15 19:23:02 UTC
These attachments seem to open very quickly now since bug #76260 was fixed.

Checked with:

Version: 7.2.0.0.alpha0+ / LibreOffice Community
Build ID: cbcec4425e04e3614a2025b49fdc221216ac51d3
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: hu-HU (hu_HU.UTF-8); UI: en-US
Calc: threaded

*** This bug has been marked as a duplicate of bug 76260 ***