Bug 129163 - Memory leak in createEnumeration
Summary: Memory leak in createEnumeration
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.0 all versions
Hardware: All All
Importance: high normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.5.0 target:7.4.2
Keywords: bibisected, bisected, regression
Depends on:
Blocks: Memory
 
Reported: 2019-12-03 16:51 UTC by Jan Rheinländer
Modified: 2022-09-20 14:27 UTC

See Also:
Crash report or crash signature:


Attachments
test createnumeration (21.07 KB, application/vnd.oasis.opendocument.text)
2019-12-03 18:26 UTC, Oliver Brinzing
Description Jan Rheinländer 2019-12-03 16:51:31 UTC
Description:
There seems to be a memory leak in XEnumerationAccess::createEnumeration(). When I run the following BASIC code on a Writer document, I can watch the memory usage go up:

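Sub Main
    ' Repeat many times so the growth in memory usage becomes visible
    For i = 1 To 100000
        text = ThisComponent.getText()

        ' Enumerate all paragraphs of the document
        paragraphs = text.createEnumeration()

        While paragraphs.hasMoreElements()
            paragraph = paragraphs.nextElement()

            ' Create an enumeration for each paragraph and discard it again
            parenum = paragraph.createEnumeration()
        Wend
    Next
End Sub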

I discovered the problem first in a C++ Extension where it seems to be a
lot worse (a few hundred iterations are enough). But it also seems to
depend on how many paragraphs the document has. A large document will
fill memory faster than a small document.

Steps to Reproduce:
Run the BASIC macro shown in the description above on a Writer document and watch the memory usage go up.

Actual Results:
The final result is a crash because of unavailable memory.

Expected Results:
Memory usage should not increase continuously.


Reproducible: Always


User Profile Reset: Yes



Additional Info:

Version: 6.3.3.2
Build-ID: 1:6.3.3-0ubuntu0.18.04.1~lo1
CPU-Threads: 4; OS: Linux 4.15; UI-Render: Standard; VCL: kde5;

and also:

Version: 6.3.2.2 (x64)
Build-ID: 98b30e735bda24bc04ab42594c85f7fd8be07b9c
CPU-Threads: 2; OS: Windows 6.3; UI-Render: Standard; VCL: win;
Comment 1 Oliver Brinzing 2019-12-03 18:25:41 UTC
reproducible with:

Version: 5.4.7.2 (x64)
Build-ID: c838ef25c16710f8838b1faec480ebba495259d0
CPU-Threads: 4; OS: Windows 6.19; UI-Render: Standard;
Locale: de-DE (de_DE); Calc: single

Version: 6.3.4.1 (x64)
Build-ID: a21169d87339dfa44546f33d6d159e89881e9d92
CPU-Threads: 4; OS: Windows 10.0; UI-Render: Standard; VCL: win;
Locale: de-DE (de_DE); UI-Language: de-DE
Calc: 

Version: 6.5.0.0.alpha0+ (x64)
Build ID: 00262b08984fb2fb91b760d588851bd47ae4d3ac
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: default; VCL: win; 
Locale: de-DE (de_DE); UI-Language: en-US
Calc: threaded

but *not* reproducible with:

Version: 4.4.7.2
Build-ID: f3153a8b245191196a4b6b9abd1d0da16eead600
Locale: de_DE
Comment 2 Oliver Brinzing 2019-12-03 18:26:05 UTC
Created attachment 156277
test createnumeration
Comment 3 Oliver Brinzing 2019-12-03 18:49:33 UTC
seems to have started with:

https://gerrit.libreoffice.org/plugins/gitiles/core/+/96898cd49830333d752b9aa56fe91a8e21c9dca8

commit 96898cd49830333d752b9aa56fe91a8e21c9dca8
author Bjoern Michaelsen <bjoern.michaelsen@canonical.com> Thu May 21 15:31:32 2015 +0200
committer Bjoern Michaelsen <bjoern.michaelsen@canonical.com> Tue May 26 00:51:10 2015 +0200
tree 351f559f6a1cf1fe251aa009418b7c556db231f6
parent 99614f6a9a738989cca82c8bbd4532fc2d35c1cc

new unocrsrs for SwXTextPortionEnumeration

Change-Id: I5c509d3e65a92824090930d10849b9b1b430971f
sw/source/core/inc/unoport.hxx
sw/source/core/unocore/unoportenum.cxx
2 files changed

/cygdrive/d/sources/bibisect/bibisect-win32-5.0
$ git bisect good
80dca436d1a36ab1eb3f33469fd7ae60576ff367 is the first bad commit
commit 80dca436d1a36ab1eb3f33469fd7ae60576ff367
Author: Norbert Thiebaud <nthiebaud@gmail.com>
Date:   Sat Jun 6 02:22:56 2015 -0500

    source 96898cd49830333d752b9aa56fe91a8e21c9dca8

    source 96898cd49830333d752b9aa56fe91a8e21c9dca8

:040000 040000 d5d87b24aa577c354878db8bf86ce4b375972803 f9026cc34a0d4a46c67a2e29013b279f4c98c43a M      instdir

/cygdrive/d/sources/bibisect/bibisect-win32-5.0
$ git bisect log
# bad: [b7988d11e5d3751a4b366b2bfc9048f7a30e8526] source 87ac0b1e75a880a68ecb748bd4b34ae5a3d2ae98
# good: [f449493ae11ac76cc7396bddeaa624a60c565936] source 57d6b92b69a31260dea0d84fcd1fc5866ada7adb
git bisect start 'master' 'oldest'
# good: [66e2ae767eb4bb83444e3d03bcb90adcbe6d4991] source 5a308b1239a09417507b0d05090ff2d3418d5133
git bisect good 66e2ae767eb4bb83444e3d03bcb90adcbe6d4991
# good: [c51237da468f7026112580cfb26a732ce39f523d] source 103bf75921e069d1c078c0ef30b94b8f91920877
git bisect good c51237da468f7026112580cfb26a732ce39f523d
# good: [506aebdebff0cb9a6b9a21b4cc1420ac30da809c] source 741d9990bf9d9dfcba1166a12ffb1d846c912181
git bisect good 506aebdebff0cb9a6b9a21b4cc1420ac30da809c
# bad: [7dc37603af81ce291598745d95748b9b95154852] source c642425fd372ef219a683b5198600746fb7f0c3c
git bisect bad 7dc37603af81ce291598745d95748b9b95154852
# good: [0ea5bbc13f5d1ddf3a6f8e6b17e8bd5d5b67cba8] source 1d2d037b4defa775b164880b56732af2a837f254
git bisect good 0ea5bbc13f5d1ddf3a6f8e6b17e8bd5d5b67cba8
# good: [347ba99c1ec42324c50b77c1d0d4f501bdf118ea] source 8a9758ed05cb5597df9ad56fefe146f1feff41fa
git bisect good 347ba99c1ec42324c50b77c1d0d4f501bdf118ea
# good: [4ff1bdaf1755913bfb992874449d3dcaed35a821] source 4de86ac0c62b446426136b620cfd65d088c51cd8
git bisect good 4ff1bdaf1755913bfb992874449d3dcaed35a821
# good: [f319760e22697f20c7e5e19eb4050b30341d81bd] source 6a79fe2b0bc0101b1d279b22f3cab7f12538c109
git bisect good f319760e22697f20c7e5e19eb4050b30341d81bd
# bad: [bb8dc2e6f3b1dd1b5029ad879516e8371fd653ba] source 6ab4c4f9c7b12c6058b08e44d35eb8b386348c55
git bisect bad bb8dc2e6f3b1dd1b5029ad879516e8371fd653ba
# good: [2970f29b0ee7e9988ffbb183f2e6bb156e374409] source 4020f9bbd92becd3662cdc3b24ad70b370307e5e
git bisect good 2970f29b0ee7e9988ffbb183f2e6bb156e374409
# bad: [80dca436d1a36ab1eb3f33469fd7ae60576ff367] source 96898cd49830333d752b9aa56fe91a8e21c9dca8
git bisect bad 80dca436d1a36ab1eb3f33469fd7ae60576ff367
# good: [fcdc84d4a64c70527b739361d472ff01af9743e4] source c844a15c7d39ee1c60d2fbf969d502f94a0cdfff
git bisect good fcdc84d4a64c70527b739361d472ff01af9743e4
# good: [5d5ba7d3d868006e2a9c183d742f033858549e83] source 99614f6a9a738989cca82c8bbd4532fc2d35c1cc
git bisect good 5d5ba7d3d868006e2a9c183d742f033858549e83
# first bad commit: [80dca436d1a36ab1eb3f33469fd7ae60576ff367] source 96898cd49830333d752b9aa56fe91a8e21c9dca8
Comment 4 Justin L 2021-03-23 15:27:30 UTC
This seems like a fairly serious and well-written report. Too bad no one has acted on it.

The leak was slow for me, but still seemed to repro for 7.2+
Comment 5 Michael Warner 2021-04-03 14:36:41 UTC
So far in the course of investigating this bug, I have discovered that every time paragraph.createEnumeration() is called, it creates a UNO cursor, a shared_ptr to which is used and then properly destroyed when parenum goes out of scope. However, when the cursor is created, a weak_ptr to it is stored in the SwDoc mvUnoCursorTable. This table is not cleaned up at any time during the Basic script execution, nor when it completes, so it just keeps growing. Even if I lower the number of iterations from 100000 to merely 1000, the example document still results in hundreds of thousands of entries being added to the mvUnoCursorTable vector. One solution to this problem is to clean up the table more often. Cleaning the table takes time, so how often to do it is a tradeoff between speed and memory usage.
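
To illustrate the mechanism described above, here is a minimal standalone C++ sketch. It is not LibreOffice code: Cursor, CursorTable, collectGarbage and simulate are hypothetical names made up for this example, and the real SwDoc table and its cleanup may well look different. It only shows that a vector of weak_ptr entries keeps growing when the cursors themselves are destroyed, and that pruning expired entries keeps it bounded.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a UNO cursor; not the real LibreOffice class.
struct Cursor {};

// Hypothetical stand-in for a document-level table of weak cursor references.
class CursorTable {
public:
    // Create a cursor and remember a weak reference to it.
    std::shared_ptr<Cursor> create() {
        auto cursor = std::make_shared<Cursor>();
        entries_.push_back(cursor);
        return cursor;
    }

    // Remove every entry whose cursor has already been destroyed.
    void collectGarbage() {
        entries_.erase(
            std::remove_if(entries_.begin(), entries_.end(),
                           [](const std::weak_ptr<Cursor>& w) { return w.expired(); }),
            entries_.end());
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::vector<std::weak_ptr<Cursor>> entries_;
};

// Simulate short-lived enumerations and return the table size afterwards.
std::size_t simulate(std::size_t iterations, bool collectEachTime) {
    CursorTable table;
    for (std::size_t i = 0; i < iterations; ++i) {
        {
            auto cursor = table.create();
        } // the cursor is destroyed here, but its table entry remains
        if (collectEachTime) {
            table.collectGarbage();
        }
    }
    return table.size();
}

// Small self-check; call it from main() to try the sketch.
void demo() {
    assert(simulate(1000, false) == 1000); // expired entries pile up
    assert(simulate(1000, true) == 0);     // pruned entries do not accumulate
}
```

Collecting after every single cursor is the most expensive choice in terms of time. Pruning only every N creations, or when an enumeration is destroyed, would be other points on the same speed versus memory tradeoff.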

Growth of the mvUnoCursorTable alone isn't enough to explain the amount and rate at which memory usage increases while executing the script. So, there is something else going on. I'll keep looking for that.

One other thing I have found is that the mvUnoCursorTable is being directly accessed in multiple places without being protected by a mutex, so it may be subject to a race condition. In all but one case, it's easy enough to put a mutex around it. I'm not yet sure what to do about that particular case, maybe file a separate bug for it.
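
As a rough sketch of what "putting a mutex around it" could look like, here is a standalone C++ example. Again, this is not LibreOffice code: UnoCursorStub, GuardedCursorTable and liveCount are hypothetical names, and which lock the real code should use is an open question. The only point is that every access to the vector goes through the same lock.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a UNO cursor; not the real LibreOffice class.
struct UnoCursorStub {};

// Hypothetical table whose vector is only touched while holding the mutex.
class GuardedCursorTable {
public:
    // Create a cursor and record a weak reference under the lock.
    std::shared_ptr<UnoCursorStub> create() {
        auto cursor = std::make_shared<UnoCursorStub>();
        std::lock_guard<std::mutex> lock(mutex_);
        entries_.push_back(cursor);
        return cursor;
    }

    // Count entries whose cursor is still alive, again under the lock.
    std::size_t liveCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t alive = 0;
        for (const auto& entry : entries_) {
            if (!entry.expired()) {
                ++alive;
            }
        }
        return alive;
    }

    // Total number of entries, including expired ones.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return entries_.size();
    }

private:
    mutable std::mutex mutex_; // mutable so const readers can lock it
    std::vector<std::weak_ptr<UnoCursorStub>> entries_;
};

// Small self-check; call it from main() to try the sketch.
void guardedDemo() {
    GuardedCursorTable table;
    auto kept = table.create();
    {
        auto temporary = table.create();
    } // temporary is destroyed, its entry expires
    assert(table.size() == 2);
    assert(table.liveCount() == 1);
}
```

Wrapping the vector in a class like this makes it harder to reach the table without taking the lock, which is the kind of direct access described above.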
Comment 6 Björn Michaelsen 2022-09-11 21:38:30 UTC
possible fix: https://gerrit.libreoffice.org/c/core/+/139778
Comment 7 Commit Notification 2022-09-13 12:11:54 UTC
Bjoern Michaelsen committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/769cefd9c49652f28ba58cd371bc60b9e1bd5bd0

tdf#129163: GC cursor table at the end of the life of an portion enumeration

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 8 Commit Notification 2022-09-13 18:55:24 UTC
Bjoern Michaelsen committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/51c558930a261a5bd63569965fe360f316b9f3f4

tdf#129163: GC cursor table at the end of the life of an portion enumeration

It will be available in 7.4.2.

Comment 9 Justin L 2022-09-20 14:27:35 UTC
Seems fixed to me. No steady RAM usage climb anymore. Thanks Bjoern.