Bug 155232 - LibreOffice and LanguageTool extension: LibreOffice doesn't free RAM for special interface XFlatParagraph
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 7.5.3.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Mike Kaganski
URL:
Whiteboard: target:7.6.0 target:7.5.4
Keywords:
Depends on:
Blocks:
 
Reported: 2023-05-10 12:13 UTC by Marco A.G.Pinto
Modified: 2023-06-21 05:28 UTC (History)

See Also:
Crash report or crash signature:


Attachments
StarterProject.oxt (23.67 KB, application/vnd.openofficeorg.extension)
2023-05-11 16:13 UTC, Mike Kaganski
Source code of the extension from comment 1 (14.38 KB, application/x-zip-compressed)
2023-05-12 09:00 UTC, Mike Kaganski
A reproducer for ever-growing memory consumption (146.54 KB, application/vnd.oasis.opendocument.text)
2023-05-12 09:06 UTC, Mike Kaganski

Description Marco A.G.Pinto 2023-05-10 12:13:01 UTC
Hello!

Here is the comment of the developer of the LanguageTool extension for LibreOffice:
https://github.com/languagetool-org/languagetool/issues/8012#issuecomment-1540640790

Can some sort of fix be implemented?

Thanks!
Comment 1 Mike Kaganski 2023-05-11 16:13:13 UTC
Created attachment 187207 [details]
StarterProject.oxt

Here is a comment by a LanguageTool developer:

===

I built a dummy proofreader as an LO extension. It does no real proofreading and no marking. It calls "getNextPara" to get an initial paragraph, then "getParaBefore" and "getParaAfter" from XFlatParagraphIterator to get all remaining XFlatParagraphs, and stores them in an ArrayList. The whole procedure runs in a loop: once the whole document is stored as XFlatParagraphs, the list is emptied, and the XFlatParagraphs are fetched and stored again. The loop runs 10000 times per paragraph.
You should disable or remove all grammar checkers from your LO installation and install the OXT after that. After restarting LO, load a document containing some hundred paragraphs.
In my tests the Java heap space doesn't exceed 800 MB, while the memory used by LO grows steadily.
Here is the file
Comment 2 Mike Kaganski 2023-05-12 09:00:08 UTC
Created attachment 187220 [details]
Source code of the extension from comment 1
Comment 3 Mike Kaganski 2023-05-12 09:06:18 UTC
Created attachment 187221 [details]
A reproducer for ever-growing memory consumption

Repro with the attached document.

It has several hundred lorem ipsum paragraphs, and a macro to produce the problem:

===

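' Collect every flat paragraph of the document once, walking backward and
' then forward from the paragraph returned by getNextPara.
sub oneLoop(iter)
    start = iter.getNextPara()
    Dim paragraphs(0 to 25350) ' Just the number of paragraphs in this document
    dim para as object
    para = start
    n = 0
    ' Walk backward from the starting paragraph to the start of the document
    do while not IsNull(para)
      paragraphs(n) = para
      n = n + 1
      para = iter.getParaBefore(para)
    loop
    ' Walk forward from the starting paragraph to the end of the document
    para = iter.getParaAfter(start)
    do while not IsNull(para)
      paragraphs(n) = para
      n = n + 1
      para = iter.getParaAfter(para)
    loop
end sub

' Repeat the collection many times with the same iterator;
' the local array is discarded after every call to oneLoop.
sub testOOM
  doc = thisComponent
  iter = doc.getFlatParagraphIterator(com.sun.star.text.TextMarkupType.PROOFREADING, true)
  for i = 0 to 1000000
    oneLoop(iter)
  next i
end sub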

===

Running 'testOOM' results in slowly but steadily growing memory consumption, and eventually in an out-of-memory (OOM) failure.

The problem is the m_aFlatParaList member in the implementation of XFlatParagraphIterator [1]. It was added in the initial commits that introduced the API: [2] [3]. It is not used elsewhere, and using it to keep references to objects that are already managed by the UNO refcounting mechanism is questionable, since each stored reference keeps its paragraph object alive.

The easyhack is to just drop the list.

[1] https://git.libreoffice.org/core/+/master/sw/source/core/inc/unoflatpara.hxx#127
[2] https://git.libreoffice.org/core/+/677eba2322d2753951024c688d59553182bf2fbd%5E%21/
[3] https://git.libreoffice.org/core/+/ba76230f6f677774b0d333da946a7e487acbeb0b%5E%21/
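
To illustrate why such a list makes memory grow, here is a minimal toy model in C++. It is not the LibreOffice code: std::shared_ptr stands in for UNO reference counting, and the names FlatPara, RetainingIterator, PlainIterator and liveAfter are hypothetical, chosen only for this sketch. It assumes the iterator creates a new object on every call and, in the retaining variant, keeps one extra reference to it in a member container.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a flat paragraph object; counts live instances.
struct FlatPara {
    static std::size_t live;
    FlatPara() { ++live; }
    ~FlatPara() { --live; }
    FlatPara(const FlatPara&) = delete;
    FlatPara& operator=(const FlatPara&) = delete;
};
std::size_t FlatPara::live = 0;

// Toy iterator that remembers every object it hands out
// (the pattern attributed to m_aFlatParaList above).
class RetainingIterator {
public:
    std::shared_ptr<FlatPara> next() {
        auto p = std::make_shared<FlatPara>();
        m_handedOut.push_back(p); // extra reference, held while the iterator lives
        return p;
    }
private:
    std::vector<std::shared_ptr<FlatPara>> m_handedOut;
};

// Toy iterator without the list: only the caller holds a reference.
class PlainIterator {
public:
    std::shared_ptr<FlatPara> next() { return std::make_shared<FlatPara>(); }
};

// Calls next() repeatedly, dropping each result immediately, and reports
// how many objects are still alive while the iterator itself still exists.
template <class Iter>
std::size_t liveAfter(std::size_t calls) {
    Iter it;
    for (std::size_t i = 0; i < calls; ++i) {
        auto p = it.next(); // the caller's reference ends with this iteration
    }
    return FlatPara::live;
}
```

In this model, liveAfter<RetainingIterator>(1000) returns 1000, because the member container keeps every object alive, while liveAfter<PlainIterator>(1000) returns 0, because each object is destroyed as soon as the caller releases it.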
Comment 4 Mike Kaganski 2023-05-12 20:12:08 UTC
I see it's a bit pressing for the participants; let's not wait for an easyhacker.

https://gerrit.libreoffice.org/c/core/+/151712
Comment 5 Commit Notification 2023-05-13 05:23:50 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/a7ce722b476c4bb0c9a113ae0c2759181edfe48f

tdf#155232: drop m_aFlatParaList from SwXFlatParagraphIterator

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 6 Commit Notification 2023-05-15 09:02:56 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-7-5":

https://git.libreoffice.org/core/commit/193c0f20fc1f8f836ebdabac0d8a1065162653a7

tdf#155232: drop m_aFlatParaList from SwXFlatParagraphIterator

It will be available in 7.5.4.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.