Bug 158605 - 1,2 GB memory usage for a spreadsheet with simply a lot of cells with multi-line contents
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 24.2.0.0 alpha1+ (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Armin Le Grand
URL:
Whiteboard: target:24.8.0
Keywords: bibisected, bisected, regression
Depends on:
Blocks: Memory
Reported: 2023-12-08 19:48 UTC by Telesto
Modified: 2024-02-13 10:57 UTC

See Also:
Crash report or crash signature:


Description Telesto 2023-12-08 19:48:20 UTC
Description:
1,2 GB memory usage for a spreadsheet with simply a lot of cells with multi-line contents

Steps to Reproduce:
1. Open attachment 151324 (bug 125236)
2. Look at the RAM usage


Actual Results:
1,2 GB with master (x64)
840 MB with version 6.4.0.2 (x86)
500 MB with 4.4.7.2 (x86)

Expected Results:
600-700 MB or so for x64


Reproducible: Always


User Profile Reset: No

Additional Info:
Version: 24.2.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: a9ad36ae46ff76c0d59b0d170314fdd3a9ee5d35
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: nl-NL (nl_NL); UI: en-US
Calc: CL threaded
Comment 1 raal 2023-12-16 19:14:13 UTC
1,2 GB with Version: 24.2.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: 0f82e9d42822e627edd1fb3b3c87e1f8a22136a4
CPU threads: 4; OS: Linux 6.2; UI render: default; VCL: gtk3
Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
Calc: threaded

800 MB with Version: 7.3.7.2 / LibreOffice Community
Build ID: 30(Build:2)
CPU threads: 4; OS: Linux 6.2; UI render: default; VCL: gtk3
Locale: cs-CZ (cs_CZ.UTF-8); UI: cs-CZ
Ubuntu package version: 1:7.3.7-0ubuntu0.22.04.4
Calc: threaded
Comment 2 raal 2023-12-16 19:53:36 UTC
This seems to have begun at the below commit in bibisect repository/OS linux-64-24.2.
Adding Cc: to Armin Le Grand; could you possibly take a look at this one?
Thanks
 73d0152d9321a5f25f79bcea0a9c75dca9e3af20 is the first bad commit
commit 73d0152d9321a5f25f79bcea0a9c75dca9e3af20
Author: Jenkins Build User <tdf@maggie.tdf>
Date:   Tue Nov 21 16:33:24 2023 +0100

    source 2b4cb63a4450aff4582994ca6ac701287da61ddd

159685: Cleanup some ScPatternAttr specific stuff | https://gerrit.libreoffice.org/c/core/+/159685
Comment 3 Armin Le Grand 2024-01-02 15:24:12 UTC
I checked that the ScPatternAttr re-use mechanism works, and it does: only a handful (3) of them are instantiated and used. That is what I was working on. I also checked that this still holds when I change something (e.g. to 'fat'); that works flawlessly too.
So I have no clue which extra objects need that memory. It is definitely not related to ScPatternAttr; I checked that.
Comment 4 Armin Le Grand 2024-01-02 15:26:20 UTC
In the meantime I also did https://gerrit.libreoffice.org/c/core/+/157559, but that also applies to ScPatternAttr (which I checked as described). Does anyone have an idea which objects or types of objects are being used more? It's not ScPatternAttr...
Comment 5 Armin Le Grand 2024-01-02 15:29:12 UTC
Checked out b5a35f09b3c3ccac26b403e881c799e0d09bf42a (the last commit before the claimed one) and building...
Comment 6 Armin Le Grand 2024-01-02 16:52:23 UTC
Indeed less mem - now adding that change...
Comment 7 Armin Le Grand 2024-01-02 20:20:28 UTC
When using
    ITEM_CLASSIC_MODE=1 instdir/program/soffice
to start the office, the memory usage is back to normal, so the cause is that one Item is held many times as identical instances.
The ItemSet rework contains a paradigm change: for runtime and classic reasons LO no longer tries to reduce identical items (see the commit text in https://gerrit.libreoffice.org/c/core/+/157559).

For usages like this (the example file triggers it because everything in it has the same content, so it is less a use case than a stress test) it is worth checking which Item is involved and looking for a solution. Will do so...
Comment 8 Armin Le Grand 2024-01-03 10:46:46 UTC
Measurement results with that doc:
4030 -> SvxFontItem, EE_CHAR_FONTINFO: 900928 instances
4046 -> SvxFontItem, EE_CHAR_FONTINFO_CJK: 900928 instances
4047 -> SvxFontItem, EE_CHAR_FONTINFO_CTL: 900928 instances
4018 -> SfxBoolItem, EE_PARA_BULLETSTATE: 844619 instances

These are the highest, all other Items are marginal. Thus we have a problem with SvxFontItem, thinking about possibilities...
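
To illustrate why identical, unshared Items add up, here is a minimal Python sketch of value-based sharing. It is not LibreOffice code: `FontItem`, `ItemPool`, `share`, `build_cells` and `distinct_instances` are hypothetical names, and the real SfxPoolItem handling is ref-counted C++. The sketch only shows that routing equal values through one pool collapses many instances into one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontItem:
    """Hypothetical stand-in for an immutable font attribute item."""
    family: str
    charset: str


class ItemPool:
    """Hypothetical global pool handing out one shared instance per value."""

    def __init__(self):
        self._shared = {}

    def share(self, item):
        # Returns the already stored equal item, or stores and returns this one.
        return self._shared.setdefault(item, item)

    def __len__(self):
        return len(self._shared)


def build_cells(count, pool=None):
    """Create one font item per cell, optionally routed through the pool."""
    cells = []
    for _ in range(count):
        item = FontItem("Liberation Sans", "utf-8")
        cells.append(pool.share(item) if pool is not None else item)
    return cells


def distinct_instances(cells):
    """Count distinct objects (by identity) that the cells keep alive."""
    return len({id(cell) for cell in cells})
```

With 1000 cells, `distinct_instances(build_cells(1000))` counts 1000 objects, while passing an `ItemPool()` leaves a single shared one.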
Comment 9 Armin Le Grand 2024-01-03 10:55:53 UTC
Also interesting, times using "time OOO_EXIT_POST_STARTUP=1 instdir/program/soffice ~/Downloads/125236\ LO_54638_test.ods":

normal:  user    1m30,840s
classic: user    1m25,753s

This means that the extra runtime spent on reducing copies of the items pays off slightly in this extreme case through less computing time and fewer memory accesses. Thus it makes sense to think about a general solution. The current form of ITEM_CLASSIC_MODE only exists to find cases like this; it no longer fits the concept and cannot simply be activated, as that would unfortunately add bad runtime effects in many other cases.
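
The two `user` times above can be compared with a short calculation. This is a sketch: `parse_user_time` is a hypothetical helper that assumes the `1m30,840s` style printed by the shell's `time` in this locale, and the result rests on the single run quoted above.

```python
import re


def parse_user_time(text):
    """Convert a value such as '1m30,840s' into seconds (comma or dot decimals)."""
    match = re.fullmatch(r"(\d+)m(\d+)[,.](\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unexpected time format: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


normal = parse_user_time("1m30,840s")    # default mode
classic = parse_user_time("1m25,753s")   # ITEM_CLASSIC_MODE=1

# Relative saving of classic mode compared to the default mode.
saving = (normal - classic) / normal
```

For these two values the saving works out to roughly 5.6 % of user time.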
Comment 10 Armin Le Grand 2024-01-03 14:23:27 UTC
I started making changes, but ran into things that are in the way. I had already thought about that and know how to change it, which is now possible after the previous steps of the Item refactoring, but I have to do that first. So I have to pause this task for now.
Comment 11 Armin Le Grand 2024-01-11 17:13:28 UTC
Moving forward: the needed Item change is at https://gerrit.libreoffice.org/c/core/+/161896
Comment 12 Armin Le Grand 2024-01-12 15:03:34 UTC
Needed change is in master, continuing here
Comment 13 Armin Le Grand 2024-01-15 18:01:57 UTC
Still hard s**t happening - I managed to do a good global buffering, but the Surrogate mechanism HITS by changing even ref-counted Items (!), which means it has *no idea* what it is changing when const_casting and changing these Items - sigh...
Comment 14 Armin Le Grand 2024-01-18 19:55:12 UTC
I have now measured a master pro build loading 125236 LO_54638_test.ods; it uses 891,3 MB. I have added statistics about non-reused/non-shared Items and added all of those with a count > 1000 to the new global buffering.
Maybe there is something else using more memory? Checking...
Comment 15 Armin Le Grand 2024-01-18 19:57:34 UTC
NOTE: Time with changes using
  time OOO_EXIT_POST_STARTUP=1 instdir/program/soffice ~/Downloads/125236\ LO_54638_test.ods
is pretty much the same, got a little faster.
Comment 16 Armin Le Grand 2024-01-18 20:25:25 UTC
Added a mechanism to disable that stuff when needed (also good for errors coming up). This time using the debug dev version:

With feature disabled: 1,3 GB     time: user    1m30,967s
With feature enabled:  955,2 MiB  time: user    1m31,770s

Interesting: with the feature disabled, shutdown is noticeably slower (due to the number of Items instantiated), so loading/handling before that must be faster, given that the total time is pretty much identical.
Is there a good way to get more precise memory values? I just used the Ubuntu system monitor.
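
One option on Linux, offered as an assumption about the setup rather than as the method used in this report, is to read the kernel's own counters from `/proc/<pid>/status`: `VmRSS` is the current resident set size and `VmHWM` its peak, both in kB. The helper names below are hypothetical.

```python
def parse_status_kib(status_text, field):
    """Return the integer kB value of a field such as 'VmRSS', or None."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            # Lines look like "VmRSS:    123456 kB".
            return int(line.split()[1])
    return None


def read_memory(pid):
    """Read current (VmRSS) and peak (VmHWM) resident memory of a process."""
    with open(f"/proc/{pid}/status", encoding="ascii") as handle:
        text = handle.read()
    return {name: parse_status_kib(text, name) for name in ("VmRSS", "VmHWM")}
```

Calling `read_memory(pid)` with the PID of the running office process after the document has loaded gives numbers that can be compared between builds more reliably than a reading taken by eye from a GUI monitor.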
Comment 17 Commit Notification 2024-01-23 17:56:30 UTC
Armin Le Grand (allotropia) committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/063781f4a94e3960a2cb40d1981c0e0ef9a73153

tdf#158605 Add global SfxPoolItem re-use

It will be available in 24.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 18 Armin Le Grand 2024-01-29 16:18:48 UTC
Down to 955 MB, but not to the 860 MB mentioned. Using the flag SVL_SHARE_ITEMS_GLOBALLY_INSTANTLY=1 means that for the Items everything is done as before the mentioned change (except that sharing is now extended and global). If memory usage does not return to the old numbers, then this no longer has anything to do with Item sharing or that change; I do not know what else might cause it.

From the Item sharing POV there is nothing more I can do here. I also have no hints as to what else might have caused that, but probably something else has happened in between. Comment 7 shows that Item sharing was involved, but that is back...
Comment 19 kabilo 2024-02-13 10:56:35 UTC
This file, resaved in xlsx format, requires less memory (around 750 MB). Opening the file is also faster than with the ods format.
Excel shows much lower memory consumption when the file is opened (only around 70 MB), and moving around the sheet is also much smoother.
Unfortunately.