Bug 165481 - multi GB memory leak caused by "Save AutoRecovery information" since LibreOffice Calc 24.2.7 on Ubuntu 24.04
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 24.8.4.2 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Depends on:
Blocks: Memory
Reported: 2025-02-27 06:59 UTC by satphil
Modified: 2025-06-12 23:17 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
sample 10 worksheet 0.5MB spreadsheet - autosave leaks 200 MB memory (541.31 KB, application/octet-stream)
2025-02-27 08:04 UTC, satphil
Snapshot of LibreOffice running in the macOS Instruments application's "Leaks" tool (1.81 MB, image/png)
2025-06-04 19:37 UTC, Patrick (volunteer)

Description satphil 2025-02-27 06:59:03 UTC
Description:
FILESAVE. Back in the good old days of Ubuntu 22.04 running LibreOffice Calc 1:7.3.7-0ubuntu0.22.04.7 on amd64, I could use "Save AutoRecovery information" to autosave a 5MB, 277-sheet spreadsheet every 15 minutes without disruption. Since upgrading to Ubuntu 24.04 with LO Calc 4:24.2.7 on amd64, and to Ubuntu 24.10 with LO Calc 4:24.8.4.2 on aarch64, I experience massive 5 GB memory utilisation blowouts in the soffice.bin process during autosave, causing painful delays of up to a minute. The blowout occurs even if the only change is to bold the text in one cell. The delay shrinks as I delete worksheets from the spreadsheet, e.g. a 10-worksheet sample sees a 200MB memory blowout.

Steps to Reproduce:
1. Run gnome-system-monitor to monitor overall memory utilisation, and ps -ely | awk '/soffice.bin$/{print "RSS: "$8" / SZ: "$9}' to monitor the soffice.bin process, e.g. before Calc: 3.5GB, 453K/397K
2. Open the sample 0.5MB spreadsheet; Tools -> Options: Load/Save: General: Save: tick "Save AutoRecovery information every" and set it to 1 minute. Start a 1 minute timer, e.g. on your phone, then bold the text in a cell. Memory usage stays much the same as above
3. After the minute is up, wait for autosave (orange bar runs along bottom of Calc window) and then re-check memory: Now 3.7GB 650K/495K
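
For anyone who wants to log the before/after readings rather than eyeball them, here is a minimal Python sketch of the same measurement. It assumes the procps `ps -ely` column layout (S UID PID PPID C PRI NI RSS SZ WCHAN TTY TIME CMD), which is what the awk fields $8 and $9 above rely on. The function names `parse_ps_ely` and `sample_soffice` are made up for this illustration.

```python
import subprocess


def parse_ps_ely(output, command="soffice.bin"):
    """Return (rss, sz) pairs for each process whose command name matches."""
    readings = []
    for line in output.splitlines():
        fields = line.split()
        # Assumed `ps -ely` layout:
        # S UID PID PPID C PRI NI RSS SZ WCHAN TTY TIME CMD
        if len(fields) >= 13 and fields[-1] == command:
            # fields[7] is awk's $8 (RSS), fields[8] is awk's $9 (SZ)
            readings.append((int(fields[7]), int(fields[8])))
    return readings


def sample_soffice():
    """Run `ps -ely` once and return the soffice.bin readings."""
    result = subprocess.run(
        ["ps", "-ely"], capture_output=True, text=True, check=True
    )
    return parse_ps_ely(result.stdout)
```

Calling `sample_soffice()` once after step 2 and again after step 3 gives the two readings to compare.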

Actual Results:
In this case of a 10 worksheet 0.5MB sample, you see a 200MB memory blow-out.
By the time you get to 277 worksheet 4.8MB spreadsheet, the memory blows out from 4.4 to 9.1 GB, and soffice.bin process memory blows out:
After change but before autosave:
RSS:  747012 SZ:  470429
After autosave:
RSS: 5356176 SZ: 1752282

Memory is not released until you fully exit all LibreOffice windows.
In my case I had 16GB physical memory to hold the blowout but even so it took over a minute to autosave, severely impacting the usability of LO Calc. It's foreseeable that with a more typical 8 GB laptop, you're going to be swapping to disk and really hang the user for minutes.

Expected Results:
I expected the performance I saw in LibreOffice Calc 7.3.7 on Ubuntu 22.04.07 which was instantaneous autosaves with no perceptible delays.


Reproducible: Always


User Profile Reset: Yes

Additional Info:
The problem occurs with both .ods and .xls versions of the file, on both Calc 24.2.7 on Ubuntu 24.04 amd64 and Calc 24.8.4.2 on Ubuntu 24.10 aarch64, and with autosave set to either 1 minute or 10 minutes.

Version: 24.8.4.2 (AARCH64) / LibreOffice Community
Build ID: 480(Build:2)
CPU threads: 8; OS: Linux (misparsed version); UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Ubuntu package version: 4:24.8.4-0ubuntu0.24.10.2
Calc: threaded
Comment 1 satphil 2025-02-27 08:04:14 UTC
Created attachment 199495
sample 10 worksheet 0.5MB spreadsheet - autosave leaks 200 MB memory
Comment 2 Andrew Kopf 2025-05-31 00:14:36 UTC
Hello satphil@gmail.com,
I can see the additional memory use with autosave on, but was unable to create a saving slowdown. I am leaving this unconfirmed, as I don't know what memory usage autosave should reasonably use. I saw an additional 1.1 gigs of memory with both versions below, with 88 worksheets. ~320mb with autosave off vs 1.4gb with autosave on. 


Version: 25.8.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: 19f3b72f34c487dc97d582712d21734a7e055fd5
CPU threads: 22; OS: Windows 11 X86_64 (build 26100); UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL threaded

and 

Version: 25.2.3.2 (X86_64) / LibreOffice Community
Build ID: bbb074479178df812d175f709636b368952c2ce3
CPU threads: 22; OS: Windows 11 X86_64 (10.0 build 26100); UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL threaded
Comment 3 satphil 2025-05-31 04:53:49 UTC
Thanks for getting back to me Andrew. 

The saving slowdown is caused by, I assume, your average user not having (in my case) 5 GB of memory sitting around unused, and so the memory blowout causes 5 GB of memory in use to be written to swap on disk to make room for the blowout. Moreover this memory blowout is retained until you exit all LO processes and keeps getting pulled into active memory for every autosave.

The fact this didn't happen in LibreOffice 7.3.7 indicates it really isn't necessary to need 5 GB extra memory to autosave a 5 MB xls (i.e. uncompressed) file. 

I'm guessing your 88 worksheet test file was about 1.5 MB in size (as an .xls), but it needed an extra 1,100 MB to autosave - that's a 700:1 ratio! I'm struggling to think of an algorithm that could be so profligate (memory-vendor algorithms excepted): "I'm going to write this byte, but let me make 700 copies first, just to be sure".

Can you kindly reconsider the confirmation status?
Comment 4 Andrew Kopf 2025-06-01 00:11:07 UTC
Hey satphil@gmail.com, 
I am sorry if I indicated it wasn't a bug; I was saying I wasn't sure I had enough experience and knowledge to judge the situation correctly, as I have only been doing this for a few weeks. I wanted to wait for a tester with more experience for a second opinion. I have a meeting on Wednesday and will get back to you then.  I do know that Excel hates this file.  When I tried to open the sample file I expanded to 88 worksheets in Excel, it took about 20 minutes to get to an editable state, and Excel mentioned unreadable content. 

Sorry for the delay.
Comment 5 Charles Williams 2025-06-01 10:53:40 UTC
On a Mac:

Version: 25.2.3.2 (AARCH64) / LibreOffice Community
Build ID: bbb074479178df812d175f709636b368952c2ce3
CPU threads: 8; OS: macOS 15.5; UI render: default; VCL: osx
Locale: en-GB (en_GB.UTF-8); UI: en-US
Calc: threaded

and monitoring memory usage with Activity Monitor:

When the example spreadsheet was initially opened LO was using approximately 660MB. When the autosave triggers, as described above, this jumps by about 200MB to 860MB but, assuming there is no further activity, it drops back down to ca 660MB within a minute or so.

I'm not sure that this confirms a 'bug'. Memory used by LO has always seemed a bit 'peaky/uneconomical' to me, e.g. when compared with single-platform Mac Applications. With the example spreadsheet used here, simply changing the selected cell with a single mouse click causes the memory utilisation to transiently increase by up to 100MB before dropping back down again over the next minute or so.
Comment 6 satphil 2025-06-04 04:36:00 UTC
Thanks for following up on this, Charles.

I'd make a couple of points. 

This didn't happen with LO 7.3.7 and earlier so it would appear autosaving doesn't inherently require vast sweeps of vacant memory.

That 200 MB blowout you saw was with my sample 0.5 MB file. A 400:1 ratio for auto-saving something that takes no extra memory if doing a regular save.

"A hundred megs here, a hundred megs there", becomes gigs with non-trivial files (5 GB for my 5 MB file - 1000:1 ratio). 

I find it hard to believe this is acceptable memory utilisation.
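
The ratios quoted in this thread can be rechecked with a few lines of Python. The inputs are the round figures given in the comments. The 1.5 MB size of the 88-sheet file is the guess from comment 3, not a measured value, and 5 GB is taken as 5,000 MB.

```python
def blowout_ratio(extra_memory_mb, file_size_mb):
    """Extra memory used during autosave divided by the file size."""
    return extra_memory_mb / file_size_mb


# (extra memory in MB, file size in MB), as quoted in the comments
cases = {
    "10-sheet sample": (200, 0.5),
    "88-sheet test file (size is a guess)": (1100, 1.5),
    "277-sheet file": (5000, 5),
}

for label, (extra_mb, size_mb) in cases.items():
    print(f"{label}: {blowout_ratio(extra_mb, size_mb):.0f}:1")
```

This gives 400:1, about 733:1 (rounded down to 700:1 in comment 3), and 1000:1.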

Can I suggest that what is not so much a leak as a gusher could do with further investigation.
Comment 7 Patrick (volunteer) 2025-06-04 19:37:59 UTC
Created attachment 201102
Snapshot of LibreOffice running in the macOS Instruments application's "Leaks" tool

I was able to reproduce this bug in my local macOS master build while running the macOS Instruments application's "Leaks" tool. Instruments reported no memory leaks but the net memory allocations do go up by at least 100 MB and Instruments noted that a lot of that is allocated within ScColContainer::resize().

I am not familiar with the Calc code so who knows what necessary work ScColContainer::resize() does, but I did add an fprintf() (see my debug patch below) and during autosave I get the following output:

  Old vs new column size: 257 16384

So it looks like during autosave, LibreOffice is allocating memory for each of the empty columns in each sheet. I also get the same output when I open the document and immediately save it.

Maybe a Calc developer might be able to check if this expansion of columns is necessary when saving?:

diff --git a/sc/source/core/data/colcontainer.cxx b/sc/source/core/data/colcontainer.cxx
index a0a9d845772f..6961403673dd 100644
--- a/sc/source/core/data/colcontainer.cxx
+++ b/sc/source/core/data/colcontainer.cxx
@@ -48,6 +48,7 @@ void ScColContainer::resize( ScSheetLimits const & rSheetLimits, const size_t aN
 {
     size_t aOldColSize = aCols.size();
     aCols.resize( aNewColSize );
+    fprintf( stderr, "Old vs new column size: %zu %zu\n", aOldColSize, aNewColSize );
     for ( size_t nCol = aOldColSize; nCol < aNewColSize; ++nCol )
         aCols[nCol].reset(new ScColumn(rSheetLimits));
 }
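
To get a feel for what a 257 to 16384 resize means in object counts, here is a toy Python model of the loop in the patch context above. It is not LibreOffice code: `ToyColumn` and `ToyColContainer` are made-up stand-ins, and the assumption that every sheet of the 10-sheet sample is resized the same way is not confirmed anywhere in this report.

```python
class ToyColumn:
    """Stand-in for a per-column object; not the real ScColumn."""

    def __init__(self):
        self.cells = {}


class ToyColContainer:
    """Toy container that, like the loop above, allocates one object per column."""

    def __init__(self, initial_cols):
        self.cols = [ToyColumn() for _ in range(initial_cols)]

    def resize(self, new_size):
        """Grow to new_size columns and return how many objects were allocated.

        This toy only grows; it never shrinks.
        """
        old_size = len(self.cols)
        self.cols.extend(ToyColumn() for _ in range(old_size, new_size))
        return len(self.cols) - old_size


# Assumption: 10 sheets, each starting at 257 allocated columns.
sheets = [ToyColContainer(257) for _ in range(10)]
allocated = sum(sheet.resize(16384) for sheet in sheets)
print(allocated)  # 161270 new column objects
```

Each sheet gains 16384 - 257 = 16127 column objects, so even a modest per-object cost adds up quickly across sheets.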
Comment 8 Patrick (volunteer) 2025-06-04 19:41:18 UTC
(In reply to Patrick (volunteer) from comment #7)
> I am not familiar with the Calc code so who knows what necessary work
> ScColContainer::resize() does, but I did add an fprintf() (see my debug
> patch below) and during autosave I get the following output:
> 
>   Old vs new column size: 257 16384

I wonder if this is a side effect of increasing the maximum number of columns in a Calc sheet. My memory is hazy, but in the recent past I remember Calc increasing the maximum columns from 8192 to 16384.
Comment 9 satphil 2025-06-05 03:47:22 UTC
Thanks Patrick for investigating this, much appreciated.

Allocating memory for 16,000 phantom columns does seem puzzling.
Comment 10 Andrew Kopf 2025-06-05 07:03:15 UTC
Hello Satphil,
I bibisected the bug per a suggestion from my meeting.

https://git.libreoffice.org/core/+/b8720d1e1f0842d52f1830c48ef7551b1868ae6f

fix ScTable::GetLastChangedCol() for unallocated columns

Column flags and widths are stored separately from ScColumn data,
and so don't depend on the allocated column count.

This appears to be the first instance of this behavior showing up and may be why it is using so many resources.

----------------------------

Note that this is before the following fix was put into place.  You will need to wait at least 10 minutes for the behavior to exhibit itself.
 
https://git.libreoffice.org/core/+/aeb8a0076cd5ec2836b3dfc1adffcced432f995f%5E%21/#F0

My first attempt to bibisect the bug led to this commit, as I had set autosave to 1 minute to save time.

Autorecovery information was saved in every 10 minutes,
regardless of the "Save Autorecovery information every" setting.
Comment 11 satphil 2025-06-06 05:05:44 UTC
Great work, Andrew. Thanks for bisecting this bug.

So if I understand it, and forgive my extreme ignorance if I've got it wrong, the rationale for commit 

https://git.libreoffice.org/core/+/b8720d1e1f0842d52f1830c48ef7551b1868ae6f

is "Column flags and widths are stored separately from ScColumn data, and so don't depend on the allocated column count."

So I guess the commit allocates memory for column flags and widths up to the max 16,384 columns per worksheet (rather than up to the last column with data) even if there is no data in these columns. 

But my question is why would you bother with flag and width data if there's no data in these columns? Seems a *lot* of memory allocation for no obvious benefit.
Comment 12 Noel Grandin 2025-06-12 06:36:16 UTC
(In reply to Patrick (volunteer) from comment #7)
> 
> I am not familiar with the Calc code so who knows what necessary work
> ScColContainer::resize() does, but I did add an fprintf() (see my debug
> patch below) and during autosave I get the following output:
> 
>   Old vs new column size: 257 16384
> 

This should not be happening, something is accidentally triggering an unnecessary resize. This is likely the real problem.


(In reply to satphil from comment #11)
> But my question is why would you bother with flag and width data if there's
> no data in these columns? Seems a *lot* of memory allocation for no obvious
> benefit.

The "width" of those two data structures is largely irrelevant, because they are CompressedArray, so they compress repeating data, which means the "extra" columns only cost a single entry.
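
The idea of compressing repeating data can be illustrated with a small run-length sketch in Python. This is only a generic illustration of the technique, not the actual CompressedArray class or its API. The width values 1280 and 3000 are arbitrary numbers chosen for the example.

```python
from bisect import bisect_left


def compress(values):
    """Collapse consecutive equal values into (last_index, value) runs."""
    runs = []
    for index, value in enumerate(values):
        if runs and runs[-1][1] == value:
            runs[-1] = (index, value)  # extend the current run
        else:
            runs.append((index, value))  # start a new run
    return runs


def lookup(runs, index):
    """Return the value at index by finding the run that covers it."""
    ends = [end for end, _ in runs]
    return runs[bisect_left(ends, index)][1]


# 16384 columns, all at an arbitrary default width except column 2.
widths = [1280] * 16384
widths[2] = 3000
runs = compress(widths)
print(len(runs))  # 3 entries instead of 16384
```

Thousands of trailing columns with identical settings collapse into one run, which is why per-column flags and widths stay cheap even at the 16384-column maximum.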
Comment 13 Patrick (volunteer) 2025-06-12 23:17:42 UTC
(In reply to Noel Grandin from comment #12)
> The "width" of those two data structures is largely irrelevant, because they
> are CompressedArray, so they compress repeating data, which means the
> "extra" columns only cost a single entry.

From what I can see in the bottom right corner of attachment #201102, it looks like it is the ScColumn class that is using up all that space. I am guessing that the 71.36 MB of memory for the "operator new" block at the bottom of the stack is the "aCols[nCol].reset(new ScColumn(rSheetLimits));" line in my debug patch in comment #7. Maybe ScColumn has a bunch of memory-using objects hanging off of it?

Don't know if this is possible, but maybe using a single shared ScColumn "empty column" instance could significantly reduce the ScColumn memory usage. Don't know if two columns can safely share the same ScColumn instance though.
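
A rough Python sketch of what a shared "empty column" could look like, purely to make the idea concrete. All names here (`ToyColumn`, `LazyColContainer`, `get_for_read`, `get_for_write`) are hypothetical, and whether Calc's real code could safely share one instance like this is exactly the open question above.

```python
class ToyColumn:
    """Stand-in for a per-column object; not the real ScColumn."""

    def __init__(self):
        self.cells = {}


class LazyColContainer:
    """Hypothetical container that allocates a column only on first write."""

    _EMPTY = ToyColumn()  # single shared placeholder; callers must not modify it

    def __init__(self, max_cols):
        self.max_cols = max_cols
        self._cols = {}  # column index -> ToyColumn, only for written columns

    def _check(self, col):
        if not 0 <= col < self.max_cols:
            raise IndexError(col)

    def get_for_read(self, col):
        """Return the real column if allocated, else the shared empty one."""
        self._check(col)
        return self._cols.get(col, self._EMPTY)

    def get_for_write(self, col):
        """Allocate the column on first write and return it."""
        self._check(col)
        if col not in self._cols:
            self._cols[col] = ToyColumn()
        return self._cols[col]

    def allocated_count(self):
        return len(self._cols)
```

In this sketch, reading any of the 16384 columns allocates nothing, and only columns that are actually written to get their own object. The cost is that every caller has to pick the read or write accessor correctly.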