Bug 148094 - Calc Hangs and uses ever increasing amount of memory with lots of MATCH(1, ("X"=range) * ("Y"=range), 0) EDITING/OPENING
Summary: Calc Hangs and uses ever increasing amount of memory with lots of MATCH(1, ("...
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
7.1.8.1 release
Hardware: x86-64 (AMD64) All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: perf
Depends on:
Blocks: Calc-Function
  Show dependency treegraph
 
Reported: 2022-03-20 08:59 UTC by Peter
Modified: 2024-07-13 18:01 UTC (History)
8 users (show)

See Also:
Crash report or crash signature:


Attachments
Demonstration Spreadsheet (1.50 MB, text/plain)
2022-03-20 09:01 UTC, Peter
Details
Demonstration Spreadsheet (877.04 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2022-03-20 09:03 UTC, Peter
Details
Apple Crash Report - Demo Spreadsheet Memory Exponential increase (1.50 MB, text/plain)
2022-03-20 09:04 UTC, Peter
Details
Apple Crash Report - 2.5 MB Spreadsheet (2.23 MB, text/plain)
2022-03-20 09:06 UTC, Peter
Details
Screen Shot of DEMO spreadsheet Saving (516.19 KB, image/png)
2022-03-20 09:07 UTC, Peter
Details
Screenshot of hang while calculating lots of INDEX(MATCH) formulas on fill (274.96 KB, image/png)
2022-03-20 09:09 UTC, Peter
Details
Screenshot of RAM stats during LibreOffice Hang + Memory Leak (61.46 KB, image/png)
2022-03-20 09:10 UTC, Peter
Details
When I tried to archive user profile to upload (111.53 KB, image/png)
2022-03-20 09:11 UTC, Peter
Details
RAM Statistics -- After 2.5MB spreadsheet memory leak left for a while. The Inverse spike is the moment I force quit Libreoffice. (47.75 KB, image/png)
2022-03-20 09:12 UTC, Peter
Details
RAM Statistc - A few force quits in regular succession (61.24 KB, image/png)
2022-03-20 09:15 UTC, Peter
Details
Flamegraph (415.91 KB, image/svg+xml)
2022-04-01 07:13 UTC, Julien Nabet
Details
Sample during hang using the Activity Monitor application (636.16 KB, text/plain)
2024-07-10 10:28 UTC, Patrick (volunteer)
Details
CPU Profile during hang using Xcode's Instruments application (2.26 MB, image/png)
2024-07-10 17:43 UTC, Patrick (volunteer)
Details
Allocations before and during hang using Xcode's Instruments application (1.94 MB, image/png)
2024-07-12 13:52 UTC, Patrick (volunteer)
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Peter 2022-03-20 08:59:21 UTC
Description:
I have a 2.5MB spreadsheet with which uses lots of INDEX(range,MATCH(1, ("X"=range) * ("Y"=range), 0),1) formulas (to do a lookup based on two criteria). 

If you open this spreadsheet, or randomly afterwards LibreOffice hangs and starts using experiential amounts of RAM. I have 8GM ram and if I don't force quite before it gets to 5/6GB just for libreoffice the whole complete locks up.

Open with autocalculate off isn't possible (and having auto calculate on/off doesn't seem to matter). If you open the same file in MS Excel 2011 it opens fine.

I've created a simplified version of this spreadsheet which is smaller, and only contains the formulas above (see attached). Although this doesn't crash at open very often. It does fairly regularly exhibit the same behaviour where it hangs and start using ever more RAM as the bigger spreadsheet does. I've not been able to deduce what causes this.

There is also performance issue with calculating spreadsheet with such formulas. A sheet with 400 such formulas (with Multithreading enabler take a good few minutes too calculate (even with multithreading on it still only seems to use one core's worth of CPU). 

And odd other performances issues, like it taking 10 seconds to add a new sheet (which is doesn't happen on other spreadsheets). Saving taking 30secs sometimes, but weirdly not always (that's on MATCH()_CRASH_DEMO.xlsx with files size of 200Kb or bigger).

OS Version:      Mac OS X 10.14.6 (Build 18G9323)
Architecture:    x86_64

Command:         soffice
Path:            /Applications/Downloaded/LibreOffice.app/Contents/MacOS/soffice
Identifier:      org.libreoffice.script
Version:         7.1.8.1 (7.1.8.1)
PID:             60566

Event:           hang

Hardware model:  MacBookPro9,2
Active cpus:     4
RAM:             8GB
Hard Drive:      960GB ScanDisk Pro SSD (SanDisk SDSSDXPS960G)
Graphics:        Intel HD Graphics 4000

Tried turning on OpenCL --> Doesn't seem to be available with computer hardware.
Tried turning off Multithreading --> Doesn't seem to make any difference

I've run out of time to keep compiling this bug report. I think I've produced replicable example I could upload, and then I re-open libreoffice in safe mode and struggled to repeat the expolnencail increase in memory use with DEMO spreadsheet. I've definetly crashed full 2.5MB spreadsheet in safe mode, but so far not the DEMO on I've attached. I'll keep using it in safe mode and see how long that last. I can upload my user profile if that's helpful.



Steps to Reproduce:
Reproducing the slow performance on INDEX(MATCH) formulas is very easy:
1. Open spreadsheet attached
2. Go to first sheet 'INDEX(MATCH)'
3. Drag down another 100 rows.

Reproducing the memory fault is harder to pin down.
1. Open spreadsheet attached
2. If that's not enough then trying to work on it for a few minutes is usually sufficient
3.

Actual Results:
Hang + Memory Leak a seemingly random intervals. 

Hang without memory increase for long time whilst calculating.


Expected Results:
Calculations that take significant amount of time should not lock up UI, or cause application to be un-responsive. They should just stop use editing spreadsheet and indicate some how that recalculation is in progress.

Under no circumstance should LibreOffice start using infinitely more RAM until computer locks up.


Reproducible: Sometimes


User Profile Reset: No



Additional Info:
Not sure if either of these bugs relate to issues I'm having:
https://bugs.documentfoundation.org/show_bug.cgi?id=139444
https://bugs.documentfoundation.org/show_bug.cgi?id=144142


Due to this bug I can't run latest version:
https://bugs.documentfoundation.org/show_bug.cgi?id=148069
Comment 1 Peter 2022-03-20 09:01:17 UTC
Created attachment 178981 [details]
Demonstration Spreadsheet

This spreadsheet contains the INDEX(range,MATCH(1, ("X"=range) * ("Y"=range), 0),1) formulas (to do a lookup based on two criteria).
Comment 2 Peter 2022-03-20 09:03:11 UTC
Created attachment 178982 [details]
Demonstration Spreadsheet

Spreadsheet with INDEX(range,MATCH(1, ("X"=range) * ("Y"=range), 0),1) formulas (to do a lookup based on two criteria).
Comment 3 Peter 2022-03-20 09:04:49 UTC
Created attachment 178983 [details]
Apple Crash Report - Demo Spreadsheet Memory Exponential increase

In normal (i.e. not safe mode) after DEMO spreadsheet (only think open) hang and start using ever increasing amounts of RAM
Comment 4 Peter 2022-03-20 09:06:12 UTC
Created attachment 178984 [details]
Apple Crash Report - 2.5 MB Spreadsheet

In normal (i.e. not safe mode) after 2.5MB spreadsheet hang and started using ever increasing amounts of RAM
Comment 5 Peter 2022-03-20 09:07:33 UTC
Created attachment 178985 [details]
Screen Shot of DEMO spreadsheet Saving

Sometimes saving task 30 seconds and libreoffice stops responding.
Comment 6 Peter 2022-03-20 09:09:02 UTC
Created attachment 178986 [details]
Screenshot of hang while calculating lots of INDEX(MATCH) formulas on fill
Comment 7 Peter 2022-03-20 09:10:55 UTC
Created attachment 178987 [details]
Screenshot of RAM stats during LibreOffice Hang + Memory Leak

This was probably 10-20 seconds after it stopped responding, and if I'd have left another 30 computer would have locked up.
Comment 8 Peter 2022-03-20 09:11:45 UTC
Created attachment 178988 [details]
When I tried to archive user profile to upload
Comment 9 Peter 2022-03-20 09:12:40 UTC
Created attachment 178989 [details]
RAM Statistics -- After 2.5MB spreadsheet memory leak left for a while.

The Inverse spike is the moment I force quit Libreoffice.
Comment 10 Peter 2022-03-20 09:15:03 UTC
Created attachment 178990 [details]
RAM Statistc - A few force quits in regular succession

Note: compared of other screenshot I'd closed a few other applications so more Ram was available.
Comment 11 Xisco Faulí 2022-03-21 11:07:07 UTC
Not reproduced in

Version: 7.4.0.0.alpha0+ / LibreOffice Community
Build ID: f461c889ac4bc053c306537f644ec4bf3e0d5128
CPU threads: 8; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: es-ES (es_ES.UTF-8); UI: en-US
Calc: threaded
Comment 12 Xisco Faulí 2022-03-21 11:07:20 UTC
Could you please try to reproduce it with the latest version of LibreOffice from https://www.libreoffice.org/download/libreoffice-fresh/ ?
I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' if the bug is still present in the latest version.
Comment 13 Peter 2022-03-21 11:31:14 UTC
(In reply to Xisco Faulí from comment #12)
> Could you please try to reproduce it with the latest version of LibreOffice
> from https://www.libreoffice.org/download/libreoffice-fresh/ ?
> I have set the bug's status to 'NEEDINFO'. Please change it back to
> 'UNCONFIRMED' if the bug is still present in the latest version.

That is difficult for me to do because the latest version of Libreoffice crashes with a blank spreadsheet from template for other reason reported in this bug:
https://bugs.documentfoundation.org/show_bug.cgi?id=148069
Comment 14 QA Administrators 2022-03-22 03:57:42 UTC Comment hidden (obsolete)
Comment 15 Roman Kuznetsov 2022-03-31 20:13:28 UTC
I didn't have any memory leaks but I got the 100% CPU loading in

Version: 7.4.0.0.alpha0+ / LibreOffice Community
Build ID: 0a05b1f46263a16c6d40c841a317c3ba9f4d31d6
CPU threads: 4; OS: Mac OS X 12.3.1; UI render: default; VCL: osx
Locale: ru-RU (ru_RU.UTF-8); UI: en-US
Calc: threaded

My steps:

1. Open the example
2. Go to sheet 'INDEX(MATCH)'
3. Select a last row with data on the sheet
4. Try expand/drag it to over 100 rows to down => LO hangs with 100% CPU loading

Julien could you please test it on your machine and if you'll get the same please will create the perfgraph? Thank you
Comment 16 Julien Nabet 2022-04-01 07:13:45 UTC
Created attachment 179245 [details]
Flamegraph

Indeed, I reproduced the hang.
Here's a Flamegraph retrieved on pc Debian x86-64 with master sources updated today (+gen rendering).
Comment 17 Julien Nabet 2022-04-01 07:15:13 UTC
Noel/Eike: thought you might be interested in this perf pb in Calc.
I attached a Flamegraph in a previous comment.
Comment 18 QA Administrators 2024-04-01 03:12:06 UTC Comment hidden (obsolete)
Comment 19 Peter 2024-04-01 10:03:53 UTC
Hi QA Administrators,

As requested I've just run a further tests on new version of LibreOffice. I can't run LibreOffice 24 on my main machine as I'm still running MacOS10.14 which isn't supported, however I'm currently using LibreOffice 7.5.8.2 which is relatively stable in other use cases on my system.

Test Steps:
- I download the Demonstration Spreadsheet attached above.
- I opened it in LibreOffice 7.5.8.2 and waited for a few minutes.
- I noted that LibreOffice was using 2.42GB of ram (I had on other .docx open).
- I clicked around a bit and changed worksheets for 5mins.
- I check ram usage and it had started increasing continuously (3.2, 3.4, 3.6GBs....).
- I force quit LibreOffice

When I've got another machine up and running later OS, I'll try and reproduce in version 24.
Comment 20 Patrick (volunteer) 2024-07-10 10:28:07 UTC
Created attachment 195201 [details]
Sample during hang using the Activity Monitor application
Comment 21 Patrick (volunteer) 2024-07-10 10:34:00 UTC
(In reply to Patrick Luby (volunteer) from comment #20)
> Created attachment 195201 [details]
> Sample during hang using the Activity Monitor application

I am able to reproduce this in my local master build on macOS so this bug is still there.

I was able to obtain a sample during the hang using the Activity Monitor application (see attachment #195201 [details]). Not sure how to interpret the data in the sample but it looks like the Calc code is furiously processing data during the hang.
Comment 22 Noel Grandin 2024-07-10 11:33:44 UTC
Interesting, I don't see a problem with current master, running on Linux. It's slow, but nothing leaks and it doesn't lock up for more than a couple of seconds here and there.
Comment 23 Patrick (volunteer) 2024-07-10 16:38:16 UTC
(In reply to Noel Grandin from comment #22)
> Interesting, I don't see a problem with current master, running on Linux.
> It's slow, but nothing leaks and it doesn't lock up for more than a couple
> of seconds here and there.

I only saw once I changed the steps in comment #15 slightly. I changed steps 3 and 4 to the following:

3. Select only the filled cells in the last row instead of the entire row
4. Drag the "fill" rectangle in the bottom right corner of the selected cells

Selecting an entire row and dragging the row down to a different row worked fine for me.

I will see if I can get any more info by running the CPU Profile option in Xcode's Instruments application.
Comment 24 Patrick (volunteer) 2024-07-10 17:43:44 UTC
Created attachment 195212 [details]
CPU Profile during hang using Xcode's Instruments application
Comment 25 Patrick (volunteer) 2024-07-10 18:02:39 UTC
(In reply to Patrick Luby (volunteer) from comment #24)
> Created attachment 195212 [details]
> CPU Profile during hang using Xcode's Instruments application

I obtained the CPU Profile in attachment #195212 [details] by opening attachment #178982 [details] and doing the following steps:

- Select the "INDEX(MATCH)" sheet
- Scroll down to row 150 and select only the filled cells in that row
- Drag the "fill" rectangle in the bottom right corner of the selected cells down at least 100 rows

From the CPU Profile, it appears that most of the CPU time is being spent in various "mdds" functions within ScMatrixImpl::ExecuteBinaryOp().

If I repeat the above steps but if I first delete all of the formula cells in row 150 that contain an INDEX() formula, there is no hang. But, if I undo the cell deletions, the same hang occurs.

I admit that I am not familiar with the code that handles range formulas such as the INDEX() formula so it is not clear to me what is going on during the hang.
Comment 26 Patrick (volunteer) 2024-07-10 19:05:18 UTC
(In reply to Patrick Luby (volunteer) from comment #25)
> I admit that I am not familiar with the code that handles range formulas
> such as the INDEX() formula so it is not clear to me what is going on during
> the hang.

One more piece of data: unchecking the Data > Calculate > Autocalculate menu item before dragging does not hang. But then, selecting the Data > Calculate > Recalculate menu item causes the hang.

So, it appears that the high CPU time is spent calculating each of the INDEX() formulas in the newly filled cells.
Comment 27 Roman Kuznetsov 2024-07-11 16:38:31 UTC
Kohei, may be you have ideas here ?
Comment 28 Noel Grandin 2024-07-12 06:50:25 UTC
So there are at least three different issues mixed up here

(1) Peter reported a memory leak

(2) Peter reported lots of memory being used

(3) Roman reported high CPU usage when autofilling a sheet

I am pretty sure that (1) is fixed on master, probably by one of Patrick's memory leak fixes.

I am pretty sure that (2) still exists - on my macbook, Activity Monitor says that LibreOffice is using 2G, while on Linux, hprof says that Libreoffice is using 160M, which is a very large difference. Possibly they are measuring different things, but maybe still an issue.

I am pretty sure that (3) is still an issue, but I don't think it is that much of an issue - this is a very complex spreadsheet, and autofilling is always going to be expensive.

Patrick - are you aware of any tools to analyse heap usage on macOS? On Linux I would use heaptrack, and it looks like it might be possible to use that on macOS via homebrew, but last time I tried homebrew, it messed up my LibreOffice build.
Comment 29 Patrick (volunteer) 2024-07-12 12:01:33 UTC
(In reply to Noel Grandin from comment #28)
> I am pretty sure that (2) still exists - on my macbook, Activity Monitor
> says that LibreOffice is using 2G, while on Linux, hprof says that
> Libreoffice is using 160M, which is a very large difference. Possibly they
> are measuring different things, but maybe still an issue.
> 
> I am pretty sure that (3) is still an issue, but I don't think it is that
> much of an issue - this is a very complex spreadsheet, and autofilling is
> always going to be expensive.
> 
> Patrick - are you aware of any tools to analyse heap usage on macOS? On
> Linux I would use heaptrack, and it looks like it might be possible to use
> that on macOS via homebrew, but last time I tried homebrew, it messed up my
> LibreOffice build.

So I had only looked at (3) because on my M1 MacBook Pro, the high CPU usage looks like a hang that will last at least several minutes. On my machine, I have not seen Calc finish and return control to the event loop even after several minutes. Attachment #195212 [details] shows the stack with the highest CPU usage and that stack is running in the main thread and 95.7% of the CPU is being used within ScInterpreter::Interpret().

Since you don't see this hang on Linux, maybe this is a bug only on Mac Silicon machines? Or do you see the same hang as I do on Mac Intel?

I used /Applications/Xcode.app/Contents/Applications/Instruments.app (or "open -a Instruments" in a Terminal) to do the CPU Profiling in attachment #195212 [details] but I can also use the Leaks option in the Instruments application to see all the allocations and their stacks (note: Instruments and lldb can connect to a local or nightly build without any special build steps but neither work with an official release). I will do a run with the Leaks later today to see if I can find which code is allocating so much memory and/or if there are any leaks.
Comment 30 Patrick (volunteer) 2024-07-12 13:52:39 UTC
Created attachment 195272 [details]
Allocations before and during hang using Xcode's Instruments application
Comment 31 Patrick (volunteer) 2024-07-12 14:08:45 UTC
(In reply to Patrick Luby (volunteer) from comment #30)
> Created attachment 195272 [details]
> Allocations before and during hang using Xcode's Instruments application

So I ran the Leaks option in the Instruments application connected to my local Mac Silicon build. In attachment #195272 [details] the graph at the top shows net memory allocated. The left have is before I dragged the fill rectangle and it is about 140 MB of memory usage. After dragging, the memory usage jumps to about 200 MB and stays level while ScInterpreter::Interpret() grinds away.

I used Skia/Metal during my runs so memory usage is higher with Skia/Raster but still level after dragging the fill rectangle. So, I don't see any increasing memory usage. Maybe that problem was fixed by one of my recent memory leak fixes?

Anyway, what I did notice that is that in the LibreOffice preferences, Calc has an "Enable multi-threaded execution" preference. I have been running with this preference enabled (i.e. checkbox is checked). So I disabled (unchecked) this preference and reran a CPU Profile in the Instruments app. Interesting, the highest CPU usage stack wass the same as in attachment #195212 [details] and, like that run, ScInterpreter::Interpret() is grinding away in the main thread.

I am not familiar with the internal workings of the Calc formula interpreter so is it unexpected that ScInterpreter::Interpret() does all its work in the main thread when the "Enable multi-threaded execution" preference is enabled?

In other words, if you disable the "Enable multi-threaded execution" preference on Linux or Mac Intel, do anyone else see a hang? If yes, then maybe multi-threaded execution is not occurring on Mac Silicon machines?
So my questio
Comment 32 Patrick (volunteer) 2024-07-12 18:45:44 UTC
OK. I've narrowed things down a little more. After much trial and error, I found that the following debug patch lowers the time it takes to fill 100 rows from ~8 minutes to ~30 seconds on my M1 MacBook Pro.

Of course, this is not a fix. But it does appear that the big bottleneck is in whatever code processes the ScMatrixRef instance that is returned by ScMatrixImpl::CompareMatrix(). So, next step is to see if I can find where the ScMatrixRef processing bottleneck is:

diff --git a/sc/source/core/tool/scmatrix.cxx b/sc/source/core/tool/scmatrix.cxx
index fcc3b290d3d1..a22ec0472aa3 100644
--- a/sc/source/core/tool/scmatrix.cxx
+++ b/sc/source/core/tool/scmatrix.cxx
@@ -2236,7 +2236,9 @@ ScMatrixRef ScMatrixImpl::CompareMatrix(
     if (nSize != rResVal.size())
         return ScMatrixRef();
 
-    return ScMatrixRef(new ScMatrix(aSize.column, aSize.row, rResVal));
+    ScMatrixRef aMatrix(new ScMatrix(aSize.column, aSize.row, rResVal));
+fprintf(stderr, "Printing just to make sure the above matrix isn't optimized away: %p\n", &aMatrix);
+    return ScMatrixRef();
 }
 
 void ScMatrixImpl::GetDoubleArray( std::vector<double>& rArray, bool bEmptyAsZero ) const
Comment 33 Patrick (volunteer) 2024-07-12 22:31:17 UTC
So ignore all of my recent posts. I have been using my local debug build and apparently a debug build's extra assertions/tests/logging is the cause of most of the slowness I have seen.

I downloaded the recently released LibreOffice 24.2.5 as well as the 12 June 2024 nightly master build here are the results I see on my M1 MacBook Pro when I select the INDEX(MATCH) sheet, select the non-empty cells in row 150, and drag the fill rectangle in down 100 rows (i.e. to row 250):

1. Maximum memory usage in Activity Monitor:
   - Skia/Metal: 600 MB
   - Skia/Raster: 350 MB

2. Length of hang immediately after dragging the fill rectangle:
   - ~25 seconds

So my conclusion is the memory issues originally reported in this are now fixed in LibreOffice 24.2.5 and later due to a combination of my recent memory fixes.

Note for @Peter: unfortunately, my largest memory fix (see tdf#159175) is only available in LibreOffice 24.2.5 and later.

So I think the only remaining issue is whether or not ~25 seconds to perform the filling of 100 rows is similar to how long it takes on Mac Intel, Linux, and/or Windows.
Comment 34 Noel Grandin 2024-07-13 08:20:35 UTC
I don't see anything that can obviously be improved in the the CPU usage. This spreadsheet makes extensive usage of matrix operations.
Calc's matrix code is already fairly optimised.
Comment 35 Patrick (volunteer) 2024-07-13 11:09:07 UTC
(In reply to Noel Grandin from comment #34)
> I don't see anything that can obviously be improved in the the CPU usage.
> This spreadsheet makes extensive usage of matrix operations.
> Calc's matrix code is already fairly optimised.

OK.  So last question and then I think we can mark this as fixed: in comment #28 you mentioned seeing 2 GB of memory usage on macOS. Has that gone down to closer to my numbers in comment #33 with LibreOffice 24.2.5 or a recent nightly build?

2 GB sounds very high to me so I just wanted to make sure there isn't a big memory usage usage issue that I am missing.
Comment 36 Noel Grandin 2024-07-13 17:41:59 UTC
(In reply to Patrick Luby (volunteer) from comment #35)
> 
> OK.  So last question and then I think we can mark this as fixed: in comment
> #28 you mentioned seeing 2 GB of memory usage on macOS. Has that gone down
> to closer to my numbers in comment #33 with LibreOffice 24.2.5 or a recent
> nightly build?
> 


Weird, I don't see that anymore, now I am seeing numbers exactly like what you see. Maybe I was looking in the wrong place.
Comment 37 Patrick (volunteer) 2024-07-13 18:01:02 UTC
(In reply to Noel Grandin from comment #36)
> Weird, I don't see that anymore, now I am seeing numbers exactly like what
> you see. Maybe I was looking in the wrong place.

Good to hear that memory usage is down on your machine. IIRC, my fix for tdf#159175 only works when Skia is enabled. When Skia is disabled, memory usage per window is still quite high especially with a Retina display.

Anyway, I'll mark this bug as "resolved/works for me" since the memory fixes are scattered among several different other bugs.

@Peter if you are able to find a machine that can run LibreOffice 24.2.5, can you let us know if you see a big drop in memory usage? Also, let us know if you still see continually increasing memory usage when filling via dragging?