Bug 100244 - CRASH: Pivot table seems to cause massive memory leak
Summary: CRASH: Pivot table seems to cause massive memory leak
Status: ASSIGNED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: high major
Assignee: Dave Gilbert
URL:
Whiteboard: target:26.8.0
Keywords: haveBacktrace, perf
Depends on:
Blocks: Pivot-Table Memory Crash
Reported: 2016-06-06 21:06 UTC by Bob
Modified: 2026-04-09 00:43 UTC (History)
12 users (show)

See Also:
Crash report or crash signature:


Attachments
ods with pivot table that causes memory leak (475.26 KB, application/vnd.oasis.opendocument.spreadsheet)
2016-06-06 21:06 UTC, Bob
Simplified smaller version of the original document (209.92 KB, application/vnd.oasis.opendocument.spreadsheet)
2016-10-09 16:15 UTC, Dennis Francis
bt at random with debug symbols (7.35 KB, text/plain)
2017-08-12 22:55 UTC, Julien Nabet
perf flamegraph with simplified example (66.97 KB, application/x-bzip)
2019-09-24 19:13 UTC, Julien Nabet
Correctness test case (1b) (14.53 KB, application/vnd.oasis.opendocument.spreadsheet)
2026-01-28 21:53 UTC, Dave Gilbert
Correctness test case (1c) (15.11 KB, application/vnd.oasis.opendocument.spreadsheet)
2026-01-28 21:55 UTC, Dave Gilbert

Description Bob 2016-06-06 21:06:46 UTC
Created attachment 125520 [details]
ods with pivot table that causes memory leak

Refreshing the pivot table in the attached file causes LO memory consumption to spike.  Similar results between Windows 10 and Ubuntu 14.04 on the 5.1.3.2 release.

Warning:  Be prepared to kill LO when you refresh this pivot table!  If you have a system with LOTS of memory, it will eventually refresh - for me, resident memory reaches 0.05t with virtual over 52g (on linux "top").

A strange/temporary "fix" is to "Edit Layout.." and remove column "item_num" *prior* to Refreshing pivot table.  OR remove the "Data Fields" where a "Difference From" display option is set.

Note:  The source range is only 8300 rows by 12 columns, so not "huge" by any means.

My guess is that it's some combination of the "Row fields" and the "Difference From" display option that I'm using in the "Data Fields" that is causing the issue.

-Bob
Comment 1 raal 2016-06-07 09:52:33 UTC
Version: 5.2.0.0.alpha1+; win7  crash with "bad allocation" error
Comment 2 Katarina Behrens (Inactive) 2016-06-10 10:39:34 UTC
Is this a regression? i.e. did it ever work reasonably and only got worse in recent releases?
Comment 3 Bob 2016-06-10 20:24:09 UTC
This is the first time I've experienced this problem - and I've never tried using the "Difference From" display option before now.

I did test it using OpenOffice 3.2 on Ubuntu 14.04, and it behaved badly as well.
Comment 4 Julien Nabet 2016-06-19 08:32:28 UTC
On pc Debian x86-64 with master sources updated yesterday, I got a hang.

I noticed a lot of these before the hang:
warn:sc:4374:1:sc/source/core/data/dptabsrc.cxx:2696: ScDPMember::GetItemData: what data? nDim 15, mnDataId 2830
Comment 5 Dennis Francis 2016-10-09 16:13:43 UTC
I am working on this one and have some updates.

I have created a simpler version of the original document with fewer rows (attached). Refreshing its pivot table still causes large memory usage, but it fits in my system's RAM (16GB) even when Calc runs under valgrind's memcheck. It uses lots of memory, but memcheck shows no big leaks (the "definitely lost" ones are all in the KB range).

valgrind's massif tool reports that a large number of allocations happen at
ScDPResultDimension::AddMember(ScDPParentDimData const&) (dptabres.cxx:3961)

However I observed that the high memory usage after pivot table refresh does not go away even after I close the document (with calc still running with no other document). After some code hunting I suspect this behavior is due to holding of ScDPSource object inside ScDPObject object via a uno Reference which may be shared with other objects which do not get de-allocated while closing the document having pivot tables. It might only be released after calc is closed. This would explain the valgrind-memcheck's result.
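
To illustrate the suspicion above with a toy model: a reference-counted object is only freed when its last holder lets go, so closing the document would not release the memory if something else still held the source. The sketch below uses std::shared_ptr as a stand-in for the UNO reference; Source, PivotObject and Cache are made-up names, not the real ScDPSource/ScDPObject code, and whether such a second holder actually exists in Calc is exactly the open question.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the large pivot source data.
struct Source
{
    std::vector<char> results = std::vector<char>(1024);
};

// Hypothetical stand-in for the per-document pivot object.
struct PivotObject
{
    std::shared_ptr<Source> source;
};

// Hypothetical long-lived holder that outlives the document.
struct Cache
{
    std::shared_ptr<Source> source;
};

// Returns true if the source is destroyed once the document's pivot
// object is gone, false if another holder keeps it alive.
bool freedOnDocumentClose(bool otherHolderExists)
{
    auto pivot = std::make_unique<PivotObject>();
    pivot->source = std::make_shared<Source>();
    std::weak_ptr<Source> watch = pivot->source; // observes without owning

    Cache cache;
    if (otherHolderExists)
        cache.source = pivot->source; // second reference to the same object

    assert(!watch.expired());
    pivot.reset(); // "close the document"
    return watch.expired();
}
```

With no other holder the function returns true; with the cache holding a second reference it returns false, which is the pattern that would make memcheck report little as "lost" while the process stays large.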

I will next try to trace the reference count changes of ScDPSource object via gdb.

Calc version - latest master built 7th Oct 2016.
OS : Fedora 24 64 bit.
Comment 6 Dennis Francis 2016-10-09 16:15:31 UTC
Created attachment 127895 [details]
Simplified smaller version of the original document
Comment 7 Julien Nabet 2017-06-05 14:30:56 UTC
Dennis: any update here?
Comment 8 Dennis Francis 2017-06-05 14:57:30 UTC
Sorry, I did not proceed further. I will take a look again soon. Till then I am changing the status to NEW and have reset the assignee, so others have a chance to work on this.
Comment 9 Xisco Faulí 2017-06-19 10:02:48 UTC
It crashes for me when I click on cust_name in the pivot table

Version: 6.0.0.0.alpha0+
Build ID: 9e4502f0e393d2bc2810488b3ebb0a5c23038436
CPU threads: 1; OS: Windows 6.1; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-06-16_08:52:00
Locale: es-ES (es_ES); Calc: group

In

Version: 6.0.0.0.alpha0+
Build ID: 08f6f9dded1b142b858c455da03319abac691655
CPU Threads: 4; OS Version: Linux 4.8; UI Render: default; VCL: gtk2; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group

it hangs
Comment 10 Julien Nabet 2017-08-12 22:55:39 UTC
Created attachment 135505 [details]
bt at random with debug symbols

On pc Debian x86-64 with master sources updated yesterday, it hangs when clicking on cust_name with the second attachment.

I attached a bt at random
Comment 11 QA Administrators 2018-09-20 02:51:07 UTC Comment hidden (obsolete)
Comment 12 Roman Kuznetsov 2018-09-20 12:02:31 UTC
still repro CRASH without any crashreport in 

Version: 6.2.0.0.alpha0+
Build ID: ace6bbf3da9ae27aca87865b6be887a3aed341fc
CPU threads: 4; OS: Windows 6.1; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2018-09-20_05:45:56
Locale: ru-RU (ru_RU); Calc: threaded

@Xisco: why isn't it critical?
Comment 13 Xisco Faulí 2018-09-20 18:36:31 UTC
> @Xisco: why isn't it critical?

It's an old one, inherited from OOo times...
Comment 14 Julien Nabet 2019-04-21 08:42:00 UTC
On pc Debian x86-64 with master sources updated today, I gave it a new try with Bob's initial file.

I noticed these logs:
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 0
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 1
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 2
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 3
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 4
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 5
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 6
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 7
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 8
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 9
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 10
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 11
warn:sc.core:19920:19920:sc/source/core/data/dptabsrc.cxx:2615: ScDPMember::GetItemData: what data? nDim 13, mnDataId 12
etc.

+ this bt at random:
#0  0x00007fffddc4f410 in ScDPResultMember::InitFrom(std::__debug::vector<ScDPDimension*, std::allocator<ScDPDimension*> > const&, std::__debug::vector<ScDPLevel*, std::allocator<ScDPLevel*> > const&, unsigned long, ScDPInitState&, bool) (this=0x5556094292b0, ppDim=std::__debug::vector of length 5, capacity 8 = {...}, ppLev=std::__debug::vector of length 5, capacity 8 = {...}, nPos=4, rInitState=..., bInitChild=true)
    at /home/julien/lo/libreoffice/sc/source/core/data/dptabres.cxx:1077
#1  0x00007fffddc556f9 in ScDPResultDimension::InitFrom(std::__debug::vector<ScDPDimension*, std::allocator<ScDPDimension*> > const&, std::__debug::vector<ScDPLevel*, std::allocator<ScDPLevel*> > const&, unsigned long, ScDPInitState&, bool)
    (this=0x5556093d5770, ppDim=std::__debug::vector of length 5, capacity 8 = {...}, ppLev=std::__debug::vector of length 5, capacity 8 = {...}, nPos=3, rInitState=..., bInitChild=true)
    at /home/julien/lo/libreoffice/sc/source/core/data/dptabres.cxx:2846
#2  0x00007fffddc4f444 in ScDPResultMember::InitFrom(std::__debug::vector<ScDPDimension*, std::allocator<ScDPDimension*> > const&, std::__debug::vector<ScDPLevel*, std::allocator<ScDPLevel*> > const&, unsigned long, ScDPInitState&, bool) (this=0x5556093d5660, ppDim=std::__debug::vector of length 5, capacity 8 = {...}, ppLev=std::__debug::vector of length 5, capacity 8 = {...}, nPos=3, rInitState=..., bInitChild=true)
    at /home/julien/lo/libreoffice/sc/source/core/data/dptabres.cxx:1077
#3  0x00007fffddc556f9 in ScDPResultDimension::InitFrom(std::__debug::vector<ScDPDimension*, std::allocator<ScDPDimension*> > const&, std::__debug::vector<ScDPLevel*, std::allocator<ScDPLevel*> > const&, unsigned long, ScDPInitState&, bool)
    (this=0x555605ae9c00, ppDim=std::__debug::vector of length 5, capacity 8 = {...}, ppLev=std::__debug::vector of length 5, capacity 8 = {...}, nPos=2, rInitState=..., bInitChild=true)
    at /home/julien/lo/libreoffice/sc/source/core/data/dptabres.cxx:2846
#4  0x00007fffddc4f444 in ScDPResultMember::InitFrom(std::__debug::vector<ScDPDimension*, std::allocator<ScDPDimension*> > const&, std::__debug::vector<ScDPLevel*, std::allocator<ScDPLevel*> > const&, unsigned long, ScDPInitState&, bool) (this=0x555605ae9af0, ppDim=std::__debug::vector of length 5, capacity 8 = {...}, ppLev=std::__debug::vector of length 5, capacity 8 = {...}, nPos=2, rInitState=..., bInitChild=true)
    at /home/julien/lo/libreoffice/sc/source/core/data/dptabres.cxx:1077
#5  0x00007fffddc556f9 in ScDPResultDimension::InitFrom(std::__debug::vector<ScDPDimension*, std::allocator<ScDPDimension*> > const&, std::__debug::vector<ScDPLevel*, std::allocator<ScDPLevel*> > const&, unsigned long, ScDPInitState&, bool)
    (this=0x5555ec0d1390, ppDim=std::__debug::vector of length 5, capacity 8 = {...}, ppLev=std::__debug::vector of length 5, capacity 8 = {...}, nPos=1, rInitState=..., bInitChild=true)
    at /home/julien/lo/libreoffice/sc/source/core/data/dptabres.cxx:2846
...
etc.
Comment 15 Oliver Brinzing 2019-09-23 17:01:15 UTC
still reproducible with:

Version: 6.4.0.0.alpha0+ (x64)
Build ID: 71ef762f21ada8c25aad2183065478171e985e8c
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: de-DE (de_DE); UI-Language: en-US
Calc: threaded
Comment 16 Xisco Faulí 2019-09-24 07:43:43 UTC
@Julien, is it possible to have a perf chart here ;-) ?
Comment 17 Julien Nabet 2019-09-24 08:29:26 UTC
(In reply to Xisco Faulí from comment #16)
> @Julien, is it possible to have a perf chart here ;-) ?
No pb.
I can also retrieve a Valgrind trace, perhaps (as for the Flamegraph) with an enable-symbols build instead of an enable-dbgutil build.
Comment 18 Julien Nabet 2019-09-24 19:13:25 UTC
Created attachment 154453 [details]
perf flamegraph with simplified example

On pc Debian x86-64 with master sources updated today + enable-symbols, I retrieved a Flamegraph
Comment 19 Xisco Faulí 2019-09-25 11:19:02 UTC
Hi Noel,
I thought you might be interested in this issue...
Comment 20 Roman Kuznetsov 2020-12-25 16:38:41 UTC
the memory leak is still here (over 5Gb) in

Version: 7.2.0.0.alpha0+ (x64)
Build ID: ad9e04321df25824d2288a2ef1f4275f070f1cf7
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: CL

but I didn't get any crashes, possibly because I use 64 bit OS and LO
Comment 21 Roman Kuznetsov 2022-06-19 21:47:44 UTC
(In reply to Roman Kuznetsov from comment #20)
> the memory leak is still here (over 5Gb) in
> 
> Version: 7.2.0.0.alpha0+ (x64)
> Build ID: ad9e04321df25824d2288a2ef1f4275f070f1cf7
> CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render:
> Skia/Raster; VCL: win
> Locale: ru-RU (ru_RU); UI: ru-RU
> Calc: CL
> 
> but I didn't get any crashes, possibly because I use 64 bit OS and LO

Memory leak still here in

Version: 7.5.0.0.alpha0+ / LibreOffice Community
Build ID: e4d23c27288b99c3ed3cfa332ff308b31c01f97d
CPU threads: 4; OS: Linux 5.14; UI render: default; VCL: gtk3
Locale: ru-RU (ru_RU.UTF-8); UI: en-US
Calc: threaded Jumbo
Comment 22 Pavel Kysilka 2024-03-30 10:10:59 UTC
I tested this bug today and the app is consuming about 260MB of RAM.

Version: 24.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 13250e1aa589453534d113442da1ac8a2cbb71b9
CPU threads: 128; OS: Linux 6.1; UI render: default; VCL: gtk3
Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
Calc: threaded
Comment 23 Roman Kuznetsov 2024-03-30 18:26:14 UTC
(In reply to Pavel Kysilka from comment #22)
> I tested this bug today and the app is consuming about 260MB of RAM.
> 
> Version: 24.8.0.0.alpha0+ (X86_64) / LibreOffice Community
> Build ID: 13250e1aa589453534d113442da1ac8a2cbb71b9
> CPU threads: 128; OS: Linux 6.1; UI render: default; VCL: gtk3
> Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
> Calc: threaded

Pavel, did you try just updating the pivot table in the file?

The memory leak is still here (over 11 GB, and my Windows 10 just died instead of just killing the LO process!)

Version: 24.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 2146e66d8df2b7b6a2dd868e886cae76aaf7f48b
CPU threads: 16; OS: Windows 10.0 Build 19045; UI render: Skia/Vulkan; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: CL threaded
Comment 24 Pavel Kysilka 2024-03-30 20:29:37 UTC
Roman, you're right. After refreshing the pivot table, Calc consumes 61GB of RAM.

There is a similar error with pivot tables.

https://bugs.documentfoundation.org/show_bug.cgi?id=126710
Comment 25 Dave Gilbert 2025-12-13 16:59:31 UTC
Hmm yes, I can confirm that on current head (92b013bdb4d567b)
hitting 10.7g virt/5.2g res on a refresh.
It's reasonable at load and only goes crazy at refresh.

I'll have a look.
(At least 10G is not too hard to work with these days...)
Comment 26 Dave Gilbert 2025-12-14 14:21:31 UTC
Note 1: Changing almost anything makes this leak go away for me - e.g. adding or removing a field to the pivot table.

Note 2: I managed to get this to run in memcheck, but it hasn't found much useful,  it doesn't look like most of it is leaked at the end:

==572040== HEAP SUMMARY:
==572040==     in use at exit: 3,187,666 bytes in 26,266 blocks
==572040==   total heap usage: 47,533,605 allocs, 47,507,339 frees, 5,420,734,732 bytes allocated
==572040== 
==572040== LEAK SUMMARY:
==572040==    definitely lost: 3,072 bytes in 9 blocks
==572040==    indirectly lost: 364,654 bytes in 7,549 blocks
==572040==      possibly lost: 32 bytes in 1 blocks
==572040==    still reachable: 2,812,860 bytes in 18,683 blocks
==572040==                       of which reachable via heuristic:
==572040==                         newarray           : 362,368 bytes in 3 blocks
==572040==                         multipleinheritance: 304 bytes in 2 blocks
==572040==         suppressed: 5,120 bytes in 2 blocks
==572040== Rerun with --leak-check=full to see details of leaked memory

Curious that it only shows 5G total heap usage when I saw it go over 10G in 'top'; which I guess suggests that whatever is allocated, it's allocated in some humungous structure which is eventually freed.
Comment 27 Dave Gilbert 2025-12-20 01:56:52 UTC
I've peppered the code with calls to getrusage()
Almost all the memory seems to be from the second call to ScDPResultMember::InitFrom
from ScDPSource::CreateRes_Impl.

InitFrom seems to be hopelessly recursive so I've not figured it out yet, but I can see that
there are a total of over 21M calls to InitFrom which is ludicrously higher than either the number of input or output cells.
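
A minimal sketch of that kind of instrumentation, assuming Linux (where ru_maxrss is reported in kilobytes); peakRssKb and peakGrowthKb are hypothetical helper names, not anything in the LibreOffice tree:

```cpp
#include <cassert>
#include <sys/resource.h>

// Peak resident set size of this process so far.
// On Linux ru_maxrss is in kilobytes.
long peakRssKb()
{
    struct rusage ru {};
    const int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc; // silence unused warning when asserts are compiled out
    return ru.ru_maxrss;
}

// Runs `work` and returns how much the peak RSS grew while it ran.
// The peak never decreases, so the result is >= 0.
template <typename Fn>
long peakGrowthKb(Fn&& work)
{
    const long before = peakRssKb();
    work();
    return peakRssKb() - before;
}
```

Wrapping a suspect call, e.g. peakGrowthKb([&] { suspectCall(); }), gives a rough per-call figure. Because it only sees the high-water mark, memory allocated below an earlier peak will not show up.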
Comment 28 Dave Gilbert 2025-12-21 02:36:24 UTC
I think I understand what's happening, although not yet what to do about it:

The row fields are cust_name, sa_item1, item_num and desc_33 - so in theory you can generate one row for each combination of different values for each of those; although combinations that don't exist are hidden.

Now, there are 158 different cust_names, 9 sa_item1's, 129 item_nums and 130 desc_33's;
158*9*129*130=23.8M potential different rows. (I counted 21M calls to InitFrom, so I'm off a little, but in the ballpark).

Now, the original report said removing the column 'item_num' would avoid the problem; well, that would divide the number of potential rows by 129; so meh, only about 185000 rows - so that's why that causes no problems.
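
The arithmetic above, written out as a compile-time check (potentialRows is a made-up helper for illustration; the counts are the ones quoted in this comment):

```cpp
#include <cstdint>
#include <initializer_list>

// Number of potential result rows if every combination of row-field
// values gets its own row.
constexpr std::uint64_t potentialRows(std::initializer_list<std::uint64_t> distinctCounts)
{
    std::uint64_t n = 1;
    for (std::uint64_t c : distinctCounts)
        n *= c;
    return n;
}

// cust_name, sa_item1, item_num, desc_33
constexpr std::uint64_t kAllRowFields = potentialRows({158, 9, 129, 130});

// Same layout with item_num removed.
constexpr std::uint64_t kWithoutItemNum = potentialRows({158, 9, 130});

static_assert(kAllRowFields == 23'846'940, "about 23.8M potential rows");
static_assert(kWithoutItemNum == 184'860, "about 185000 potential rows");
static_assert(kAllRowFields / kWithoutItemNum == 129, "item_num is a factor of 129");
```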

There is however a 'Late init' which tries to avoid this - by only allocating non-zero rows; however it's turned off in any case where one output column references another, and that's what the 'difference from' field is doing:

See:
   void ScDPSource::CreateRes_Impl()
                //  need fully initialized results to find reference values 
                //  (both in column or row dimensions), so updated values or
                //  differences to 0 can be displayed even for empty results.
                bLateInit = false;

So that explains the why;  the question is whether there's a way to turn late init back on in some reference cases.
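
To make the difference concrete, here is a toy model of the two strategies (a sketch only: Node, buildEager and buildLazy are made-up names and this is not how dptabres.cxx is structured). Eager init allocates a node for every combination of member indices; late init only allocates along paths that actually occur in the source rows.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Toy stand-in for a result tree node; not the real ScDPResultMember.
struct Node
{
    std::map<int, std::unique_ptr<Node>> children;
};

// Eager: allocate a node for every combination of member indices.
// sizes[d] is the number of distinct values in row field d.
// Returns the number of nodes created.
std::size_t buildEager(Node& node, const std::vector<int>& sizes, std::size_t depth = 0)
{
    assert(depth <= sizes.size());
    if (depth == sizes.size())
        return 0;
    std::size_t created = 0;
    for (int i = 0; i < sizes[depth]; ++i)
    {
        auto& child = node.children[i];
        child = std::make_unique<Node>();
        created += 1 + buildEager(*child, sizes, depth + 1);
    }
    return created;
}

// Lazy: allocate nodes only along paths that occur in the source rows.
// Each row is the list of member indices, one per row field.
// Returns the number of nodes created.
std::size_t buildLazy(Node& root, const std::vector<std::vector<int>>& rows)
{
    std::size_t created = 0;
    for (const auto& row : rows)
    {
        Node* cur = &root;
        for (int key : row)
        {
            auto& child = cur->children[key];
            if (!child)
            {
                child = std::make_unique<Node>();
                ++created;
            }
            cur = child.get();
        }
    }
    return created;
}
```

For three fields with 3, 4 and 5 distinct values, the eager build creates 3 + 12 + 60 = 75 nodes regardless of the data, while three source rows need at most 9 nodes in the lazy build. The same shape, scaled to 158 x 9 x 129 x 130, is what turns 8300 source rows into millions of nodes.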
Comment 29 Dave Gilbert 2025-12-23 01:51:00 UTC
I'm leaning towards just always turning lazyinit on, even with references.

My reading of the spec is that there are two types of reference:
  a) Within a result set - e.g. difference from another specified field or indirected via another field, but still within the same result line
  b) 'previous' or 'next' for running differences

In (a) the fact we're calculating the field seems to mean that this result is live anyway
In (b) the spec explicitly says that it must skip 'hidden' result lines when finding the previous/next one

So, I don't see any case where it would really use a non-allocated result set.

Just removing that 'bLateInit = false' apparently works for both this and for tdf#126710 - and make check still passes for me.

That 'bLateInit = false' looks like it originally came from 35cdea5a6f1d9 in 2004; but the commit message doesn't offer any clues.

IMHO LateInit is a necessity otherwise we're bound to hit this allocation issue - so forcing it and then watching for the fallout might be the right way forward; not that I've found any fallout yet.

I think before submitting that as a fix, I'd like to understand it a little more though; in particular the 'what data?' warnings that Julien mentioned in comment#14.
Comment 30 Dave Gilbert 2025-12-24 02:26:52 UTC
Hmm,  removing the LateInit mostly works but I do have a testfile which does change the output; and the current output matches MS 365/OneDrive.
So I think it needs some more subtlety.
I suspect it's something like the result columns always needing to be cleared, but will investigate.
Comment 31 Dave Gilbert 2026-01-28 21:53:58 UTC
Created attachment 205241 [details]
Correctness test case (1b)

This is a test case which changes if I just turn the lazy allocation on for the pivot table;   but it (mostly) seems to match what MS does.
Comment 32 Dave Gilbert 2026-01-28 21:55:37 UTC
Created attachment 205242 [details]
Correctness test case (1c)

Another test case which changes with lazy allocation.
In particular do a refresh and watch cell D12 change from 23.53%.
Now, if I understood why it was 23.53% I'd be most of the way there.
Comment 33 Dave Gilbert 2026-03-19 01:26:33 UTC
I've got a set that shaves some memory off this, but not enough.

https://gerrit.libreoffice.org/c/core/+/202047

(2 are WIP which are the bulk)
They shave ~12% peak RAM usage off - which is about 650MB on that test, which is nice and all that; but doesn't really help this memory explosion (or 126710's ) until much more disappears.
Still got some ideas.

One thing I tried which didn't help at all was replacing the ScDPResultDimension::MemberHash by a flat_segment_tree - it was bigger.
Comment 34 Dave Gilbert 2026-03-19 16:13:06 UTC
New version posted, now the saving is about 30%, or 1.8G in the test in this bug.
Which is a big improvement, but there's still depressingly about 60%+ to get rid of!
Comment 35 Dave Gilbert 2026-03-23 00:17:41 UTC
With the latest version, where I replace the maMemberHash by an unordered_map rather than a map, I think I'm at about 37% saving.

So time to figure out where the current memory is:

  (In ScDPResultDimension)
  Removing maMemberHash gets size down to
     Maximum resident set size (kbytes): 2348988
     so ~970MB or 30% !

  Duplicating maMemberArray is touchy, but size goes up to
       Maximum resident set size (kbytes): 3666544
    so extra ~340MB or 10%

I can account for about 90ish%:
  10% is the amount without doing a refresh - so base
  42% is the cumulative size of the ~21M ScDPResultMembers at 72bytes each
  30% is maMemberHash in ScDPResultDimension - even after the use of unordered_map
       (Measured by removing it)
  10% is the maMemberArray in ScDPResultDimension (measured by duplicating it)

  ----
  92%

Hmm, reductions in size of ScDPResultMembers is getting harder; I've got one or two ideas; but nothing too obvious now.
maMemberHash feels like there should be a better way; it's just a performance efficiency thing above searching maMemberArray linearly.
Comment 36 Noel Grandin 2026-03-23 06:14:03 UTC
(In reply to Dave Gilbert from comment #35)
>   30% is maMemberHash in ScDPResultDimension - even after the use of
> unordered_map
>        (Measured by removing it)
>   10% is the maMemberArray in ScDPResultDimension (measured by duplicating
> it)
> 

Nice work! 

Unless you need the original order of members in maMemberArray, consider replacing both maMemberArray and maMemberHash with o3tl::sorted_vector.
That will get you fast searching and compact storage.
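
A rough sketch of the idea using only the standard library (a sorted std::vector plus std::lower_bound as a stand-in; this is not the o3tl::sorted_vector API, and Member/SortedMembers are made-up names):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical member record keyed by an integer id.
struct Member
{
    int dataId;
};

// Minimal sorted-vector lookup table: contiguous storage, O(log n) find,
// no per-element node or bucket overhead.
class SortedMembers
{
public:
    // Inserts m, keeping the vector ordered by dataId.
    // Returns false if a member with the same id is already present.
    bool insert(const Member& m)
    {
        auto it = std::lower_bound(items_.begin(), items_.end(), m.dataId, idLess);
        if (it != items_.end() && it->dataId == m.dataId)
            return false;
        items_.insert(it, m);
        assert(std::is_sorted(items_.begin(), items_.end(),
                              [](const Member& a, const Member& b) { return a.dataId < b.dataId; }));
        return true;
    }

    // Returns the member with the given id, or nullptr if absent.
    const Member* find(int id) const
    {
        auto it = std::lower_bound(items_.begin(), items_.end(), id, idLess);
        return (it != items_.end() && it->dataId == id) ? &*it : nullptr;
    }

private:
    // Comparator for lower_bound: element on the left, key on the right.
    static bool idLess(const Member& m, int id) { return m.dataId < id; }

    std::vector<Member> items_;
};
```

The trade-off is that insertion in the middle is O(n) and the original insertion order is lost, which is why this only fits if nothing depends on the order in which the array was built.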
Comment 37 Dave Gilbert 2026-03-23 16:20:46 UTC
(In reply to Noel Grandin from comment #36)
> (In reply to Dave Gilbert from comment #35)
> >   30% is maMemberHash in ScDPResultDimension - even after the use of
> > unordered_map
> >        (Measured by removing it)
> >   10% is the maMemberArray in ScDPResultDimension (measured by duplicating
> > it)
> > 
> 
> Nice work! 
> 
> Unless you need the original order of members in maMemberArray, consider
> replacing both maMemberArray and maMemberHash with o3tl::sorted_vector.
> That will get you fast searching and compact storage.

Thanks for the tip!
I've not entirely got my head around the different cases for the maMemberArray and maMemberHash.
There are at least two different things they might be indexed by - one I think is
probably a simple row/col in the source maybe; the other is a sort-order for example if you sort by value; I think.
So I think there are some combinations when maMemberArray is indexed by something that's not the same key as maMemberHash.
Then there's the way they're built; there's a lazy build mode which creates nodes as needed - and that's the normal case which doesn't explode RAM, because most nodes are actually empty. It's the case where one part of the pivot table references another where it gives up on that and builds them all at the start (and that's the case which explodes the RAM); I *think* this mode is the one where the order in which maMemberArray is built is relevant.
I think.
There's so many damn flags/combinations/hidden menus/etc that I'm still getting my head around it.
Comment 38 Commit Notification 2026-03-30 20:15:52 UTC
Dr. David Alan Gilbert committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/004d1815d8d4a6abbfcdc4e5ecb0c7f6cf296d42

tdf#100244 sc: Reorder fields in ScDPResultMember to avoid pad

It will be available in 26.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
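
For readers unfamiliar with the "avoid pad" in the commit title above: the compiler inserts alignment padding between fields, so the declaration order can change the size of a struct. A generic illustration (these two structs are invented for the example and are not the actual ScDPResultMember layout):

```cpp
#include <cstdint>

// Small fields interleaved with larger ones: each bool is followed by
// padding so that the next field is correctly aligned.
struct Padded
{
    bool a;
    void* p;
    bool b;
    std::int32_t n;
};

// Same fields, largest alignment first: the small fields pack together.
struct Reordered
{
    void* p;
    std::int32_t n;
    bool a;
    bool b;
};

// On a typical 64-bit ABI this is 24 bytes versus 16 bytes.
static_assert(sizeof(Reordered) <= sizeof(Padded),
              "reordering must not make the struct larger");
```

A few bytes per object is negligible in most code, but it adds up when the object is allocated many millions of times.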
Comment 39 Commit Notification 2026-03-30 20:16:55 UTC
Dr. David Alan Gilbert committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/2280bb5e2a4b4e01990f46e961b28f3416dba16d

tdf#100244 sc: Only allocate ColTotal when needed in pivot

It will be available in 26.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 40 Commit Notification 2026-03-30 20:17:58 UTC
Dr. David Alan Gilbert committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/01e12689322592e405968d9363d2cd0e5e25a2b3

tdf#100244 sc:Use std::unordered_map for MemberHash

It will be available in 26.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 41 Commit Notification 2026-03-30 20:19:01 UTC
Dr. David Alan Gilbert committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/4da55aa2082e9955aca1c7b51ba1ec8aceeb1aae

tdf#100244: sc:For non-lazy init ignore hash

It will be available in 26.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 42 Dave Gilbert 2026-03-30 23:48:40 UTC
A note on something that didn't work out;
ScDPResultMember is currently 72 bytes, and we allocate many millions of them; one part of it is an instance of ScDPParentDimData, which has the members:

mpParentDim, mpParentLevel, mpMemberDesc, mnOrder

Members have an index for their level, and levels have an index for their dimension; so in theory it feels like we should be able just to store the mpMemberDesc.
The gotcha is that ScDPResultMember::FillMemberResults, which fills the MemberDesc in, reads the Level pointer before it has set up the MemberDesc pointer; and similarly the code that sets up the Level reads the Parent - so while we don't need them at the end, we do need them for the setup code.
Comment 43 Dave Gilbert 2026-04-09 00:43:18 UTC
https://gerrit.libreoffice.org/c/core/+/203342  is a messy WIP branch that avoids the memory overhead of the maMemberArray by allocating it all in a single block in the non-lazy case.
It saves about another 5-10% and this test is down to about 2.1G peak.
tdf#126710's is now down to *only* about 70G - hmm.

I'm pretty sure most of the rest is just the ScDPResultMember's; so we need a way to either compress them, be more efficient or find cases where we just don't need to store them.
The tricky bit is then what to do if we find we do need to store it.

(Note this branch I list above is crazy messy, I'll tidy it up)
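
As a rough illustration of why one block can be cheaper than many small allocations (a generic sketch, not the code in that branch; Member and the helper names are made up): an array of owning pointers pays for the pointer array plus one heap allocation per member, each with its own allocator bookkeeping, while a by-value block pays for the members only.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical small record; stands in for whatever the array holds.
struct Member
{
    void* desc;
    long order;
};

// Layout A: pointer array, one separate heap allocation per member.
using PerMemberArray = std::vector<std::unique_ptr<Member>>;

// Layout B: all members in one contiguous block.
using BlockArray = std::vector<Member>;

// Payload bytes for n members in layout A, ignoring the allocator's
// per-allocation bookkeeping (which only makes A worse).
constexpr std::size_t perMemberBytes(std::size_t n)
{
    return n * (sizeof(std::unique_ptr<Member>) + sizeof(Member));
}

// Payload bytes for n members in layout B.
constexpr std::size_t blockBytes(std::size_t n)
{
    return n * sizeof(Member);
}

static_assert(blockBytes(1000) < perMemberBytes(1000),
              "the block layout saves at least one pointer per member");
```

The block layout only works when the number of members is known up front, which matches the non-lazy case; in the lazy case nodes appear one at a time.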