Bug 81501 - SORT: Large memory footprint while sorting a large table
Summary: SORT: Large memory footprint while sorting a large table
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.2.6.2 release
Hardware: Other All
: high major
Assignee: supremearyal
URL:
Whiteboard: BSA target:4.4.0.0.beta3 target:4.3.5...
Keywords: bibisected, haveBacktrace, regression
Depends on:
Blocks: Sorting
  Show dependency treegraph
 
Reported: 2014-07-18 16:10 UTC by Clément Lassieur
Modified: 2017-07-25 01:57 UTC (History)
8 users (show)

See Also:
Crash report or crash signature:


Attachments
Valgrind Massif output (1.31 MB, text/x-log)
2014-07-18 16:10 UTC, Clément Lassieur
Details
large file that contains random numbers (100000 rows, 12 columns) (805.06 KB, application/vnd.oasis.opendocument.spreadsheet)
2014-07-18 16:12 UTC, Clément Lassieur
Details
linux backtrace 4.4 (19.25 KB, text/plain)
2014-09-13 12:40 UTC, Yousuf Philips (jay) (retired)
Details
Patch to fix memory use issue while sorting in Calc (631 bytes, patch)
2014-12-03 06:12 UTC, supremearyal
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Clément Lassieur 2014-07-18 16:10:41 UTC
Created attachment 103049 [details]
Valgrind Massif output

Problem description:

Sorting a document - that contains 100000 rows and 12 columns - according to the first column causes a 4.8G memory leak.

Steps to reproduce:

1. Create a document that contains 100000 rows and 12 columns, say with random numbers.
2. Select the whole sheet.
3. Menu → Sort → choose “Column A” as “Key 1”.

Current behavior:

According to “smem -k”, soffice.bin uses 4.8G of memory after sorting is done.

That memory isn’t released when the document is closed.

I ran Valgrind Massif (see attachment), which says that the leak is caused, among others, by that line, which contains a “new”:

table3.cxx:339 “mpRows->push_back(new Row(nColSize))”

(Note that I used massif on a different file, thus the peak isn’t 4.8G.)

There is no leak with libreoffice 3.6.5.2.

Operating System: Linux (Other)
Version: 4.4.0.0.alpha0+ Master
Comment 1 Clément Lassieur 2014-07-18 16:12:50 UTC
Created attachment 103050 [details]
large file that contains random numbers (100000 rows, 12 columns)
Comment 2 Yousuf Philips (jay) (retired) 2014-07-19 03:17:47 UTC
Dear Clement,

Thank you for reporting the bug. Please can you paste the version of LibreOffice that your using, as i have reported a similar bug that goes back to 3.6 regarding sorting and that has been fixed with the latest 4.4 version.
Comment 3 Clément Lassieur 2014-07-19 09:39:43 UTC
Dear Jay,

The version is 4.4.0.0.alpha0+.
The build ID is c6a7c5ee5abb867edbdd5a35f1f6c06cee8cab0b.

I built it yesterday.
Comment 4 Yousuf Philips (jay) (retired) 2014-07-19 15:16:58 UTC
Hi Clement,

I selected columns A through L, as well as Ctrl + A, and noticed that LibO was slow to open the menu as well as the dialog box, but once i told it to sort, it worked fine.

Version: 4.4.0.0.alpha0+
Build ID: b9dca968c6fd0ab5ca140c65b0e54d153cd34986
TinderBox: Linux-rpm_deb-x86@45-TDF, Branch:master, Time: 2014-07-19_00:10:17

If i test the build i used when i reported bug 81351, it crashes with Ctrl + A but not selecting columns A through L.

Version: 4.4.0.0.alpha0+
Build ID: 3fdd4f069d5436cf39708004af7fda8175fbc4c2
TinderBox: Linux-rpm_deb-x86@45-TDF, Branch:master, Time: 2014-07-09_02:52:05

This is why i had asked that you test the latest build, meaning, official build. Please test the latest official build and see what the results are on your end.
Comment 5 Jean-Baptiste Faure 2014-07-19 21:04:35 UTC
In case you did not check already: no problem in version 4.3.1.0.0+ if I select only the 12 columns. The sort is done in less than 1 second.
If I select the whole sheet (ctrl+A) LibreOffice eat 2.5 GB of RAM (on 4 GB) and the PC becomes unresponsive (big use of the swap: ~ 3.8 GB).
Not patient enough to wait if LO is able to complete the sort.

Tested under Ubuntu 14.04 x86-64

Best regards. JBF
Comment 6 Yousuf Philips (jay) (retired) 2014-07-20 07:33:09 UTC
Am i to assume that 4.4 worked fine for you and you are now testing 4.3?
Comment 7 Jean-Baptiste Faure 2014-07-20 08:12:15 UTC
(In reply to comment #6)
> Am i to assume that 4.4 worked fine for you and you are now testing 4.3?

No, didn't test 4.4. Currently I am working on 4.3 and 4.2 only.

Best regards. JBF
Comment 8 Clément Lassieur 2014-07-21 13:46:14 UTC
Hi all,

I tried with the latest official build:
 Version: 4.4.0.0.alpha0+
 Build ID: a82ff18269e5b37348d402b7c21c3f200068265c
 TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:master, Time: 2014-07-20_02:34:22

Indeed, everything is fine if I select only columns A through L, but it crashes if I do Ctrl + A. I believe it’s the OOM killer.
Comment 9 raal 2014-09-13 06:18:32 UTC
I can reproduce crash with LO 4.3.1, win7 when sort on all cells (ctrl+a).
I can reproduce crash with Version: 4.4.0.0.alpha0+
Build ID: e2723d00b77dc1044e2ba599ba93517af34e1ea5
TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:master, Time: 2014-09-09_23:17:41

Program terminated with signal SIGKILL, Killed.
Comment 10 Yousuf Philips (jay) (retired) 2014-09-13 12:39:49 UTC
Tested and it doesnt crash on 4.0.6 and 4.1.6, but does crash in 4.2.7, 4.3.2 and 4.4.

Version: 4.2.7.0.0+
Build ID: 6db17c300e59884e652125a0b3b11bdffdea38e2
TinderBox: Linux-rpm_deb-x86@45-TDF, Branch:libreoffice-4-2, Time: 2014-09-11_14:25:14

Version: 4.3.3.0.0+
Build ID: f3cdb7804a26c5cf0623d9b83130594f83372768
TinderBox: Linux-rpm_deb-x86@45-TDF, Branch:libreoffice-4-3, Time: 2014-09-12_04:11:44

Version: 4.4.0.0.alpha0+
Build ID: ed95e1c5619e2cb2a8f6d93a1b7c45f36f1524dd
TinderBox: Linux-rpm_deb-x86@45-TDF, Branch:master, Time: 2014-09-12_11:30:45
Comment 11 Yousuf Philips (jay) (retired) 2014-09-13 12:40:15 UTC
Created attachment 106212 [details]
linux backtrace 4.4
Comment 12 Yousuf Philips (jay) (retired) 2014-09-13 14:54:47 UTC
Confirmed crash on 4.3.3 on Windows.
Comment 13 Jacques Guilleron 2014-09-13 22:03:59 UTC
Hi all,

I tested sort on the whole sheet.
Last version where this work:
LO 4.2.4.2 Build ID: 63150712c6d317d27ce2db16eb94c2f3d7b699f8
Fist version where this crash:
LO 4.2.5.1 Build ID: 881bb88abfe2992c6cede97c23e64a9885de87de
On Windows 7 Home Premium.

regards,

Jacques
Comment 14 raal 2014-09-14 06:49:51 UTC
e02439a3d6297a1f5334fa558ddec5ef4212c574 is the first bad commit
commit e02439a3d6297a1f5334fa558ddec5ef4212c574
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Thu Oct 17 10:15:34 2013 +0000

    source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
    
    commit 6b8393474974d2af7a2cb3c47b3d5c081b550bdb
    Author:     Luboš Luňák <l.lunak@suse.cz>
    AuthorDate: Tue Jun 4 13:24:59 2013 +0200
    Commit:     Luboš Luňák <l.lunak@suse.cz>
    CommitDate: Wed Jun 5 16:01:43 2013 +0200
    
        fix gcc inline assembler operands usage
    
        Apparently whoever did these didn't get the gcc docs and specified
        every operand only as input, and then added volatile, explicit
        initialization and what not until it worked. Specify output operands
        correctly instead.
        I couldn't verify all assembler variants, as I don't know them,
        but the ones I don't know had at least some proper usage of output
        operands, so I'll assume those are all correct.
    
        Change-Id: I2910308b5e00cce8db756496df50ed26cfe35bb6

:100644 100644 49c023e2d36621396be9608a0a64bb670ac44e56 0057ddf8fa7aac0cc36b2b4cc3d47cb3b222e3de M	ccache.log
:100644 100644 1f724a450a2ba6123dc2772c4779157757705897 072f6720e641bd0f70bebcb3c47cfc6a55159e50 M	commitmsg
:100644 100644 058e9f056380574e4073b082180e87b9646d4e8b ecff2407bcf7e1bbf59642b9e021df82e44f526c M	dev-install.log
:100644 100644 115a24f6a9e45b293b22377dc409001a05069581 2b2cbad4f516bc1d6abed9e776c86ea90e3e757f M	make.log
:040000 040000 2d7a96e6b40994d4cff6f226ff2d54947490dabc 4ca7969f1b45eb47375749fa2b77e239eeb9e77b M	opt

git bisect log
# bad: [423a84c4f7068853974887d98442bc2a2d0cc91b] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e
# good: [65fd30f5cb4cdd37995a33420ed8273c0a29bf00] source-hash-d6cde02dbce8c28c6af836e2dc1120f8a6ef9932
git bisect start 'latest' 'oldest'
# bad: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
git bisect bad e02439a3d6297a1f5334fa558ddec5ef4212c574
# good: [8f4aeaad2f65d656328a451154142bb82efa4327] source-hash-1885266f274575327cdeee9852945a3e91f32f15
git bisect good 8f4aeaad2f65d656328a451154142bb82efa4327
# good: [9995fae0d8a24ce31bcb5e9cd0459b69cfbf7a02] source-hash-8600bc24bbc9029e92bea6102bff2921bc10b33e
git bisect good 9995fae0d8a24ce31bcb5e9cd0459b69cfbf7a02
# good: [8ad82bc1416a07501651e8d96fe268e47d3931d3] source-hash-13821254f88d2c5488fba9fe6393dcf4ae810db4
git bisect good 8ad82bc1416a07501651e8d96fe268e47d3931d3
# good: [d084d250b04446535ca1d7c29cf2062e6bd042b3] source-hash-688f72e3a2c3ef923389bbd21f6aea3afe1114db
git bisect good d084d250b04446535ca1d7c29cf2062e6bd042b3
# good: [c2069a369d738078124812312d51f21ea1ce2421] source-hash-f160e4935c474a5293b3d3c11b3d538efb4767a0
git bisect good c2069a369d738078124812312d51f21ea1ce2421
# good: [a0f20bc04a32a7791ba765d2de2f44f1b74033d1] source-hash-1de66ba440855050a794b3b2a8647c1b02c210b8
git bisect good a0f20bc04a32a7791ba765d2de2f44f1b74033d1
# good: [a48fbf799e4d4d555fe383b7233c804f573eca4e] source-hash-bb6ecd8b40313b7cc83d4e619029f4e001334a52
git bisect good a48fbf799e4d4d555fe383b7233c804f573eca4e
# good: [e7cbc9b8764280fd4799e234ca32925e91547a82] source-hash-31b35ed6bb7fe77f3f276b00fefce112a620b6ac
git bisect good e7cbc9b8764280fd4799e234ca32925e91547a82
# first bad commit: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
Comment 15 Kohei Yoshida 2014-09-19 13:47:56 UTC
Valgrind massif is a heap profiler and is not a memory leak detector.  You need to use valgrind --tool=memcheck --leak-check=full to detect memory leak.

What you describe is more accurately referred to as "space leak", which is not the same as memory leak.

c.f. http://valgrind.org/docs/manual/ms-manual.html
Comment 16 Clément Lassieur 2014-10-28 13:02:14 UTC
I can’t reproduce the bug with [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb.

I did the bibisection again (I had to disable some unit tests to build f4a0757).

f4a075728f62f0083a4bffd40d3c02265082d962 is the first bad commit
commit f4a075728f62f0083a4bffd40d3c02265082d962
Author: Kohei Yoshida <kohei.yoshida@collabora.com>
Date:   Fri Apr 18 00:29:55 2014 -0400

    Avoid using SwapRow() when sorting.
    
    Instead, build a data table prior to sorting, swap rows in this data table
    as we sort, then transfer the results back to the document in one step.  This
    significantly speeds up the sort performance.
    
    Formula cells are yet to be handled. I'll work on that next.
    
    Change-Id: I59bde1a243dc8940411d1a33eef147671b060cd0

:040000 040000 d727e2484d51033b18dcf7e08e0faa04dd71f709 45bc5e7ae06b375211edbe48e8227158f5e510a8 M	sc

Note that I can’t reproduce the bug with it’s parent 2105070.
Comment 17 supremearyal 2014-12-03 06:12:54 UTC
Created attachment 110391 [details]
Patch to fix memory use issue while sorting in Calc

Hey guys,

I'm new to Libreoffice development, but I thought I might give a stab at this bug. I tracked the bug to Calc trying to sort extraneous columns (1024 instead of the 12). I've attached a patch which fixes the issue for me. Let me know if you need more info.
Comment 18 Winfried Donkers 2014-12-03 06:47:44 UTC
(In reply to supremearyal from comment #17)
> Created attachment 110391 [details] [review]
> Patch to fix memory use issue while sorting in Calc

Hi supremearyal,

May your first fix for LibreOffice be the first of many :-)
I have taken the liberty to make you assignee for this bug.

I suggest you submit your patch via gerrit:
https://wiki.documentfoundation.org/Development/gerrit/SubmitPatch

This way it will not be overlooked, automated build tests for linux/max/windows can be started there and comments can be placed in the code (if need be).
It will be best to add Eike Rathke and Kohei Yoshida as reviewers in gerrit, as they are the Calc sort experts.

O, and as this is your first patch, could you please submit a license statement (see same link above)?
Comment 19 Commit Notification 2014-12-05 16:57:23 UTC
Supreme Aryal committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c50d8bf596f527bb53b86dd04da652ed72f2c549

Fix high memory usage when sorting cells in Calc. (fdo#81501)

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 20 Eike Rathke 2014-12-05 17:12:04 UTC
(In reply to Clément Lassieur from comment #0)
> I ran Valgrind Massif (see attachment), which says that the leak is caused,
> among others, by that line, which contains a “new”:
> 
> table3.cxx:339 “mpRows->push_back(new Row(nColSize))”

This is odd, because the ScSortInfoArray dtor has
std::for_each(mpRows->begin(), mpRows->end(), boost::checked_deleter<Row>())
Comment 21 Commit Notification 2014-12-05 18:07:05 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1e4235f8b2dc693b0fb1edade9db25a631bdbf94

actually use identical code for both byRow and byCol, fdo#81501 follow-up

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 22 Commit Notification 2014-12-05 18:07:09 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c7b57c6c2b15a4b9f65dafb642e3f1da402e62ca

trim also empty leading column ranges, fdo#81501 follow-up

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 23 Commit Notification 2014-12-05 19:01:59 UTC
Supreme Aryal committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=bc3755e041a6cdb678fb8c8093beb6b3a8e3f0e0&h=libreoffice-4-4

Fix high memory usage when sorting cells in Calc. (fdo#81501)

It will be available in 4.4.0.0.beta3.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 24 Commit Notification 2014-12-05 19:02:02 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=ded77b3983f1ccf7f7b611c59da9f2156ea7a0df&h=libreoffice-4-4

actually use identical code for both byRow and byCol, fdo#81501 follow-up

It will be available in 4.4.0.0.beta3.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 25 Commit Notification 2014-12-05 19:02:06 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=ba410bdb74e2f1fcc011934380890fc7fa872777&h=libreoffice-4-4

trim also empty leading column ranges, fdo#81501 follow-up

It will be available in 4.4.0.0.beta3.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 26 Jean-Baptiste Faure 2014-12-06 12:05:50 UTC
Verified fixed for me in Version: 4.4.0.0.beta2+
Build ID: 8a1a306799da41f1ed456828e8f4f7353c5eeea7 under Ubuntu 14.10 x86-64

Thank you very much.
Best regards. JBF
Comment 27 Commit Notification 2014-12-11 18:59:25 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-3-5":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=7a085f8811c24e4a0d472690f426fcc3ff17e592&h=libreoffice-4-3-5

Ctrl+A and Data Sort took ages to broadcast ALL cells, fdo#81501 related

It will be available in 4.3.5.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 28 Commit Notification 2015-01-14 10:13:34 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-4-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=198c14e95693bd8af3fc36ec5d97d95a2180f85a&h=libreoffice-4-3

Ctrl+A and Data Sort took ages to broadcast ALL cells, fdo#81501 related

It will be available in 4.3.6.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 29 Robinson Tryon (qubit) 2015-12-17 08:26:55 UTC Comment hidden (obsolete)