Bug 133801 - Sorting a column uses 600 MB at peak, with LibO 4.2 90 MB (with autofilter on an empty row)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 4.3 all versions
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Depends on:
Blocks: Memory
Reported: 2020-06-08 18:27 UTC by Telesto
Modified: 2024-02-09 04:18 UTC

See Also:
Crash report or crash signature:


Attachments
Example file (48.07 KB, application/vnd.oasis.opendocument.spreadsheet)
2020-06-08 18:27 UTC, Telesto
Bibisect log (3.54 KB, text/plain)
2020-06-08 18:28 UTC, Telesto
Bibisect log (4.24 KB, text/plain)
2020-06-08 18:37 UTC, Telesto
Example file (30.31 KB, application/vnd.oasis.opendocument.spreadsheet)
2020-06-09 09:36 UTC, Telesto
Screencast (760.53 KB, video/mp4)
2020-06-25 10:17 UTC, Telesto
sorting col F and E with 7.1, win7, no mem peak, (92.03 KB, image/jpeg)
2020-06-25 12:16 UTC, b.

Description Telesto 2020-06-08 18:27:27 UTC
Description:
Sorting a column uses 600 MB of memory at peak; with LibO 4.2 it uses 90 MB.

Steps to Reproduce:
1. Open the attached file
2. Sort column F ascending

Actual Results:
600 MB RAM

Expected Results:
90 MB RAM


Reproducible: Always


User Profile Reset: No



Additional Info:
Found in
Version: 6.4.0.0.alpha1+
Build ID: 9bc848cf0d301aa57eabcffa101a1cf87bad6470
CPU threads: 2; OS: Linux 5.3; UI render: default; VCL: x11; 
Locale: en-US (en_US.UTF-8); UI-Language: en-US
Calc: threaded

but not in
4.2
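
The report does not say which tool produced the 600 MB and 90 MB figures. On Linux, one way to read a comparable peak figure is the VmHWM field (peak resident set size) in /proc/<pid>/status. The following is a minimal sketch under that assumption; `readStatusFieldKb` and `peakResidentKb` are hypothetical helper names, and the number it returns may differ from what a system monitor displays.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: return the value (in kB) of a field such as "VmHWM"
// from text in the format of /proc/<pid>/status, or -1 if it is not found.
long readStatusFieldKb(std::istream& in, const std::string& field)
{
    const std::string prefix = field + ":";
    std::string line;
    while (std::getline(in, line)) {
        // Lines look like "VmHWM:\t  614400 kB".
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = -1;
            rest >> kb;
            return rest ? kb : -1;
        }
    }
    return -1;
}

// Hypothetical helper: peak resident memory of a running process on Linux.
// Returns -1 if the status file cannot be read.
long peakResidentKb(const std::string& pid)
{
    std::ifstream status("/proc/" + pid + "/status");
    return readStatusFieldKb(status, "VmHWM");
}
```

Calling `peakResidentKb` with the process ID of the running office process once before and once after the sort would show whether the peak grew during the sort.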
Comment 1 Telesto 2020-06-08 18:27:46 UTC
Created attachment 161773 [details]
Example file
Comment 2 Telesto 2020-06-08 18:28:45 UTC
Created attachment 161774 [details]
Bibisect log

Bisected to
commit 89c92c5812cbfb70120b588d1f52d3d8dfcacce3
Author: Matthew Francis <mjay.francis@gmail.com>
Date:   Thu May 28 21:17:43 2015 +0800

    source-hash-f4a075728f62f0083a4bffd40d3c02265082d962
    
    commit f4a075728f62f0083a4bffd40d3c02265082d962
    Author:     Kohei Yoshida <kohei.yoshida@collabora.com>
    AuthorDate: Fri Apr 18 00:29:55 2014 -0400
    Commit:     Kohei Yoshida <kohei.yoshida@collabora.com>
    CommitDate: Wed Apr 23 21:08:20 2014 -0400
    
        Avoid using SwapRow() when sorting.
    
        Instead, build a data table prior to sorting, swap rows in this data table
        as we sort, then transfer the results back to the document in one step.  This
        significantly speeds up the sort performance.
    
        Formula cells are yet to be handled. I'll work on that next.
    
        Change-Id: I59bde1a243dc8940411d1a33eef147671b060cd0
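
The approach described in the commit message above (build a data table, sort the table, transfer the result back in one step) can be sketched roughly as follows. This is an illustration only, not the LibreOffice implementation: `Row`, `Sheet` and `sortViaTable` are hypothetical names, and cells are simplified to strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a sheet range: rows of string cells.
using Row = std::vector<std::string>;
using Sheet = std::vector<Row>;

// Sort the rows of `sheet` by the column `keyCol` using a temporary table.
// Throws std::out_of_range if a row has no cell at `keyCol`.
void sortViaTable(Sheet& sheet, std::size_t keyCol)
{
    // Step 1: build a data table holding a copy of every row in the range.
    std::vector<Row> table(sheet.begin(), sheet.end());

    // Step 2: reorder rows inside the table only; the sheet is untouched.
    std::stable_sort(table.begin(), table.end(),
                     [keyCol](const Row& a, const Row& b) {
                         return a.at(keyCol) < b.at(keyCol);
                     });

    // Step 3: transfer the sorted rows back to the sheet in one pass.
    for (std::size_t i = 0; i < sheet.size(); ++i)
        sheet[i] = table[i];
}
```

In this sketch the table holds its own copy of the range while the sort runs, so memory use grows with the size of the range being sorted. Whether that is what drives the peak in this bug is not established by the bisect.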
Comment 3 Telesto 2020-06-08 18:29:59 UTC
The memory usage is also sticky: after closing the document, the memory is still in use.
Comment 4 Telesto 2020-06-08 18:37:33 UTC
Created attachment 161775 [details]
Bibisect log

The first bibisect covers the jump from 89 MB to 430 MB.

This bibisect covers the jump from 430 MB to 580 MB.

commit ef183cd2ffc39d3d0baa53a2ce02763530b86129
Author: Matthew Francis <mjay.francis@gmail.com>
Date:   Thu May 28 21:17:52 2015 +0800

    source-hash-c72f76fcd1107a2e5542b9a43fc535914a210b17
    
    Bibisect: This commit covers the following source commit(s) which failed to build
    607b7ddeeb6c9d380adf67edf4ae7877ff3bdb0c
    bac1e2ddeb438b73556466a3014751c0f4f54960
    daaa3026774ba7b21c2b045c185171bb8fd6e551
    b99c91456491781556f89b9ad3e9c6150e7de3b2
    2e8c0c7076023573728489170e3d9d364aa6130c
    f1047b5e3bd819f76b54900b52d9ca1d2ed305a7
    aa3e2b7ae90c0fdad28dfd097a230e8ab4cb2565
    94cf534a89634290201141a08e19d156bb3b9a19
    832bee9aae88c30d2eea4c8fd0765e4a193cbe7b
    d053d40e86381cc4e7c7249e66530f5f4323b514
    46b35f349ef795d89e4f03fe7f72623ee105e669
    91261a246bd1c69b4b5ee62669e8a854c62bf3da
    68fccdc9872c1bf36b2851d58929d6cdcc2e2b2e
    bd532483b67e4fd2d1f26df545cc525de5522f10
    3f41b12c6685b82b5c2674bd9b9d5991adebeaf9
    09bf5e8093ffafa08cd3e8b22a7a792be70fba7c
    91303658fb33066c57aed04ac31fd3998f11f069
    
    commit c72f76fcd1107a2e5542b9a43fc535914a210b17
    Author:     Kohei Yoshida <kohei.yoshida@collabora.com>
    AuthorDate: Wed Apr 23 15:29:56 2014 -0400
    Commit:     Kohei Yoshida <kohei.yoshida@collabora.com>
    CommitDate: Wed Apr 23 21:08:26 2014 -0400
    
        Set mdds 0.10.3 as the new package requirement.
    
        Change-Id: Ide0e10fa528d53a7e732d00b54c940111beebe19

:040000 040000 794d8214a07008c5d90296cba968e839f7964984 36bbb850acf98abc2b23d305588eca506cca8933 M	opt
Comment 5 Telesto 2020-06-09 09:36:13 UTC
Created attachment 161792 [details]
Example file
Comment 6 b. 2020-06-25 00:39:32 UTC
No repro with the versions below. Is this Linux only?

I found no data in column F, so I tried sorting column E instead, and also sorting the empty column F. Both used less than 10 MB of memory.

Tested with 6.1.6.3 and 7.1.0.0.a0+, both Windows x64.
Comment 7 Telesto 2020-06-25 10:17:35 UTC
Created attachment 162394 [details]
Screencast

This is only about the peak.
Version: 7.1.0.0.alpha0+ (x64)
Build ID: aadcd6f90916bd2b9734ae793141d0c77cc5b46c
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: default; VCL: win
Locale: nl-NL (nl_NL); UI: en-US
Calc: CL
Comment 8 b. 2020-06-25 12:16:58 UTC
Created attachment 162398 [details]
sorting col F and E with 7.1, win7, no mem peak,

Hi @Telesto,

Thanks for the video. I see the behaviour in it, but it is not reproducible here.

See the attached screenshot. Mark 1 is the load of Calc 7.1.0.0.a01+ and mark 2 is the close of the program. In between: loading the file, nearly no memory usage; sorting column F ascending, no visible memory usage; sorting column E ascending, no visible memory usage.

The above was done with:
Version: 7.1.0.0.alpha0+ (x64)
Build ID: a201ab6f47c2d5a7ba4c5f998b0aa231cae82010
CPU threads: 8; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-US
Calc: CL

I selected the whole column and used [data | sort], with undo steps at 100 (the standard setting).

Similar with version 6.4.4.2 on Linux x64 (Debian/Kali 2020.2): no visible impact or peak.

So: I believe the bug shown in your video, but with no repro here it is still 'unconfirmed'.
Comment 9 b. 2020-06-25 12:33:24 UTC
Sorry, on recheck I saw that you used [autofilter | sort].

That indeed has some weaknesses: memory usage here was ~370 MB for a short moment. Not critical, but unnecessary, as [data | sort] proves it can be done better.

Setting to 'new', but of minor importance.
Comment 10 Telesto 2020-06-25 13:53:16 UTC
Hmm
Comment 11 b. 2020-06-26 05:43:38 UTC
I've put some bugs under 'see also'; if someone is working in this area, it might be appropriate to check them. Users, especially me, often have problems when similar functions behave differently depending on how they are reached, especially when these differences are hidden and the user gets no hint. (In everyday use not every user has the time, and in complex tables only a few users have the means, to check the results of an operation in detail.)

In this case, 'sort' behaves differently when it is called via [data - sort] than when it is activated via the dropdown button of an autofilter and then sorting [as|des]cending.

The range to which the sort is applied is probably selected differently. The other day I saw a very nice hint in [data - sort], something like 'there is data in adjacent ranges, do you want to include it?', and users who accidentally shredded their data with [autofilter - sort] would have been much less annoyed if they had received a similar warning there.

Of course, striving for both consistency and compatibility with Excel is a complex usability task, and the two may sometimes conflict. Is there a 'general guideline' for this somewhere?
Comment 12 Xisco Faulí 2020-06-26 06:30:38 UTC
Removing the perf keyword, since this issue is about memory usage and not about performance.
Comment 13 QA Administrators 2022-06-27 03:29:05 UTC Comment hidden (obsolete)
Comment 14 Tex2002ans 2024-02-09 04:18:37 UTC
Retested in:

Version: 24.2.0.3 (X86_64) / LibreOffice Community
Build ID: da48488a73ddd66ea24cf16bbc4f7b9c08e9bea1
CPU threads: 8; OS: Windows 10.0 Build 22631; UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL threaded

- - -

I followed the steps in Comment #7 video exactly:

~330 MB on initial load.
~836 MB peak while sorting.
~330 MB after sorting.