Bug 80853 - Calc freezes while filtering large data
Summary: Calc freezes while filtering large data
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 4.2.5.2 release
Hardware: All; OS: All
Importance: high major
Assignee: Luboš Luňák
URL:
Whiteboard: target:6.3.0
Keywords: bibisected, perf, regression
Depends on:
Blocks: Data-Filter multi_type_vector-regressions
Reported: 2014-07-03 13:26 UTC by Vitaly Bevsky
Modified: 2020-01-02 03:38 UTC

See Also:
Crash report or crash signature:


Attachments
test file (2.75 MB, application/x-7z-compressed)
2014-07-03 13:31 UTC, Vitaly Bevsky
callgrind profiles for all three issues (2.63 MB, application/zip)
2016-04-27 21:28 UTC, Markus Mohrhard

Description Vitaly Bevsky 2014-07-03 13:26:41 UTC
The file has more than 500,000 rows.

Steps to reproduce:
1. Open the file.
2. Open the Standard Filter dialog - Calc freezes for a few minutes.
3. Specify a filter (for example, column 'A' = '888-001695') - Calc freezes indefinitely.
Comment 1 Vitaly Bevsky 2014-07-03 13:31:42 UTC
Created attachment 102199
test file
Comment 2 Jens S 2014-07-03 15:35:34 UTC
Column A is a mix of text and numeric values. If you set all values in this column to text, your filter operations will work OK.
Apparently LibreOffice doesn't like different formats in the filter column, and I don't know whether this is considered an error.
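
One possible explanation for why mixed formats hurt, sketched under the assumption that a column is stored as contiguous runs of same-typed cells (the general layout of a multi_type_vector-style container): alternating text and numbers split the column into many tiny blocks, while an all-text column collapses into one. The `count_type_blocks` helper and the sample data are hypothetical and only model the idea; this is not LibreOffice code.

```python
from itertools import groupby

def count_type_blocks(cells):
    """Count contiguous runs of same-typed cells (a simplified model of block storage)."""
    return sum(1 for _ in groupby(cells, key=type))

# Hypothetical column contents: text and numeric cells alternate.
mixed = ["888-001695", 123, "888-001696", 456] * 1000
# The same column after converting every cell to text.
all_text = [str(c) for c in mixed]

print(count_type_blocks(mixed))     # one block per cell
print(count_type_blocks(all_text))  # a single block
```

In this model the mixed column has as many blocks as cells, so any per-row operation whose cost depends on the number of blocks gets more expensive as the data becomes more interleaved.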
Comment 3 Vitaly Bevsky 2014-07-08 07:03:59 UTC
If it is not possible to improve the filter's speed, Calc could show a message warning that the operation will be slow.
Comment 4 Joel Madero 2014-07-20 04:56:22 UTC
Well that was a miserable bug to bibisect - but it's done.

Confirmed
Ubuntu 14.04 x64
LibreOffice 4.2.5 release

Works fine in 4.1.x

Marking as:
New
Major - can prevent high quality/professional work - it essentially is a full freeze that any user would just quit and lose data in the end
High - regression

Would be nice to know if this is only with csv files.


Bibisect below:
83a62c1c1e8e259144e489d9a1f42611eba063c3 is the first bad commit
commit 83a62c1c1e8e259144e489d9a1f42611eba063c3
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Thu Oct 17 14:30:14 2013 +0000

    source-hash-022c54742e7997bf46a608f1ab0b500f2537f7f5
    
    commit 022c54742e7997bf46a608f1ab0b500f2537f7f5
    Author:     Tor Lillqvist <tml@iki.fi>
    AuthorDate: Tue Jun 25 07:19:41 2013 +0300
    Commit:     Tor Lillqvist <tml@iki.fi>
    CommitDate: Tue Jun 25 07:19:41 2013 +0300
    
        WaE: private field 'mrCells' is not used
    
        Change-Id: I0ab3fabb82c839f5194b0e20eb834dd86635a609

:100644 100644 4b10c5c8ddbedca0971e0839a8acc603792a447c 483b58760a06de929b32eafde25a67466c622502 M	ccache.log
:100644 100644 54c63dd94c275598f317bb54ddfdd27aaad5d8a1 fcfaf4eddaf5f8c7a66f90a052cbf2c7473cdc9b M	commitmsg
:100644 100644 e607019f9ceabe4513be6de63f5724c67ece57f9 3e023e83e964fd4b90d7bdf45eab489c7382956c M	dev-install.log
:100644 100644 2d16d57e331ca5fab2ec46ad12fe030528c544bb 47ead046b9af75e2384d8d8f51767edfa54d5dc8 M	make.log
:040000 040000 3aaab4081e7400904dc31731c74182db7e18493c 82a20807f2d069e8294cfa6e30778214a869a341 M	opt

# bad: [423a84c4f7068853974887d98442bc2a2d0cc91b] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e
# good: [65fd30f5cb4cdd37995a33420ed8273c0a29bf00] source-hash-d6cde02dbce8c28c6af836e2dc1120f8a6ef9932
git bisect start 'latest' 'oldest'
# good: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
git bisect good e02439a3d6297a1f5334fa558ddec5ef4212c574
# bad: [4850941efe43ae800be5c76e1102ab80ac2c085d] source-hash-980a6e552502f02f12c15bfb1c9f8e6269499f4b
git bisect bad 4850941efe43ae800be5c76e1102ab80ac2c085d
# skip: [a043626b542eb8314218d7439534dce2fc325304] source-hash-9379a922c07df3cdb7d567cc88dfaaa39ead3681
git bisect skip a043626b542eb8314218d7439534dce2fc325304
# skip: [aba65c3e4c0df07e4909aeefb758cdb688242bf6] source-hash-827524abfb4b577d08276fde40929a9adfb7ff1a
git bisect skip aba65c3e4c0df07e4909aeefb758cdb688242bf6
# bad: [c81a8a0dcfc1ed095a80e4485c89dd0fcaf73f31] source-hash-c69ed33628ec0b7abf6296539cf280d6c4265930
git bisect bad c81a8a0dcfc1ed095a80e4485c89dd0fcaf73f31
# bad: [1d4980621741d3050a5fe61b247c157d769988f2] source-hash-89d01a7d8028ddb765e02c116d202a2435894217
git bisect bad 1d4980621741d3050a5fe61b247c157d769988f2
# bad: [ba096f438393091574da98fe7b8e6b05182a8971] source-hash-8499e78ca03c792f4fa2650e02b519094ba0baa8
git bisect bad ba096f438393091574da98fe7b8e6b05182a8971
# bad: [9daa289e178460daaafa4b3911031df5b8736218] source-hash-704292996a3731a61339b1a4a5c90c9403aa095f
git bisect bad 9daa289e178460daaafa4b3911031df5b8736218
# good: [69bf614869471f46413fe1d2af5976b2e6d85084] source-hash-76dea8b2db906156e77f78738a68f932a15afd4b
git bisect good 69bf614869471f46413fe1d2af5976b2e6d85084
# good: [502c05c771cd993b237febc2d8a20140fe589488] source-hash-462df4920ef50032c8f99a9db2ca34c9cc928657
git bisect good 502c05c771cd993b237febc2d8a20140fe589488
# bad: [567bfa79fb5ad4f9dfa05f0dea7666208d6129b2] source-hash-4d5fc661d37d03129b8054e494c03bed1933231d
git bisect bad 567bfa79fb5ad4f9dfa05f0dea7666208d6129b2
# good: [7d878017eaa2fc1d2eab72689a5e453622d474a2] source-hash-b139f6fedfcf3cbed0eadeb007e2155b576413d2
git bisect good 7d878017eaa2fc1d2eab72689a5e453622d474a2
# bad: [83a62c1c1e8e259144e489d9a1f42611eba063c3] source-hash-022c54742e7997bf46a608f1ab0b500f2537f7f5
git bisect bad 83a62c1c1e8e259144e489d9a1f42611eba063c3
# first bad commit: [83a62c1c1e8e259144e489d9a1f42611eba063c3] source-hash-022c54742e7997bf46a608f1ab0b500f2537f7f5
Comment 5 Vitaly Bevsky 2014-07-20 05:58:06 UTC
I have never seen this bug in any files but csv.
Comment 6 Matthew Francis 2015-01-04 13:21:27 UTC
The behaviour changed somewhere in the patch set c7bdee8dbd1cf260a8513a0d31b36f90daa70f1c..4c99a427ee4adaeddb2682c192384bad21d9d09b
which begins as below. Building the commits individually doesn't seem to work, so a single commit can't easily be identified.


commit c7bdee8dbd1cf260a8513a0d31b36f90daa70f1c
Author: Kohei Yoshida <kohei.yoshida@gmail.com>
Date:   Wed May 22 20:27:24 2013 -0400

    Define block types for string, edit text and formula cell elements.
    
    Also, remove the custom_ prefix from block names.
    
    Change-Id: If3dfdbdacc2d0113fa8d631bec7a914b51668115
Comment 7 Robinson Tryon (qubit) 2015-12-10 09:46:21 UTC Comment hidden (obsolete)
Comment 8 Markus Mohrhard 2016-04-27 21:27:38 UTC
So this file exposes about 3 performance problems.

Selecting the whole range with content takes much more time than necessary in script type functions.

Opening the dialog takes a long time in sorting all strings.

Filtering takes a long time in stupid mdds access.
Comment 9 Markus Mohrhard 2016-04-27 21:28:20 UTC
Created attachment 124683
callgrind profiles for all three issues
Comment 10 QA Administrators 2018-09-15 03:09:56 UTC Comment hidden (obsolete)
Comment 11 Commit Notification 2019-05-15 11:37:58 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/3e664b8f194392eb27aae953c0d33a8bdfd32982%5E%21

cache cell positions when opening standard filter in calc (tdf#80853)

It will be available in 6.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Commit Notification 2019-05-16 10:35:34 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/ace16e500c92797bb47ad580cf535de0702137bd%5E%21

cache mdds access in ScTable::ValidQuery() (tdf#80853)

It will be available in 6.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
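
The idea suggested by the commit title can be illustrated with a simplified model; this is not the actual patch. It assumes a column stored as a list of contiguous blocks in which finding the block for a row means walking the blocks linearly. Restarting that walk for every row is quadratic in the worst case, whereas remembering the last block as a hint makes a sequential scan cheap. `BlockColumn`, `find_block`, `scan_naive` and `scan_cached` are hypothetical names used only for this sketch.

```python
class BlockColumn:
    """Simplified model of a column stored as contiguous blocks (not mdds itself)."""

    def __init__(self, block_sizes):
        self.block_sizes = block_sizes
        self.starts = []          # first row of each block, computed once
        row = 0
        for size in block_sizes:
            self.starts.append(row)
            row += size
        self.steps = 0            # number of block-to-block advances performed

    def find_block(self, row, hint=0):
        """Return the index of the block holding `row`, walking forward from `hint`.

        Falls back to the first block if the hint lies past `row`.
        Raises IndexError if `row` is beyond the last block.
        """
        i = hint if self.starts[hint] <= row else 0
        while row >= self.starts[i] + self.block_sizes[i]:
            i += 1
            self.steps += 1
        return i

def scan_naive(col, n_rows):
    col.steps = 0
    for row in range(n_rows):
        col.find_block(row)               # always restarts at the first block
    return col.steps

def scan_cached(col, n_rows):
    col.steps = 0
    hint = 0
    for row in range(n_rows):
        hint = col.find_block(row, hint)  # resumes from the previous block
    return col.steps

col = BlockColumn([1] * 1000)  # worst case: every cell is its own block
print(scan_naive(col, 1000), scan_cached(col, 1000))
```

With 1000 single-cell blocks, the naive scan performs 499500 block advances and the cached scan 999, which is the kind of gap that turns a long freeze into a short wait when a query is evaluated row by row.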
Comment 13 Xisco Faulí 2019-05-20 10:36:55 UTC
(In reply to Markus Mohrhard from comment #8)
> So this file exposes about 3 performance problems.
> 
> Selecting the whole range with content takes much more time than necessary
> in script type functions.

in

Version: 6.3.0.0.alpha1+
Build ID: 9c7fac47aacb0877c7d212217089a680400c1377
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); UI-Language: en-US
Calc: threaded

it takes 4 seconds

> 
> Opening the dialog takes a long time in sorting all strings.

it takes 10 seconds

> Filtering takes a long time in stupid mdds access.

it takes 2 seconds

Setting to VERIFIED

@Luboš Luňák, thanks for fixing this issue!
Comment 14 Xisco Faulí 2019-05-20 10:44:27 UTC
I found that LibreOffice hangs when I try to show the value drop-down list in the dialog.
I'll report it in a follow-up bug.