Bug 74650 - Performance issue with filtering a large data set via autofilter
Status: RESOLVED DUPLICATE of bug 75058
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 4.2.0.4 release
Hardware: All All
Importance: high critical
Assignee: Not Assigned
URL:
Whiteboard: target:4.3.0
Keywords: perf
Depends on:
Blocks:
 
Reported: 2014-02-07 06:31 UTC by Joachim Manke
Modified: 2015-12-15 11:37 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Large(?) csv data file (2.08 MB, text/csv)
2014-02-07 06:31 UTC, Joachim Manke

Description Joachim Manke 2014-02-07 06:31:11 UTC
Created attachment 93584: Large(?) csv data file

After opening the attached (quite large) csv data file, working in Calc is nearly impossible (filtering, editing, ...). The load on the computer seems to be huge. If I try to copy all the data to the clipboard (on Windows), the clipboard itself breaks: it stops working in every program until Windows is restarted!!!

With 4.1.3.2 everything seems to work fine with the same data.
Comment 1 Jean-Baptiste Faure 2014-02-12 06:06:56 UTC
Seems to work as expected with development version 4.2.1.0.0+ under Ubuntu 13.10 x86-64. No problem copy-pasting the whole contents of the sheet to another one.

Could you please describe step by step what you mean by "work within Calc is nearly impossible (filtering, editing ...)"? What exactly are you trying to do?

Set status to needinfo. Please do not set your own bug report to new; each bug report must be confirmed independently. So please set it back to unconfirmed once the requested information has been provided.

Thank you for your understanding.
Best regards. JBF
Comment 2 Joachim Manke 2014-02-14 20:48:17 UTC
Hi Laurent,

I have done some measurements on both Windows 7 64-bit and Mint 16 64-bit (which is in fact an Ubuntu variant). Here are my results:
Windows 7 64-Bit, Intel i7, 16GB RAM

Task                                  Duration (s)
                                      LibreOffice 4.1.3.2   LibreOffice 4.2
_____________________________________________________________________________________
Open file                             5                     3
Autofilter in all columns             0                     6!
Mark all                              0                     0
Copy                                  0                     14!
Filter to "non empty" in one column   0                     8 = 3 (choice) + 5 (do)!
Mark all                              0                     4!
Open Excel                            0                     40!!!!!!!

With Excel 10 open simultaneously

Open file                             10                    3 (very good)
Autofilter in all columns             0                     7!
Mark all                              0                     0
Copy                                  0                     55!!!!!
Filter to "non empty" in one column   0                     8 = 3 (choice) + 5 (do)!
Switch to Excel                       0                     20 !!!!
Insert data                           0                     45!!!!!!


Mint 16 64bit, AMD X4 955, 8GB RAM

Task                                  Duration (s)
                                      LibreOffice 4.1.3.2   LibreOffice 4.2
_____________________________________________________________________________________
Open file                             17!                   2 (excellent)
Autofilter in all columns             0                     42!
Mark all                              0                     0
Copy                                  0                     0
Filter to "non empty" in one column   0                     37 = 14 (choice) + 23 (do)!!!!!
Mark all                              0                     7!

I hope it is now clear what I mean by "serious performance problem".
I hope this can be fixed soon!

Best regards

Jo
Comment 3 Jean-Baptiste Faure 2014-02-16 16:17:12 UTC
I reproduced the performance regression between 4.1 and 4.2 for "Filter to "non empty" in one column" with the csv bugdoc under Ubuntu 13.10 x86-64. I tested with LO 4.1.5 and 4.2.2.0.0+ (built at home).

Set status to NEW.

Best regards. JBF
Comment 4 Matt Hurd 2014-02-18 12:00:42 UTC
Confirming similar performance regressions on Ubuntu 12.04 LTS with 4.2.1~rc1-0ubuntu1~precise1.  A 200k line csv import used to be snappy enough and is now very slow when entering formulae and filling down all 200k rows.  Gave up waiting for a sort on one column after many minutes had passed (showing >40 minutes of compute time elapsed).

Kind regards,

--Matt.
Comment 5 pinus 2014-03-30 18:13:32 UTC
I can second that. I read that 4.2 should be faster with large data and found it to be nearly useless. I always get a gray window, GUI thread blocked. Couldn't even open a Writer document while Calc is doing its thing.

It looks as if nobody tested it with more than trivial data. Use the file attached to bug 71582, copy all of it and append it to the same table. Activate auto filtering and browse the table.

Save the table, close calc, open it again. 

That everything is so slow is a shame, but that the GUI is blocked is unacceptable.
Comment 6 Alexey Zinoviev 2014-04-14 19:23:19 UTC
Confirming slowness with filters on LO 4.2.3.3 x86.
Easy to reproduce: just fill 9999 cells using the random number generator (by the way, there is a bug there too: it does not fill a manually entered range on a new sheet) with large discrete numbers from 0 to 1000000000. Calc hangs for 2-3 minutes and the autofilter window does not open. If you fill 9999 cells with numbers from 0 to 10, it opens in a second. Excel 2013 opens the filter window within a second on that file, with both the "big" and the "little" numbers.
That was a fresh setup on Win8, where I tried to work with a real Calc file. With default settings, the process memory of soffice.bin went up to 198 MB. Then I changed the memory settings in Options (as suggested on the internet) to a maximum of 256 MB, and process memory did not exceed 99 MB on the same file. Performance when filtering different data (names, phone numbers, etc.) was always bad: the filter took about 3-5 minutes to open the autofilter window, or it did not open at all.
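
For anyone who wants to try the reproduction above without Calc's random number generator, here is a minimal sketch that writes comparable test data to CSV files. The function name, file names and the "value" header are made up for illustration; only the row count (9999) and the two value ranges (0 to 1000000000 and 0 to 10) come from the comment. The returned distinct-value count is just a convenience for checking that the two files really differ in the way described.

```python
import csv
import random


def write_random_column(path, rows=9999, high=1_000_000_000, seed=0):
    """Write `rows` random integers in [0, high] to a one-column CSV file.

    Returns the number of distinct values written. The name, the header
    text and the fixed seed are illustrative assumptions, not part of the
    bug report.
    """
    rng = random.Random(seed)
    values = [rng.randint(0, high) for _ in range(rows)]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["value"])            # header row for the autofilter
        writer.writerows([v] for v in values)  # one value per row
    return len(set(values))


if __name__ == "__main__":
    # "Big" numbers: almost every cell is a distinct filter entry.
    many = write_random_column("many_distinct.csv", high=1_000_000_000)
    # "Little" numbers: at most 11 distinct filter entries.
    few = write_random_column("few_distinct.csv", high=10)
    print("distinct values:", many, "vs", few)
```

Opening each file in Calc, enabling the autofilter and clicking the drop-down should then show whether the delay follows the number of distinct values, as the observations above suggest.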
Comment 7 Kohei Yoshida 2014-04-29 18:53:33 UTC
(In reply to comment #5)

> It looks as if nobody tested it with more than trivial data.

BTW, that "nobody" includes you too.  This is a community-led effort.  If you are insinuating that we don't do shit then you are as much to blame here.

Geez.
Comment 8 Commit Notification 2014-04-29 19:36:55 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=878a5dabff4669fb606a461e11eaf286d0c8b07f

fdo#74650: Speed up GetFilteredFilterEntries().



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 9 Kohei Yoshida 2014-04-29 19:39:38 UTC
That change should help Column B:J, but Column A needs another solution.  It's another manifestation of the poor perf issue with SvTreeListBox.
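
The commit itself is not reproduced here, and the following is not taken from the LibreOffice source. It is only a rough illustration, under the assumption of a naive list-based duplicate check, of why building a list of unique filter entries can get much slower as the number of distinct values grows (compare the 0 to 10 versus 0 to 1000000000 observation in comment 6). Both function names are hypothetical.

```python
def unique_entries_linear(values):
    """Collect unique entries by scanning the result list for each value.

    With n values that are mostly distinct this does on the order of n*n
    comparisons. Hypothetical helper for illustration only.
    """
    entries = []
    for v in values:
        if v not in entries:   # linear scan of everything collected so far
            entries.append(v)
    return sorted(entries)


def unique_entries_hashed(values):
    """Collect unique entries with a hash set, then sort once.

    Hypothetical helper for illustration only; yields the same result as
    unique_entries_linear for hashable, mutually comparable values.
    """
    return sorted(set(values))
```

Both helpers return the same sorted list; the difference is only in how the work scales. With a handful of distinct values the linear scan stays short, while with thousands of distinct values it has to walk an ever longer list for every cell.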
Comment 10 Kohei Yoshida 2014-04-30 01:37:03 UTC

*** This bug has been marked as a duplicate of bug 75058 ***
Comment 11 Robinson Tryon (qubit) 2015-12-15 11:37:50 UTC
Migrating Whiteboard tags to Keywords: (perf)
[NinjaEdit]