Bug 72741 - UI: Sort performance ~17 times slower than version 4.0.6.2
Summary: UI: Sort performance ~17 times slower than version 4.0.6.2
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.1.0.4 release
Hardware: All All
: highest normal
Assignee: Not Assigned
URL:
Whiteboard: BSA target:4.3.0 NoRepro:4.3.0.0a0+:O...
Keywords: perf, regression
Depends on:
Blocks: mab4.1
  Show dependency treegraph
 
Reported: 2013-12-16 04:12 UTC by Fred Polizo
Modified: 2015-12-15 11:17 UTC (History)
6 users (show)

See Also:
Crash report or crash signature:


Attachments
valgrind console output (8.79 KB, text/plain)
2014-01-03 08:14 UTC, Jean-Baptiste Faure
Details
Fred's valgrind console log (1.91 KB, text/plain)
2014-01-03 10:51 UTC, Fred Polizo
Details
Image of partially displayed Sort dialog box on 4.2.5.2 prerelease (215.38 KB, image/jpeg)
2014-06-15 11:24 UTC, Fred Polizo
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Fred Polizo 2013-12-16 04:12:18 UTC
Problem description:

The time it takes to do even a simple single-column sort of my (attached) spreadsheet takes ~17 times longer on version 4.1.3.2 as compared to version 4.0.6.2. 

The spreadsheet contains 13 columns by 62,712 rows of "General" formatted cells that contains plain alphanumeric values.

I initially noticed the performance problem when multi-column sorting using Data>>Sort..., but then noticed the problem when sorting just one column. So, I devised a simple single column test case by adding a 14th column of =RAND() formulas to my spreadsheet making it easier/faster to test. 

Using the 'Sort Ascending' toolbar button to sort the random column yields the following times on my HP Envy dv6 AMD A8-Quad-Core Laptop w/6GB RAM (on openSUSE 13.1 64-bit):

  LibreOffice_4.0.6_Linux_x86-64:  18 seconds
  LibreOffice_4.1.3_Linux_x86-64: 308 seconds  


Steps to reproduce:
1. Download and install these two packages from http://www.libreoffice.org/download/ :

  LibreOffice_4.0.6_Linux_x86-64_rpm.tar.gz
  LibreOffice_4.1.3_Linux_x86-64_rpm.tar.gz

2. Download the attached sample spreadsheet (62krows.obs)

3. Execute the following command to load the spreadsheet with 4.0.6.2:
       
   /opt/libreoffice4.0/program/scalc 62krows.ods

4. After the spreadsheet is fully loaded, select a cell in column A ("Rand").

5. Using a "stopwatch" (or clock with second hand) measure the time it takes for the sort operation to complete; start timing when you click the "Sort Ascending" button in the toolbar. Record the elapsed time when the operation completes.

6. Execute the following command to load the spreadsheet with 4.1.3.2:
       
   /opt/libreoffice4.1/program/scalc 62krows.ods

7. After the spreadsheet is fully loaded, select a cell in column A ("Rand").

8. Using a "stopwatch" (or clock with second hand) measure the time it takes for the sort operation to complete; start timing when you click the "Sort Ascending" button in the toolbar. Record the elapsed time when the operation completes.


Current behavior:

  LibreOffice_4.0.6_Linux_x86-64:  18 seconds
  LibreOffice_4.1.3_Linux_x86-64: 308 seconds  

Expected behavior:
  LibreOffice_4.0.6_Linux_x86-64:  18 seconds
  LibreOffice_4.1.3_Linux_x86-64:  18 seconds (or less) 
              
Operating System: openSUSE
Version: 4.1.3.2 release
Comment 1 Fred Polizo 2013-12-16 05:03:53 UTC
The spreadsheet was a little too big for an attachment. Here's a URL where it can be downloaded:
   http://bugs.fpchico.com/62krows.ods
Comment 2 retired 2013-12-16 10:46:47 UTC
Opens in 15 seconds on LO 4.3.0.0a0+ (from yesterday) OSX 10.9.
Comment 3 Fred Polizo 2013-12-16 12:31:37 UTC
@Foss
Do you mean it *sorts* in 15 seconds with 4.3?  I reported a issue with sorting, not opening the file.
Comment 4 Jean-Baptiste Faure 2013-12-22 09:20:21 UTC
Reproducible for me under Ubuntu 13.10 x86-64 (CPU : i7) with versions 4.1.4 (~5 minutes) and 4.2.0.1+ (~15 minutes !!!). 
I do not understand why doing a sort on a column with =alea() formula: the formula being computed again, the sort sorts nothing.
That said, we have the same problem if the sort is done on column B.
If I remove the formulas in column A and replace it by a number, I still get the same performance regression.

Best regards. JBF
Comment 5 Fred Polizo 2013-12-22 13:07:49 UTC
@JBF
I agree. I added column A only to try to make the test case simpler. It seems it just caused confusion. Sorry for that. Please feel free to ignore column A (or remove it) and focus on the sort performance of column B, as you stated in your comment.

Again, I first noticed the problem doing a multi-column sort on columns B and C, but then noticed the same problem with column B alone.   

Thank you for your efforts.
Comment 6 Michael Meeks 2013-12-31 22:00:19 UTC
Looks like this should be marked as a LibreOffice 4.1 regression.

The problem persists in master / 4.2 - it is no doubt some very silly issue =)

If you can install a debuginfo package set for your packages on Linux, and then do:

pkill -9 -f soffice.bin
export OOO_DISABLE_RECOVERY=1
valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes ./soffice.bin --splash-pipe=0 /path/to/your/test-file.ods

Wait for it to load - it runs around 150x slower - and then select / click the sort button - I guess you'll need to leave that overnight. Exit cleanly in the morning - and you should have a beautiful callgrind.12345.txt file that you can gzip2 and upload that should make fixing this extremely trivial for the developers =)

If you can do that, I'll have a go at fixing the issue - thanks ! =)
Comment 7 Jean-Baptiste Faure 2014-01-02 09:00:20 UTC
(In reply to comment #6)
> Looks like this should be marked as a LibreOffice 4.1 regression.
> 
> The problem persists in master / 4.2 - it is no doubt some very silly issue
> =)
> 
> If you can install a debuginfo package set for your packages on Linux, and
> then do:
> 
> pkill -9 -f soffice.bin
> export OOO_DISABLE_RECOVERY=1
> valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes
> ./soffice.bin --splash-pipe=0 /path/to/your/test-file.ods
> 
> Wait for it to load - it runs around 150x slower - and then select / click
> the sort button - I guess you'll need to leave that overnight. Exit cleanly
> in the morning - and you should have a beautiful callgrind.12345.txt file
> that you can gzip2 and upload that should make fixing this extremely trivial
> for the developers =)

Hi Michael, thank you for the instructions. Running :-)

Best regards. JBF
Comment 8 Michael Meeks 2014-01-02 11:12:20 UTC
I imagine we are doing in-place sorting by swapping inside each column, it is probable that we need to re-implement that to sort into a separate index, and then re-build that part of the sheet in a better way somehow [ no doubt there are a tremendous number of corner-cases to handle ].

Then again - possibly if we're exchanging rather similar entries in the column we can write some accelerated swapping that will fix the problem without a lot of re-work, the profile should show that.

Thanks !
Comment 9 Jean-Baptiste Faure 2014-01-03 08:14:33 UTC
Created attachment 91451 [details]
valgrind console output

After ~20 hours of computation I interrupted valgrind. The compressed log is here : http://www.hightail.com/download/elNMeW41MHdreEFYRHNUQw

The console output is attached.

Hope this helps.

Best regards. JBF
Comment 10 Jean-Baptiste Faure 2014-01-03 08:15:17 UTC
set status back to NEW.

Best regards. JBF
Comment 11 Jean-Baptiste Faure 2014-01-03 10:17:56 UTC
Hi Markus and Kohei,

This performance bug may be interesting for you.

Best regards. JBF
Comment 12 Fred Polizo 2014-01-03 10:51:35 UTC
Created attachment 91455 [details]
Fred's valgrind console log

After installing the openeSUSE 13.1 libreoffice*debuginfo packages, I too had a try at running valgrind while sorting my test spreadsheet on two keys (column B and column C). 

The sort completed successfully and I exited soffice.bin cleanly 
(real    287m16.314s).  The console output is attached and you can download the compressed output file at
    http://bugs.fpchico.com/callgrind.out.14583.bz2

Not sure if you need/want another test case since JBF just provided one, but since I had it, I thought I'd make it available to you. 

Hope this helps,
Fred P.

Debuginfo packages installed:
libreoffice-base-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-base-drivers-mysql-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-calc-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-draw-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-impress-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-impress-extensions-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-kde4-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-math-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-pyuno-debuginfo-4.1.3.2-4.2.x86_64
libreoffice-writer-debuginfo-4.1.3.2-4.2.x86_64
Comment 13 Michael Meeks 2014-01-03 11:45:00 UTC
Thanks guys, the profiles are extremely clear as to where the problem is =) It'll be about a week before we can find anyone to look into this I'm afraid though.
Comment 14 Björn Michaelsen 2014-01-17 09:51:51 UTC
(This is an automated message.)

Setting priority to highest as this is a 4.1 MAB. This is part of an effort to make the importance of MAB reflected in priority too.
Comment 15 jm 2014-03-14 10:58:04 UTC
yes, i confirm, still in 4.2.3 rc1 with windows 7

we used to have spreadsheet with more than 100000 rows, they are unusable now (up to 25 minutes for a single column sort)
Comment 16 Jean-Baptiste Faure 2014-03-14 11:14:28 UTC
Please, do not change the version number that shows the oldest version in which the problem has been seen.

Best regards. JBF
Comment 17 Michael Meeks 2014-03-14 12:23:55 UTC
JBF's nice profile shows that some huge chunk of the time is spent inside:

SvtBroadcaster::Remove -> SvtBroadcaster::Normalize - which was called taking 84% of the cost of the sort.

I guess this is the new broadcaster speedup for import having a particularly horrible performance impact on sorting =)

We could of course fix that ;-) but it's frustrating to see all that junk coming from: ScColumn::SwapRow -> ScColumn::DetachFormulaCell - when all the formulae are (inevitably) in a formula-group anyway =)

A quick/awful hack to check / workaround if it is the formule causing the grief, would be to move formulae to the far right side of the sheet - and sort only the data they refer to on the left; I'd be interested if that improves things for people.

Quite possibly - we should detect that the two rows are in the same formula group, and just not swap the ScFormulae at all =)
Comment 18 Michael Meeks 2014-03-14 12:28:56 UTC
Sorting a million rows of just numbers => 5 seconds on my hardware; adding =A1 to column B1 - and re-sorting, takes significant time =) [ still waiting ].
Comment 19 Michael Meeks 2014-03-14 14:27:56 UTC
I've pushed a patch for review here:

https://gerrit.libreoffice.org/8590

That allows me to sort 1 million rows containing a simple repeated formula column (=A1) in sub 10 seconds (vs. too long to wait for before).

Of course - no doubt it breaks everything too - I need to write a unit test or three; but input / review / testing much appreciated.
Comment 20 Michael Meeks 2014-03-14 15:05:37 UTC
Hmm, testing that patch seems to suggest that it has no impact on this particular sheet. I guess I was mislead by jbf's profile - prolly done with a different version and optimised the wrong case.

Either way - examining the full profile; it seems we swap each row only once which is good: I suspect that the problem here comes from the 'holes' in the data - ie. missing entries down the column in various places that fragments the MDDS / column storage and causes copying / re-allocation on swapping.

It would be really good to have a callgrind profile of a recent 4.2 build with debuginfo to be able to do more here =) any takers ?
Comment 21 Commit Notification 2014-03-14 16:36:22 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=0698c49ccdbf62dd84d3f9c5d25ee039f4fff722

fdo#72741 - swap values only inside a formula group.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 22 Commit Notification 2014-03-14 16:36:44 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=ff89c120f5f5036c3792b9cfc3f0329de3ac0a43

fdo#72741 - write unit test for in formula group swapping.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 23 Michael Meeks 2014-03-14 16:38:34 UTC
It'd be even better to have a profile with that patch included, if at all possible - it at least eliminates one potential problem =) Thanks !
Comment 24 Michael Meeks 2014-03-15 20:25:11 UTC
I got impatient, and am running the profile myself =) I suspect it will be necessary to re-build the sheet structures and insert the data again rather than do the swapping to speed this use-case up; either that - or find a way to map small runs of empty cells in columns to entries in a string / double table - NULL's or some magic string perhaps =) [ might help speed other things up too ]. But lets see ...
Comment 25 Michael Meeks 2014-03-17 13:41:33 UTC
http://users.freedesktop.org/~michael/fdo-72741-sorting-ff89c120f5f5036c3792b9cfc3f0329de3ac0a43-2.txt.bz2

ScColumn::SwapCellTextAttrs -> ~30% of the time.
ScTable::SetPattern -> 10% of the time.

Seems like we get a lot of MDDS thrash from these above - which is odd really - AFAICS the text has no patterns / text attributes to talk about in that data =)
Comment 26 Kohei Yoshida 2014-04-17 03:31:53 UTC
I'll look at this together with Bug 76607.  The two need to be fixed together.
Comment 27 Kohei Yoshida 2014-04-23 03:17:45 UTC
Just a little update.  The sorting with the submitted test document now finishes in 7.5 seconds on my local machine.  Originally it took 263 seconds (~= 4.4 minutes).  Now I'll just have to do some cleanup and write some tests.
Comment 28 Kohei Yoshida 2014-04-24 01:09:48 UTC
Just merged a whole bunch of changes to master that should fix this.

For 4.2: https://gerrit.libreoffice.org/9144
Comment 29 Commit Notification 2014-04-25 12:16:21 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "libreoffice-4-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=08b51bfff61870cd573daea914dce429743f4655&h=libreoffice-4-2

fdo#72741, fdo#76607: Backport of Calc sort rework.


It will be available in LibreOffice 4.2.5.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 30 Kohei Yoshida 2014-04-25 13:01:54 UTC
While this one is an MAB for 4.1, backporting my fix to 4.1 is not possible since so much has changed between the two versions.  Is it okay to mark this as fixed for 4.2?
Comment 31 Michael Meeks 2014-04-25 16:18:30 UTC
> Is it okay to mark this as fixed for 4.2 ?

Sounds good to me - thanks so much for your hard work here Kohei ! =) if we know it's basically impossible / super-hard to fix for 4.1 and won't get done lets just close it now rather than letting it linger.
Comment 32 Jean-Baptiste Faure 2014-04-25 18:02:02 UTC
(In reply to comment #30)
> While this one is an MAB for 4.1, backporting my fix to 4.1 is not possible
> since so much has changed between the two versions.  Is it okay to mark this
> as fixed for 4.2?

It is ok for me. I just tested your fix on version 4.2.5.0.0+ under the same PC as the one used for comment #4 (but this time under Ubuntu 14.04) and sorting took a few seconds to complete.

Thank you very much Kohei !

Best regards. JBF
Comment 33 Fred Polizo 2014-06-15 11:17:47 UTC
I downloaded prerelease version LibreOffice_4.2.5.2_Linux_x86-64_rpm.tar.gz (Version: 4.2.5.2 Build ID: 61cb170a04bb1f12e77c884eab9192be736ec5f5) and did a "parallel install" on the same HP laptop used for the original bug report.  I tried sorting the same test file as before (http://bugs.fpchico.com/62krows.ods) and still see performance problems, but better and different than before.  

After I selected the "Data >> Sort..." menu item it took ~144 seconds for the Sort dialog box to be properly displayed. (I'll attach a screen capture of the partially rendered dialog box.) The program was unresponsive to mouse and keyboard events during this period. After ~144 seconds, the Sort dialog was updated with the proper sort related controls and the program became responsive again. 

At this point, I selected Sort Key 1 as "Surname" and Sort Key 2 as "Given Name" then clicked OK. The sort itself completed in ~7 seconds!

Note, each subsequent selection of the "Data >> Sort..." menu item took ~50 seconds before Sort dialog box was fully displayed. 

I've tried this test multiple times and the delays are repeatable. The ~7 second sort time is great, but the long waits for the Sort dialog box to be available are a problem.
Comment 34 Fred Polizo 2014-06-15 11:24:58 UTC
Created attachment 101092 [details]
Image of partially displayed Sort dialog box on 4.2.5.2 prerelease

After selecting Data>>Sort... menu item, it took 144 seconds to fully display the Sort dialog box. This is a screen capture of the partially rendered dialog box as it appeared during those ~144 seconds.
Comment 35 Kohei Yoshida 2014-06-15 21:40:18 UTC
Fred, please file a new bug.
Comment 36 Fred Polizo 2014-06-16 12:01:09 UTC
Why a new bug report? I very much appreciate your efforts to fix this bug, but the reported performance problem, while somewhat improved, was not fixed. Why not just reopen the bug so that context is not lost?
Comment 37 Kohei Yoshida 2014-06-16 12:08:19 UTC
If a new bug is not filed, it will be ignored.
Comment 38 retired 2014-06-16 22:54:39 UTC
The rule is: one bug per issue.

Kohei did fix a few things in this regard. He kindly asked for a new bug for any remaining issues. So I suggest, if you are still unhappy with the performance of the latest LO nightly or 4.3beta2 to indeed do that.

If performance is still slow, there may be other issues caused by other things that need fixing. Result may be similar, yet those are still other issues.
Comment 39 Robinson Tryon (qubit) 2015-12-15 11:17:20 UTC
Migrating Whiteboard tags to Keywords: (perf)
[NinjaEdit]