Bug 48312 - Performance regression in LibreOffice spreadsheet 3.5.2 vs. 3.4.5
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.5.0 release (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: high minor
Assignee: Not Assigned
URL:
Whiteboard: target:4.0.0
Keywords: perf
Depends on:
Blocks:
 
Reported: 2012-04-04 17:22 UTC by rlk
Modified: 2015-12-15 11:31 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Spreadsheet demonstrating performance regression (enter a number at 'All Results'.A746) (2.82 MB, application/vnd.oasis.opendocument.spreadsheet)
2012-04-04 17:39 UTC, rlk
Spreadsheet demonstrating much better performance (enter a number at 'All Results'.A746) (2.74 MB, application/vnd.oasis.opendocument.spreadsheet)
2012-05-30 11:21 UTC, rlk

Description rlk 2012-04-04 17:22:18 UTC
Entering new data into the attached spreadsheet (rowing.ods) is substantially slower in 3.5.2 than in 3.4.5.  To reproduce, enter 10000 at 'All Results'.A746 and tab to B746.  With 3.4, the data was entered instantly.  With 3.5.2 (OpenSUSE 12.1 RPMs), I have to wait 5-10 seconds.
Comment 1 rlk 2012-04-04 17:39:54 UTC
Created attachment 59497
Spreadsheet demonstrating performance regression (enter a number at 'All Results'.A746)
Comment 2 rlk 2012-05-24 09:22:08 UTC
Issue still present with 3.5.3.
Comment 3 rlk 2012-05-24 12:32:36 UTC
Note that even applying a background color to a single cell that nothing depends upon is slow.
Comment 4 rlk 2012-05-30 11:21:35 UTC
Created attachment 62296
Spreadsheet demonstrating much better performance (enter a number at 'All Results'.A746)

This spreadsheet is much faster (takes just a second or so to enter data).

The only change is to $Splits.AD2:AM753 and $Splits.BH2:BQ753.  Rather than do an actual match against the data in columns G:AA, it simply uses the value in column G.  That is not correct behavior, but it may help demonstrate the problem.
Comment 5 bfoman (inactive) 2012-08-31 09:11:50 UTC
(In reply to comment #0)
> To reproduce, enter 10000 at 'All Results'.A746
> and tab to B746.  With 3.4, the data was entered instantly.  With 3.5.2
> (OpenSUSE 12.1 RPMs), I have to wait 5-10 seconds.

Confirmed with:
LO 3.5.6.2 
Build ID: own W7 debug build
Windows 7 Professional SP1 64 bit

It takes a few seconds to apply changes.
Comment 6 Joel Madero 2012-11-21 19:40:41 UTC
Confirmed, LibO 3.6.3.2

Marking as NEW, adding regression, prioritizing

Minor: Doesn't prevent high quality work, not even a long delay, a few seconds

High: Regression
Comment 7 rlk 2012-11-22 02:14:15 UTC
Note that even if automatic recalculation is turned off there's still a delay.

(With other stuff I've since added to the spreadsheet, the delay has grown considerably longer, but I don't think it will allow me to attach a 5+ MB spreadsheet!)
Comment 8 Joel Madero 2012-11-25 04:22:43 UTC
who added perf to whiteboard status? I've never seen that status and want to make sure that we're following LibO guidelines.
Comment 9 Markus Mohrhard 2012-11-25 12:42:47 UTC
(In reply to comment #8)
> who added perf to whiteboard status? I've never seen that status and want to
> make sure that we're following LibO guidelines.

Most likely it was Kohei or me. At least in Calc it is used to get an overview of performance problems. For those it is even more useful than the regression keyword, because performance regressions are different from normal regressions.

It is also documented at http://wiki.documentfoundation.org/BugReport_Details#Whiteboard by Rainer.
Comment 10 Joel Madero 2012-11-25 15:38:26 UTC
Should have looked first ;) Thanks Markus for clarification, now I have another term I can add to the list that I look out for.
Comment 11 bfoman (inactive) 2012-11-25 20:19:52 UTC
(In reply to comment #8)
> who added perf to whiteboard status? I've never seen that status and want to
> make sure that we're following LibO guidelines.

You can always check that viewing bug history (available near Modified above).
Comment 12 Joel Madero 2012-11-25 21:28:10 UTC
right when I feel like I have figured out fdo a bit ;) Thanks bfoman, another good thing to learn. 

Now the question for me is, is there a way for me to pull this data for all bugs? That would be quite useful for stats purposes :) (I know a bit off topic...my apologies and if there is an answer please email me)

Would a backtrace help? I'll get one together if it would be useful.
Comment 13 Eike Rathke 2012-11-26 17:46:49 UTC
Problem is due to the massive amount of OFFSET() functions used that need to be recalculated on every change. It looks as if at least in the range Splits.AD:BQ the calls like
OFFSET($G745,0,MATCH(AD$1,$G745:$AB745,0))
could be replaced by HLOOKUP if the result rows were not organized the way they are. Currently they are paired keys/values in one row for keys of 5:00, 10:00, ... 60:00; splitting keys and values into two rows each instead would allow using, for example, HLOOKUP(AD$1,$G745:$AB746,2,0)
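
To illustrate the two layouts outside of Calc, here is a rough Python sketch. It is only a model of the lookup logic, not of how Calc evaluates either function; the row contents and the helper names offset_match and hlookup are made up for the example.

```python
# Hypothetical data: one results row storing keys and values as adjacent
# pairs (key, value, key, value, ...), like the layout described above.
paired_row = ["5:00", 1210, "10:00", 2395, "15:00", 3570]


def offset_match(row, key):
    """Mimic OFFSET(first_cell, 0, MATCH(key, row, 0)) on a single row."""
    position = row.index(key) + 1  # MATCH returns a 1-based position
    return row[position]           # OFFSET moves that many columns from the first cell


def hlookup(table, key, row_index):
    """Mimic HLOOKUP(key, table, row_index, 0): exact match in the first row."""
    column = table[0].index(key)
    return table[row_index - 1][column]


# The same data split into a key row and a value row.
two_row_table = [paired_row[0::2], paired_row[1::2]]

paired_result = offset_match(paired_row, "10:00")
split_result = hlookup(two_row_table, "10:00", 2)
```

Both calls return the value stored next to (or below) the key "10:00"; the difference is only in how the data has to be laid out.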
Comment 14 Joel Madero 2012-11-26 17:48:25 UTC
But did something change to make it so much slower? I mean we can say that there is a more efficient way to do the same tasks but it still doesn't really answer why there has been a regression for performance
Comment 16 rlk 2012-11-26 18:08:25 UTC
Splitting into two rows won't be feasible for this purpose -- the rows here need to track the rows on the first sheet of the spreadsheet.  And there are a lot more uses of offset where the assumed ordering with the lookup functions would be a non-starter.

This explains why OpenOffice.org is also slow now (faster than LibreOffice, but that might be due to the OpenSUSE packages I'm using for the latter).
Comment 17 Not Assigned 2012-11-26 22:29:16 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=179b568a82e1506dced0d2f94c607f7bee2459fe

easy performance improvement related to fdo#48312



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 18 Michael Meeks 2012-12-18 14:41:10 UTC
I guess OFFSET can't be calculated idly since it affects dependencies dramatically; so in fixing the previous bug around not re-computing them we got a perf. regression - it persists in 4.0. We need a faster calculation core I guess :-)
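
A toy Python model of why this hurts (purely illustrative, not the actual Calc core; the Cell class, the volatile flag and cells_to_recalculate are invented for this sketch): a formula whose references are only known when it runs has to be treated as dirty after every edit, and everything that reads it becomes dirty too.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    precedents: set = field(default_factory=set)  # cells this formula reads
    volatile: bool = False  # assumption: stands in for formulas such as OFFSET()


def cells_to_recalculate(cells, changed):
    """Return the names a toy engine would recompute after `changed` is edited."""
    # Volatile formulas are dirty no matter which cell was edited.
    dirty = {changed} | {c.name for c in cells if c.volatile}
    # Propagate dirtiness along the statically known dependencies until stable.
    grew = True
    while grew:
        grew = False
        for cell in cells:
            if cell.name not in dirty and cell.precedents & dirty:
                dirty.add(cell.name)
                grew = True
    return dirty - {changed}


# Hypothetical sheet: B1 reads A1, C1 is volatile, D1 reads C1.
cells = [
    Cell("A1"),
    Cell("B1", {"A1"}),
    Cell("C1", volatile=True),
    Cell("D1", {"C1"}),
]

after_unrelated_edit = cells_to_recalculate(cells, "X5")
after_a1_edit = cells_to_recalculate(cells, "A1")
```

In this model an edit to a cell that nothing references (X5) still forces C1 and D1 to be recomputed, which is the pattern a sheet full of OFFSET() calls would multiply.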

rlk - if you have a build of 4.0 with symbols it might be worth building a callgrind trace of this:

export OOO_DISABLE_RECOVERY=1
valgrind --tool=callgrind --simulate-cache=yes --dump-instr=yes ./soffice.bin /path/to/your/document

And then trigger the slowness a few times & exit.

Thanks !
Comment 19 rlk 2012-12-18 14:49:11 UTC
Where do I get a 4.0 build (Linux x86-64)?
Comment 21 rlk 2012-12-21 02:49:45 UTC
It's significantly faster.  Not as fast as I remember 3.4.5, but the spreadsheet has grown since then.  With autocalculate turned on, the delay is maybe 3-5 seconds; with it turned off, it's still 1-2 seconds (I don't see why there should be any delay with autocalculate turned off).
Comment 22 Markus Mohrhard 2013-02-26 21:53:59 UTC
Can we remove the regression keyword?

It seems it is now only a normal bug.
Comment 23 Michael Meeks 2013-04-26 09:29:54 UTC
Seems reasonable to me - if we're back to historic performance levels.
Nice work Markus - and of course, the hope is that Kohei's re-structuring work will improve all sheet formulae performance for 4.2 :-)
Comment 24 Zeki Bildirici 2013-09-27 13:16:26 UTC
Can you reproduce it with the most recent version, 4.1.1?

On Windows/portable versions the data was entered instantly. I will try on my Linux machine over the weekend. (Writing to notify, as 5 months have passed.)

Hope it is improved/fixed for you.

Best regards,
Comment 25 rlk 2013-09-27 13:23:10 UTC
The spreadsheet by now is quite a bit bigger (about 10 MB).  I will try to attach it.  It takes several seconds (i7-920XM, 16 GB RAM) to input more data.

One thing I have determined is that use of conditional formatting greatly slows it down.  I've eliminated all use of conditional formatting; before doing that, it took easily 30 seconds to enter even one data element.
Comment 26 rlk 2013-09-27 13:40:08 UTC
No luck attaching it.  Can the attachment size limit be raised?  Scalability problems are hard to reproduce if it's not possible to add large files.
Comment 27 Joel Madero 2013-09-27 14:19:36 UTC
Unfortunately it cannot as FDO is not our infra. Best bet is to upload it to a third party (maybe google docs or some other such service) and link it here
Comment 28 retired 2013-11-17 13:43:40 UTC
Working on OS X 10.9 LO Version: 4.2.0.0.alpha1+
Build ID: 868103846b9b32bfecd77c08055fdca69d0265c2
TinderBox: MacOSX-x86@48-TDF, Branch:master, Time: 2013-11-14_23:51:46
in LO 4.1.3.2 working fine as well.

Ubuntu 13.10 LO Version: 4.1.2.3
Build ID: 410m0(Build:3)
working as well.

Please re-open if you still have this problem with the latest LO release.
Comment 29 Robinson Tryon (qubit) 2015-12-15 11:31:55 UTC
Migrating Whiteboard tags to Keywords: (perf)
[NinjaEdit]