Bug 57530 - FILEOPEN XLS: Severe performance regressions in LO compared to Open Office
Summary: FILEOPEN XLS: Severe performance regressions in LO compared to Open Office
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 4.0.0.0.alpha0+ Master
Hardware: All All
Importance: high normal
Assignee: Markus Mohrhard
URL:
Whiteboard: target:4.1.0 target:4.0.0.1
Keywords: perf, regression
Depends on:
Blocks:
 
Reported: 2012-11-25 22:16 UTC by Stephan van den Akker
Modified: 2015-12-15 11:35 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Description Stephan van den Akker 2012-11-25 22:16:34 UTC
Contrary to claims that LO is faster than Open Office, in some of my real world test cases I find rather severe performance regressions in recent versions of LO Calc.

Test case: Excel 97 spreadsheet with 3 sheets containing 4 time series of 12,000 measurements each, and 3 sheets with X-Y graphs of this data.

The file contains confidential information, so it can't be posted publicly on fdo. It may be sent privately to Calc devs and QA folks.

This file will load in less than 2 seconds in Excel 97 or Excel 2007 on any machine I ever tested.

I converted this file to ods in Open Office Calc 3.3.0.

Average load times (seconds)  XLS  ODS
- OOo 3.3.0                    46   15
- LibO 3.6.3.1                 83   17
- LibO 4.0.0 alpha *)          91   39

*) Version 4.0.0.0.alpha1+ (Build ID: 81e4968fdde5bfc68e916db25d43125631daa97), daily build 25-11-2012.

All tests were done on Windows XP Professional SP3, Pentium D 2.8 GHz, 1 GB of RAM. Average load times exclude the time needed to start Calc and exclude the first load of the file.
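
The averaging described above (several loads per file, with the first, cold load dropped) can be sketched as follows. This is only an illustration of the procedure, not the script actually used for these numbers; `average_warm_time` and the `load` callable are hypothetical names, and how a document load is triggered and detected is left to the caller.

```python
import time
from statistics import mean
from typing import Callable


def average_warm_time(load: Callable[[], None], runs: int = 5) -> float:
    """Time `load` `runs` times and average all but the first (cold) run."""
    if runs < 2:
        raise ValueError("need at least two runs to discard the first one")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        load()  # hypothetical: open the document and wait until it is usable
        timings.append(time.perf_counter() - start)
    # The first run pays for cold caches, so it is left out of the average.
    return mean(timings[1:])
```

Wall-clock timings like these are noisy, so they are best read as a rough indication rather than an exact measurement.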

Similar performance regressions witnessed on a 64-bit openSuSE system with quad-core processor and 4 GB of memory.

That said, some of my recent debug builds of LibOdev (up to the recent renumbering from 3.7 to 4.0) were significantly faster than these alpha builds.

Calc 3.3.0 is already considered unworkably slow by some of my colleagues at the office, and I can't really blame them.

Needless to say, we will not be deploying LibO anytime soon...

@kohei: Seems like a nice test case for your Orcus library.
Comment 1 Joel Madero 2012-12-06 18:11:14 UTC
Would you be willing to send that document to me? I am one of the QA people. I'll verify the bug and see whether a backtrace (bt) might be useful.
Comment 2 Stephan van den Akker 2012-12-06 19:07:04 UTC
Sure.

The attached version is without the sensitive info, so feel free to spread
it around. fdo won't let me add it as an attachment (too big).

Had a long discussion with Michael Meeks about this one on the mailing
list. In reduced form it will probably end up in the set of test documents
for the performance tests that will run on the tinderboxes. I don't know if
it will be running on the Windows tinderboxes though (loperf.sh is a Unix
shell script with lots of dependencies on Linux tools like valgrind, sed,
etc.).

Stephan


Comment 3 Joel Madero 2012-12-06 20:43:27 UTC
I just got done triaging this one and here are my results:

Version      | File format | Time to open (seconds)
3.6.3.2      | xls         | 25.5
3.6.3.2      | ods         | 8.1
4.1.0.0alpha | xls         | 29.5
4.1.0.0alpha | ods         | 6.3

I don't have a version of OOo to try with. I suspect our differences are because of hardware. 

I'm going to mark this as NEW against version 4 and flag it as a regression (because of the 25% decrease in speed). My bigger concern is that both versions show such a huge difference in opening time between xls and ods. Another interesting point is that the ods file is substantially bigger than the xls file (more than 50% larger).
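
For reference, the relative changes implied by the table above can be worked out with a short calculation. This is only a sketch using the four timings reported in this comment; the function name `percent_change` is made up for illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (positive means slower)."""
    return (new - old) / old * 100.0


# Load times in seconds, copied from the table in this comment.
timings = {
    "xls": {"3.6.3.2": 25.5, "4.1.0.0alpha": 29.5},
    "ods": {"3.6.3.2": 8.1, "4.1.0.0alpha": 6.3},
}

for fmt, t in timings.items():
    change = percent_change(t["3.6.3.2"], t["4.1.0.0alpha"])
    print(f"{fmt}: {change:+.1f}%")
```

On these four numbers the xls load is roughly 16% slower in 4.1.0.0alpha, while the ods load is roughly 22% faster.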

Developers might want a second bug opened to reflect the difference in time between ods and xls (vs. the performance difference between 3.6+ and 4+) but I'll hold on that until requested.

NEW (confirmed)
Normal (performance/regression related)
High (performance regression of 25% in speed; for a larger document I suspect it would be a lot more than 6 seconds difference)

Whiteboard: regression perf (performance regression)

Thanks for the file. A developer may request it at some point; if they do, please send it their way, as I may miss the request in the chaos of my emails :)
Comment 4 Michael Stahl (CIB) 2012-12-18 15:02:42 UTC
regression is a keyword...
Comment 5 Markus Mohrhard 2012-12-20 01:23:56 UTC
(In reply to comment #3)
> I just got done triaging this one and here are my results:
> 
> Version|file format|time to open (seconds)
> 3.6.3.2 | xls | 25.5
> 3.6.3.2 | ods | 8.1
> 4.1.0.0alpha | xls | 29.5 
> 4.1.0.0alpha | ods| 6.3
> 
> I don't have a version of OOo to try with. I suspect our differences are
> because of hardware. 
> 
> I'm going to mark this as NEW against version 4 and put regression (because
> of the 25% decrease in speed). My bigger concern is that both versions have
> such a huge difference in time to open between xls and ods. Also interesting
> point is that the ods file is substantially bigger than the xls file (more
> than 50% larger).

This difference is expected. ODS is our native format and maps very well onto our internal document model. XLS is a reverse-engineered alien format that has to be mapped to our internal document model. Additionally, we are not spending that much time on improving the performance of the XLS import, as ODS and XLSX are the file formats for the future.

> 
> Developers might want a second bug opened to reflect the difference in time
> between ods and xls (vs. the performance difference between 3.6+ and 4+) but
> I'll hold on that until requested.

No. It is known and expected.

> 
> NEW (confirmed)
> Normal (performance/regression related)
> High (performance/regression of 25% in speed) for a larger document I
> suspect it would be a lot more than 6 seconds difference)

An ODS file that takes 6 or 8 seconds to load is already a fairly big file.

> 
> Whiteboard: regression perf (performance regression)
> 
> Thanks for the file, a developer may request it at some point, if they do
> please send in there way as I may miss the request in the chaos of my emails
> :)

I can have a look at it. However, this might not be as bad as you think. Measuring load times without callgrind is not precise enough to be sure that there really is a performance problem.
Comment 6 Markus Mohrhard 2012-12-22 02:26:30 UTC
Ok, I can get this XLS opening time to something below 10 seconds by removing one insane call.

http://opengrok.libreoffice.org/xref/core/sc/source/filter/excel/xiescher.cxx#1677 does look a bit strange and is responsible for the slow import. This call was added by dr for https://issues.apache.org/ooo/show_bug.cgi?id=12587 but somehow looks like a really bad idea. The only problem with removing it is that it might introduce some hidden regressions.
Comment 7 Stephan van den Akker 2012-12-22 10:08:08 UTC
I'll disable this line in my own build and will do some tests. Does this
line copy the referenced data from the spreadsheets to an internal storage?
Charts do that when you copy them from calc to writer or draw.


Comment 8 Not Assigned 2012-12-22 13:25:22 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=bb97ecdbcc8d8dafd39e728b21bc68efee4eccbc

storing the chart doc while loading is a bad idea, fdo#57530



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 9 Markus Mohrhard 2012-12-22 14:11:00 UTC
I found no obvious regressions, and this call does not seem to be a good idea in general. If we do find a place where removing it introduces a problem, we should fix it properly.

All in all, this shows once more that the chart2 integration into LibO/Calc is horrible and that at some point we need to spend time cleaning up the mess.
Comment 10 Stephan van den Akker 2012-12-22 20:57:20 UTC
You won't have an argument with me on the subject of chart2 integration.


2012/12/22 <bugzilla-daemon@freedesktop.org>

>  Markus Mohrhard <markus.mohrhard@googlemail.com> changed bug 57530<https://bugs.freedesktop.org/show_bug.cgi?id=57530>
>  What Removed Added  Status NEW ASSIGNED  Assignee
> libreoffice-bugs@lists.freedesktop.org markus.mohrhard@googlemail.com
Comment 11 Stephan van den Akker 2012-12-22 22:32:31 UTC
This patch certainly works wonders on the load times! Load times of < 10
seconds on my trusty old Pentium 4 machine (takes several minutes in
3.6.4). The "calculating" stage is almost nonexistent.

I do notice that switching to a sheet containing a graph now takes a very
long time. It seems your patch delays the actual loading until the graph
comes into view. After that it is fully functional and editable though.


Comment 12 Markus Mohrhard 2012-12-26 03:22:56 UTC
(In reply to comment #11)
> This patch certainly works wonders on the load times! Load times of < 10
> seconds on my trusted old Pentium 4 machine (takes several minutes in
> 3.6.4). The "calculating" stage is almost not existent.
> 
> I do notice that switching to a sheet containing a graph now takes a very
> long time. It seems your patch delays the actual loading until the graph
> comes into view. After that it is fully functional and editable though.
> 

Ok, I prefer delaying it until it is really needed. So let's just push the patch to 4.0.0.
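
The trade-off discussed here, a cheap import followed by a one-time cost when the chart is first shown, is the usual deferred-initialization pattern. Below is a minimal sketch in Python. It is not the LibreOffice code itself (which is C++), and `LazyChart` and `build_chart_document` are hypothetical names used only for illustration.

```python
from functools import cached_property
from typing import Callable


class LazyChart:
    """Hypothetical stand-in for a chart whose document is built on demand."""

    def __init__(self, build: Callable[[], object]) -> None:
        self._build = build  # expensive step, deliberately not run here

    @cached_property
    def document(self) -> object:
        # Runs on first access only; later accesses reuse the cached result.
        return self._build()


build_log = []


def build_chart_document() -> dict:
    """Hypothetical expensive step, e.g. building the chart's internal data."""
    build_log.append("built")
    return {"series": 4}


chart = LazyChart(build_chart_document)
assert build_log == []  # creating the chart during import costs nothing yet
chart.document  # first access, e.g. the sheet with the chart comes into view
chart.document  # second access reuses the cached object
assert build_log == ["built"]
```

The cost does not disappear with this pattern. It moves from file-open time to the first time the chart is needed, which matches the slow first switch to a chart sheet reported in comment #11.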
Comment 13 Not Assigned 2012-12-26 03:29:13 UTC
Markus Mohrhard committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a027fdb6280bbe2b0e021e3f008f1c689510582c&h=libreoffice-4-0

storing the chart doc while loading is a bad idea, fdo#57530


It will be available in LibreOffice 4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 14 Robinson Tryon (qubit) 2015-12-15 11:35:46 UTC
Migrating Whiteboard tags to Keywords: (perf)
[NinjaEdit]