Bug 122040 - Performance issue with highly formatted spreadsheet
Summary: Performance issue with highly formatted spreadsheet
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 6.1.3.2 release (earliest affected)
Hardware: All All
Importance: low minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:xls, perf
Depends on:
Blocks: XLS
Reported: 2018-12-12 12:14 UTC by William Gathoye
Modified: 2019-09-08 10:44 UTC

See Also:
Crash report or crash signature:


Attachments
Original Excel file (12.17 MB, application/vnd.ms-excel)
2018-12-12 12:15 UTC, William Gathoye
Ods conversion without direct formatting (3.71 MB, application/vnd.oasis.opendocument.spreadsheet)
2018-12-12 12:16 UTC, William Gathoye
Cachegrind screenshot (173.90 KB, image/jpeg)
2018-12-12 12:17 UTC, William Gathoye

Description William Gathoye 2018-12-12 12:14:58 UTC
Description:
We have received a bug report via our Twitter account @LibreOfficeFR.

One of our users, who is diabetic, has been recording in an XLS spreadsheet all the food he has eaten for a number of years now.

The document has two sheets, each with a large number of rows. Most of the rows have directly formatted content, and some of them also have conditional formatting rules.

While the document loads rapidly in Office, the user complained that it takes a very long time to open in LibreOffice.

We have been able to reproduce the issue. While the document loads instantly in Excel (tested with the latest Office 365 version), it takes about 1m30 to open on my machine with LibreOffice.

The user still uses Office 2000. We first thought the issue was due to that old version, but the document is nearly identical when saved from Office 365.
The issue is not due to the XLS format either. Converting the document to ods does not help: the issue remains. The same applies if we copy and paste the content of each sheet into a brand new document and save it directly as ods, instead of doing an xls -> ods conversion from the existing XLS file.

After removing the direct formatting (using Ctrl+M), opening the document is about 20 seconds faster.

Arnaud Versini has begun to investigate the issue with cachegrind, and we found that a method operating on a C++ vector is called a very large number of times (the whole list being scanned over and over again at each iteration). This is a good starting point for finding possible optimizations.

Attached to this bug report is the original XLS file. We are also providing an ods conversion from which the direct formatting has been removed.

Steps to Reproduce:
1. Simply open the document and see for yourself.

Actual Results:
Pretty slow to load

Expected Results:
Nearly instant loading.


Reproducible: Always


User Profile Reset: Yes



Additional Info:
Comment 1 William Gathoye 2018-12-12 12:15:31 UTC
Created attachment 147463
Original Excel file
Comment 2 William Gathoye 2018-12-12 12:16:05 UTC
Created attachment 147464
Ods conversion without direct formatting
Comment 3 William Gathoye 2018-12-12 12:17:01 UTC
Created attachment 147465
Cachegrind screenshot
Comment 4 Oliver Brinzing 2018-12-12 15:33:03 UTC
(In reply to William Gathoye from comment #0)

> We have been able to reproduce the issue. While the document loads in
> instant on Excel (tested with latest Office 365 version), the document opens
> in about 1m30 on my machine with LibreOffice. 

i can not confirm a huge delay with 

Version: 6.1.4.1 (x64)
Build-ID: 25073d18caee244880112e52c4a7e71f6081b3a9
CPU threads: 4; OS: Windows 10.0; UI render: default; 
Locale: de-DE (de_DE); Calc: 

Win10 Pro x64, Core i5-3320 2.6 GHz, 8 GB, 256 GB SSD

attached *.xls loads (including recalc) within 18 seconds, *.ods 
takes about 30 seconds.

what i noticed: the file has a reference to an external file
Comment 5 V Stuart Foote 2018-12-12 15:53:50 UTC
Hmm, so it opens in ~3sec into Protected View on Excel 2016

On LibreOffice delay is noticeable...

On a robust Xeon (E3-1270 @3.50GHz) / 32GB Windows 10 64-bit en-US (1803) workstation:

fully opening in about ~14sec on LibreOffice
Version: 6.1.3.2 (x64)
Build ID: 86daf60bf00efa86ad547e59e09d6bb77c699acb
CPU threads: 8; OS: Windows 10.0; UI render: default (or GL); 
Locale: en-US (en_US); Calc: CL

and about the same ~14sec on LibreOffice master
Version: 6.3.0.0.alpha0+ (x64)
Build ID: 34d5e910adba4094bba1303284f9552028d0b019
CPU threads: 8; OS: Windows 10.0; UI render: GL (or default); VCL: win; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2018-12-11_02:09:16
Locale: en-US (en_US); UI-Language: en-US
Calc: CL
Comment 6 Arnaud Versini 2018-12-12 19:33:41 UTC
A major part of the problem seems to be that we insert into a std::vector at random positions, which causes a lot of std::unique_ptr moves.
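
To illustrate the cost described above, here is a minimal standalone sketch. It is not taken from the LibreOffice code: Entry, EntryList, insertOneByOne and appendThenSort are hypothetical names made up for this example. Inserting into the middle of a std::vector shifts every element after the insertion point, so inserting n elements one by one at arbitrary positions can cost on the order of n^2 unique_ptr moves in total. One generic way to avoid that, assuming the final order is all that matters, is to append everything and sort once; whether that fits the importer code is an open question.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for whatever object is stored per entry.
struct Entry {
    int key;
};

using EntryList = std::vector<std::unique_ptr<Entry>>;

// Inserts each key at its sorted position, one at a time.
// Returns the total number of existing elements that had to be
// shifted (i.e. unique_ptr moves) to make room for the new ones.
std::size_t insertOneByOne(EntryList& list, const std::vector<int>& keys) {
    std::size_t shifted = 0;
    for (int key : keys) {
        auto pos = std::lower_bound(
            list.begin(), list.end(), key,
            [](const std::unique_ptr<Entry>& e, int k) { return e->key < k; });
        // Everything from pos to the end is moved one slot to the right.
        shifted += static_cast<std::size_t>(list.end() - pos);
        list.insert(pos, std::make_unique<Entry>(Entry{key}));
    }
    return shifted;
}

// Alternative: append all entries (no shifting), then sort once.
void appendThenSort(EntryList& list, const std::vector<int>& keys) {
    list.reserve(list.size() + keys.size());
    for (int key : keys)
        list.push_back(std::make_unique<Entry>(Entry{key}));
    std::sort(list.begin(), list.end(),
              [](const std::unique_ptr<Entry>& a,
                 const std::unique_ptr<Entry>& b) { return a->key < b->key; });
}
```

With keys arriving in descending order, every insertion in insertOneByOne lands at the front and shifts the whole list, which is the worst case. Both functions end up with the same sorted content.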
Comment 7 Xisco Faulí 2018-12-13 11:29:33 UTC
it takes 16 seconds in

Version: 6.3.0.0.alpha0+
Build ID: e98bcfcc3cdad46620e3d59119b0ac262db88054
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); UI-Language: en-US
Calc: threaded
Comment 8 Xisco Faulí 2018-12-13 11:34:14 UTC
In

Version: 4.4.0.0.alpha2+
Build ID: b21df5a993a3815cf736fe3d2eab73eee646b38e

it takes 30 seconds, so the import time has been roughly halved since then.
Since it's 16 seconds on master, which is quite acceptable IMHO, I'll decrease the severity.