Bug 101149 - PERFORMANCE: Opening of large doc file with table is slow
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: low minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:doc, haveBacktrace, perf
Depends on:
Blocks: DOC-Opening
Reported: 2016-07-27 10:49 UTC by Mikeyy - L10n HR
Modified: 2021-04-12 11:56 UTC (History)
CC List: 4 users

See Also:
Crash report or crash signature:
Regression By:


Attachments
DOCX (363.27 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-07-27 10:49 UTC, Mikeyy - L10n HR
DOC (1.28 MB, application/msword)
2016-07-27 10:51 UTC, Mikeyy - L10n HR
Callgrind output from master (7.26 MB, application/x-xz)
2019-03-28 09:08 UTC, Buovjaga
Perf flamegraph (2.46 MB, image/svg+xml)
2019-04-05 11:35 UTC, Buovjaga

Description Mikeyy - L10n HR 2016-07-27 10:49:31 UTC
Created attachment 126430
DOCX

This is a follow-up to bug 53816, which was eventually fixed by another bug.
The same files that were attached to that bug span over 115 pages; it is exactly the same file saved in both formats by MS Office 2010. In MSO it opens quite fast, while in LO it takes ages.

Test system:
Win 8.1, 64bit, latest patches, i5-4210U, 8GB RAM

Performance in LO, not a cold start (I had already opened some files after reboot):
DOCX - 52 seconds
DOC - 1 min 6 seconds
Comment 1 Mikeyy - L10n HR 2016-07-27 10:51:53 UTC
Created attachment 126431
DOC
Comment 2 Mikeyy - L10n HR 2016-07-27 11:05:30 UTC
Just to get everything in one place, I'm linking my comment about performance from bug 53816#c7

(Quote Mikeyy - L10n HR from comment #7)
> Still valid bug.
> LO 4.4.2.2 on Windows 7, 64 bit, Intel i3 3240, 8 GB of ram, SSD disk
> 
> DOCX - takes 19 sec to open, opens only 3 pages (first empty)
> DOC - takes 35 sec to open, opens only 3 pages (first empty)
> 
> Windows WordPad opens DOCX file in 9 seconds, all 100+ pages.
> 
> LO 3.3.0.4, portable version, needs 76 seconds to open DOC document, it
> opens 3 pages (first empty).

So it's a degradation of performance compared to LO 4.4.2.2, but that LO version wasn't able to open all pages.
Comment 3 Aron Budea 2016-07-28 05:16:43 UTC
Reproduced: the documents open slowly for me as well in a master (non-debug) build on Windows 7.
Comment 4 family-guy 2017-06-01 11:33:19 UTC
Reproduced in LO 5.3.3.2 (x64)

I can see that the problem occurs when LibreOffice spans the table across multiple pages.
It hangs even if I save the file as .odt and reopen it; you can see the page count going painfully slowly from 3 to 115 (in my case).
Comment 5 QA Administrators 2018-06-02 03:10:18 UTC Comment hidden (obsolete)
Comment 6 Timur 2018-10-03 08:53:47 UTC
I didn't measure, but I see dump on fileopen in 6.2+.
Comment 7 Mikeyy - L10n HR 2018-10-03 09:04:12 UTC
This has actually gotten worse over time. On the latest LO 6.1.2.1 it now takes a few minutes to open the DOCX.
Comment 8 Buovjaga 2019-03-28 08:07:29 UTC
Latest master, using OOO_EXIT_POST_STARTUP=1, DOCX opens in
real    0m23,830s
user    0m23,511s
sys     0m0,249s

6.2.2
real    0m25,649s
user    0m25,332s
sys     0m0,259s

I guess it could still be faster. I should do a callgrind trace.
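
A measurement like the one above can be repeated with a small script. This is only a minimal sketch, not the procedure actually used for the numbers in this comment. It assumes that `soffice` is on PATH and that the test document is named LO_TEST.docx; the helper name `time_command` is hypothetical. OOO_EXIT_POST_STARTUP=1 is the environment variable mentioned above.

```python
import os
import subprocess
import sys
import time


def time_command(cmd, extra_env=None):
    """Run cmd once and return (wall-clock seconds, exit code)."""
    env = dict(os.environ)       # start from the current environment
    env.update(extra_env or {})  # add or override variables for this run only
    start = time.perf_counter()
    result = subprocess.run(cmd, env=env, check=False)
    return time.perf_counter() - start, result.returncode


# Example call (assumed binary name and file name, adjust to your setup):
#   seconds, code = time_command(["soffice", "LO_TEST.docx"],
#                                {"OOO_EXIT_POST_STARTUP": "1"})
#   print(f"{seconds:.3f} s (exit code {code})")
```

This reports wall-clock time only, which corresponds to the `real` line of `time`; it does not split out `user` and `sys`.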

Arch Linux 64-bit
Version: 6.3.0.0.alpha0+
Build ID: 9c5d33e3c9e4a680af61a9e7af8fa73d08b33834
CPU threads: 8; OS: Linux 5.0; UI render: default; VCL: gtk3; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 28 March 2019

Arch Linux 64-bit
Version: 6.2.2.2
Build ID: 6.2.2-2
CPU threads: 8; OS: Linux 5.0; UI render: default; VCL: gtk3; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Comment 9 Buovjaga 2019-03-28 09:08:03 UTC
Created attachment 150343
Callgrind output from master

Opening LO_TEST.docx
Comment 10 Mikeyy - L10n HR 2019-03-28 09:14:59 UTC
I'm not using the new LO 6.2, at least not until the next .3 release, so right now I can only compare with 6.1.5.2.

It took 92 seconds to open the DOCX file.

Version: 6.1.5.2
Version ID: 90f8dcf33c87b3705e78202e3df5142b201bd805
CPU threads: 4; OS: Windows 6.3;
Comment 11 Buovjaga 2019-04-05 11:35:12 UTC
Created attachment 150544
Perf flamegraph

Opening LO_TEST.docx

Arch Linux 64-bit
Version: 6.3.0.0.alpha0+
Build ID: 558956dc811a1f0f07411e348f7081a467bbc3b5
CPU threads: 8; OS: Linux 5.0; UI render: default; VCL: gtk3; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 4 April 2019
Comment 12 Timur 2020-10-12 12:36:56 UTC
The .docx is in the 2007 format; the result is similar if it is resaved in MSO. It is somewhat slow to open in LO 7.1+ but fast after that.
MSO is also slow to count the 114 pages.
Comment 13 NISZ LibreOffice Team 2021-02-16 09:36:17 UTC
bug #100139 also contains 100+ page long tables in docx.
Comment 14 Xisco Faulí 2021-02-19 16:29:10 UTC
For me, it takes

real	0m16,860s
user	0m16,175s
sys	0m0,544s

in

Version: 7.2.0.0.alpha0+ / LibreOffice Community
Build ID: 7b649f835cc00ed76927c6821a135605609bed4e
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

Should it be closed as RESOLVED WORKSFORME?
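
The `time` figures quoted in comment 8 and in this comment use a comma as the decimal separator, so they cannot be passed to `float()` directly. The following is a minimal sketch for turning such a field into seconds so that runs can be compared; the function name `parse_time_field` is hypothetical, and the two sample values are the `real` times from comment 8 (6.2.2 and master on the same machine). Timings taken on different machines should not be compared this way.

```python
import re


def parse_time_field(value):
    """Convert a `time` field such as '0m23,830s' or '1m06.500s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised time field: {value!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


# 'real' times from comment 8, both measured on the same machine
real_6_2_2 = parse_time_field("0m25,649s")
real_master = parse_time_field("0m23,830s")
saved = round(real_6_2_2 - real_master, 3)
print(f"6.2.2: {real_6_2_2} s, master: {real_master} s, difference: {saved} s")
```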
Comment 15 Mikeyy - L10n HR 2021-02-19 21:08:38 UTC
AMD Ryzen 5 3600
Windows 10

LO 7.0.4 and LO 7.1.0 release versions.
DOC 28s
DOCX 28s - this one shows the first page early, after maybe 10-12s, but it doesn't load the whole document, and LO is unresponsive. After it loads all 115 pages it becomes responsive again.
Comment 16 NISZ LibreOffice Team 2021-04-12 11:53:14 UTC
DOCX import became faster (around 20 seconds instead of 50) on my old-ish machine in 6.3 after:

https://cgit.freedesktop.org/libreoffice/core/commit/?id=e14056e6e88d9b8d988b7b88b2776a8fc952031b


author	Michael Stahl <Michael.Stahl@cib.de>	2019-05-22 17:03:09 +0200
committer	Michael Stahl <Michael.Stahl@cib.de>	2019-05-23 10:38:30 +0200

tdf#119109 sw: fix iteration in SwFrame::PrepareMake()

The DOC format attachment 126431 still takes about 50 seconds to import in 6.3 and in the current 7.2 bibisect master:

Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: da6c70efbc991f1fc61aace267dd4f972dedce6c
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win
Locale: bs-BA (hu_HU); UI: en-US
Calc: threaded
Comment 17 NISZ LibreOffice Team 2021-04-12 11:56:29 UTC
Let's focus only on the doc format in this one.