Bug 135316 - FILEOPEN DOCX: Time to open from 18 to 30-45 seconds
Summary: FILEOPEN DOCX: Time to open from 18 to 30-45 seconds
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.0.0.5 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.3.0 target:7.2.0.0.beta2
Keywords: bibisected, bisected, filter:docx, perf, regression
Depends on:
Blocks: DOCX-Opening
 
Reported: 2020-07-30 12:22 UTC by Telesto
Modified: 2021-07-07 13:41 UTC (History)
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
Sample DOCX exported with 7.1 (6.06 MB, application/vnd.openxmlformats-officedocument.wordprocessingml.document), 2020-07-30 12:24 UTC, Telesto
Source ODT (6.14 MB, application/vnd.oasis.opendocument.text), 2020-07-30 12:24 UTC, Telesto
perf flamegraph (363.23 KB, application/x-bzip), 2021-04-16 18:05 UTC, Julien Nabet

Description Telesto 2020-07-30 12:22:30 UTC
Description:
FILEOPEN DOCX: Time to open from 18 to 30-45 seconds

Steps to Reproduce:
1. Open the attached docx
2. Measure the time until the file is shown, or the CPU time taken until CPU usage drops to around %

Actual Results:
45 seconds

Expected Results:
16-18 seconds with 4.4.7.2 until the file shows


Reproducible: Always


User Profile Reset: No



Additional Info:
Found in
Version: 7.1.0.0.alpha0+ (x64)
Build ID: <buildversion>
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: nl-NL (nl_NL); GI: nl-NL
Calc: CL

and in 6.0, 5.3 and 5.0.0.5
Comment 1 Telesto 2020-07-30 12:24:05 UTC
Created attachment 163770 [details]
Sample DOCX exported with 7.1

Sample DOCX exported with 7.1. I don't think the problem is on the export side, but on the import side.
Comment 2 Telesto 2020-07-30 12:24:38 UTC
Created attachment 163771 [details]
Source ODT
Comment 3 Xisco Faulí 2020-07-30 13:12:57 UTC
Today my computer is quite slow for some reason. Anyway, it takes

real	3m29,005s
user	3m29,161s
sys	0m5,550s


in

Version: 7.1.0.0.alpha0+
Build ID: 231e1e416c039d1f9724962a89cf0573a3db48a2
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 4 Gabor Kelemen (allotropia) 2021-01-05 10:33:31 UTC
This document contains 880 images; importing them may be the culprit.
Comment 5 Xisco Faulí 2021-02-15 09:46:08 UTC
it takes

real	1m10,240s
user	1m4,822s
sys	0m3,166s


in

Version: 7.2.0.0.alpha0+ / LibreOffice Community
Build ID: cbcec4425e04e3614a2025b49fdc221216ac51d3
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 6 Timur 2021-04-14 09:59:11 UTC
My tests confirmed the bug.
With 41, opening the ODT was slow, but it is fine with 42.
With the oldest build in 44max, opening the DOCX is comparable to opening the ODT, while master is slow, so it really is a regression. Measured with the time command.
7.1 master is awful; 7.2 is better, but not as good as the oldest 4.4 build.

I noticed console errors on conversion:
libpng warning iCCP: known incorrect sRGB profile 
libpng warning iCCP: CRC error
libpng error: Error reading
Comment 7 Timur 2021-04-14 11:13:18 UTC
I bibisected this in 44max to http://cgit.freedesktop.org/libreoffice/core/commit/?id=c1f8437dbed0e8b989e41a345ef7e658a6e8a4cd which would be the RTF part of bug 83465. It sounds strange, but I checked the bisect result, and it was a single commit.

Xisco, please recheck before adding Miklos.
Comment 8 Xisco Faulí 2021-04-14 13:23:59 UTC Comment hidden (obsolete)
Comment 9 Timur 2021-04-14 14:35:37 UTC
(In reply to Xisco Faulí from comment #8)
> (In reply to Timur from comment #7)
> > I bibisected this in 44max to
> > http://cgit.freedesktop.org/libreoffice/core/commit/
> > ?id=c1f8437dbed0e8b989e41a345ef7e658a6e8a4cd what would be RTF part of bug
> > 83465. Sounds strange, but I checked bisect result, it was single commit. 
> > 
> > Xisco, please recheck before adding Miklos.
> 
> have you tried moving into the commit and then 'git checkout HEAD~1' to move
> to the previous commit ?

I did, with HEAD^1; that's how I check the result.
Comment 10 Timur 2021-04-16 13:36:18 UTC
Hello Miklos. Here is a strange bibisect result pointing to your commit, please see.
Comment 11 Xisco Faulí 2021-04-16 14:11:32 UTC
(In reply to Timur from comment #7)
> I bibisected this in 44max to
> http://cgit.freedesktop.org/libreoffice/core/commit/
> ?id=c1f8437dbed0e8b989e41a345ef7e658a6e8a4cd what would be RTF part of bug
> 83465. Sounds strange, but I checked bisect result, it was single commit. 
> 
> Xisco, please recheck before adding Miklos.

Hi Timur,
I do confirm time goes from

real	0m24,795s
user	0m23,803s
sys	0m0,784s

to

real	1m18,269s
user	1m15,406s
sys	0m0,963s

after the mentioned commit, however, in

Version: 5.2.0.0.alpha0+
Build ID: 3ca42d8d51174010d5e8a32b96e9b4c0b3730a53
Threads 4; Ver: 5.7; Render: default

the import time is about

real	0m33,333s
user	0m30,139s
sys	0m2,592s
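
The before/after figures in this comment can be compared mechanically. The following is a minimal C++ sketch, not part of LibreOffice: `parseTimeOutput` and `slowdown` are hypothetical helper names, and it assumes the `real` values have the `XmY,ZZZs` shape shown above, with a comma as the decimal separator.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <exception>
#include <optional>
#include <string>

// Parse a duration such as "1m18,269s" (shell `time` output under a
// comma-decimal locale) into seconds. Returns std::nullopt when the text
// does not have the "<minutes>m<seconds>s" shape.
// Assumes the program runs in the default "C" locale, where std::stod
// expects '.' as the decimal point.
std::optional<double> parseTimeOutput(const std::string& text)
{
    const std::size_t mPos = text.find('m');
    if (mPos == std::string::npos || mPos == 0 || text.back() != 's')
        return std::nullopt;

    const std::string minutes = text.substr(0, mPos);
    std::string seconds = text.substr(mPos + 1, text.size() - mPos - 2);
    if (seconds.empty())
        return std::nullopt;

    // Accept both "18,269" and "18.269".
    for (char& c : seconds)
        if (c == ',')
            c = '.';

    try
    {
        std::size_t usedMinutes = 0;
        std::size_t usedSeconds = 0;
        const int m = std::stoi(minutes, &usedMinutes);
        const double s = std::stod(seconds, &usedSeconds);
        if (usedMinutes != minutes.size() || usedSeconds != seconds.size() || m < 0 || s < 0.0)
            return std::nullopt;
        return m * 60.0 + s;
    }
    catch (const std::exception&)
    {
        return std::nullopt; // not a number
    }
}

// Ratio of the slower run to the faster one; both in seconds, before > 0.
double slowdown(double beforeSeconds, double afterSeconds)
{
    return afterSeconds / beforeSeconds;
}
```

With the values reported for the suspected commit, 0m24,795s to 1m18,269s, this gives a slowdown of roughly 3.2x.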
Comment 12 Miklos Vajna 2021-04-16 14:24:17 UTC
Note that early versions of the RTF import incorrectly didn't handle lots of features, so some amount of slow-down is obviously expected as the amount of RTF markup we handle increases.

But reading the above commit doesn't ring a bell for me, so probably it's easier if a profiler is used to see where is the current hotspot, it may be somewhere else.
Comment 13 Xisco Faulí 2021-04-16 14:40:08 UTC
So, in the 5-2 branch, the import time went from

real	0m26,954s
user	0m25,205s
sys	0m0,929s

to

real	0m48,216s
user	0m46,345s
sys	0m1,234s

after


author	Miklos Vajna <vmiklos@collabora.co.uk>	2016-01-07 08:19:17 +0100
committer	Miklos Vajna <vmiklos@collabora.co.uk>	2016-01-07 08:13:23 +0000
commit	f9c8d97d82a85b897520a2fe897352ee5ad879d9 (patch)
tree	668a8cc96e2dda54a0ea93e375b116db5870f6a5
parent	f84d09cdda19e51373ec0ac4afb969483be24425 (diff)
tdf#95213 DOCX import: don't reuse list label styles

which is also a regression from a commit for bug 83465
Comment 14 Xisco Faulí 2021-04-16 17:33:26 UTC
@Julien, would you mind getting a flamegraph for this issue too ?
Comment 15 Julien Nabet 2021-04-16 18:05:12 UTC
Created attachment 171245 [details]
perf flamegraph

Here's a Flamegraph retrieved on pc Debian x86-64 with master sources updated today + gen rendering.
Comment 16 Telesto 2021-04-18 08:40:57 UTC
The major cause seems to be the creation of a tremendous number of ListLabel character styles on DOCX export.

@Justin
If I recall correctly, you did some trick for "WW8Num2z1" styles. Is it possible to do this for List Labels too?
Comment 17 Commit Notification 2021-06-15 11:13:33 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/9966345f4faebb447d353ce68cee5765863273a2

tdf#135316 docx open performance

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 18 Commit Notification 2021-06-15 13:40:22 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/e3611f81ee35998e3b8382d3c0fab6d4993e4626

tdf#135316 docx open performance

It will be available in 7.2.0.0.beta2.

Comment 19 Commit Notification 2021-06-16 13:16:41 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/e52bbca626e2cbe2f4d13632f65967604abb0abc

tdf#135316 docx open performance

It will be available in 7.3.0.

Comment 20 Commit Notification 2021-06-17 08:30:35 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/a972ed87e3bcb7cdee67f25f6ce0bdbb689c4f59

tdf#135316 docx open performance

It will be available in 7.2.0.0.beta2.

Comment 21 Commit Notification 2021-06-17 18:45:10 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ecbdb403d16f6b0aeb8b543e069e9d82adf10437

tdf#135316 docx open performance, cache next character style name

It will be available in 7.3.0.
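
The commit itself is not quoted in this report, so the following is only a sketch of the general technique its title names. It assumes generated style names of the form `ListLabel N`; `StyleNameAllocator` and all of its members are hypothetical names, not LibreOffice API.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical allocator of unique "ListLabel N" style names.
class StyleNameAllocator
{
public:
    // Record a name that is already taken (e.g. one read from the document).
    void registerExisting(const std::string& name) { m_names.insert(name); }

    // Naive variant: probes from 1 on every call, so the k-th call performs
    // about k lookups before it finds a free name.
    std::string nextNameNaive()
    {
        int candidate = 1;
        while (m_names.count(makeName(candidate)) != 0)
            ++candidate;
        std::string name = makeName(candidate);
        m_names.insert(name);
        return name;
    }

    // Cached variant: resumes probing where the previous call stopped.
    // This assumes names are never removed while the import is running.
    std::string nextNameCached()
    {
        while (m_names.count(makeName(m_nextCandidate)) != 0)
            ++m_nextCandidate;
        std::string name = makeName(m_nextCandidate++);
        m_names.insert(name);
        return name;
    }

private:
    static std::string makeName(int n) { return "ListLabel " + std::to_string(n); }

    std::unordered_set<std::string> m_names;
    int m_nextCandidate = 1; // smallest number not yet known to be taken
};
```

Both variants hand out the same names; the cached one avoids re-probing numbers that are already known to be taken, which matters when a document creates a very large number of such styles.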

Comment 22 Commit Notification 2021-06-18 07:34:55 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/87d5ccb0503e856819ae7528d11e233ad642714f

tdf#135316 docx open performance, cache next character style name

It will be available in 7.2.0.0.beta2.

Comment 23 Commit Notification 2021-06-21 06:48:59 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/2e3046cc6fdebb52cfb1cdc114a9c2cd26bcc178

simplify bootstrap_map (tdf#135316 related)

It will be available in 7.3.0.

Comment 24 Commit Notification 2021-06-21 08:47:08 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/b30e329bfac7279d888908273baec8c7d8dd32ee

merge SwList and SwListImpl (tdf#135316 related)

It will be available in 7.3.0.

Comment 25 Commit Notification 2021-06-21 10:08:06 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/ce85841d7d7592188c1ad3e467e29f436bc05ba2

merge SwList and SwListImpl (tdf#135316 related)

It will be available in 7.2.0.0.beta2.

Comment 26 Commit Notification 2021-06-21 12:28:52 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/46d6942bc16fb291f37b0700bb531a3e0d2d11f6

tdf#135316 add small cache to rtl_bootstrap_args_open

It will be available in 7.3.0.

Comment 27 Commit Notification 2021-06-21 16:38:12 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/218f36dd614cf828e949f605faaf6a6fd615da26

tdf#135316 remove OTempFileService pessimisation

It will be available in 7.3.0.

Comment 28 Commit Notification 2021-06-22 07:34:23 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/08a879ac8f759c9d0c5cc9569c9c43d058cc9a16

simplify bootstrap_map (tdf#135316 related)

It will be available in 7.2.0.0.beta2.

Comment 29 Commit Notification 2021-06-23 12:30:17 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/9815bf197c27afdfeccf967898c3a000bcf7b256

tdf#135316 add SvFileStream::SetDontFlushOnClose

It will be available in 7.3.0.

Comment 30 Commit Notification 2021-06-24 08:01:14 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/066c4054f4a1078602aaab5516590628eaf6a47e

tdf#135316 bypass 'existing style' check when importing

It will be available in 7.3.0.

Comment 31 Commit Notification 2021-06-25 10:19:20 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/3837434dfb3a673a05d91b063e7ac2025589f32f

tdf#135316 add SvFileStream::SetDontFlushOnClose

It will be available in 7.2.0.0.beta2.

Comment 32 Commit Notification 2021-06-25 12:52:06 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ab5ac64bdd3205ba2ba9ac038719826f703a09a3

tdf#135316 store stylesheets in a map

It will be available in 7.3.0.
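
As an illustration of what the commit title describes, the sketch below contrasts a linear scan over a list of style sheets with a lookup through a hash map keyed by style id. It is written under assumptions: `StyleSheetEntry`, `StyleSheetTable`, and their members are hypothetical names, and the real change may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical style sheet record.
struct StyleSheetEntry
{
    std::string id;
    std::string displayName;
};

// Hypothetical table that keeps insertion order and an id -> position index.
class StyleSheetTable
{
public:
    void add(StyleSheetEntry entry)
    {
        // emplace keeps the first entry for a given id, matching the
        // first-match behaviour of the linear scan below.
        m_indexById.emplace(entry.id, m_entries.size());
        m_entries.push_back(std::move(entry));
    }

    // O(n) per lookup: walks the whole list in the worst case.
    const StyleSheetEntry* findLinear(const std::string& id) const
    {
        for (const StyleSheetEntry& entry : m_entries)
            if (entry.id == id)
                return &entry;
        return nullptr;
    }

    // O(1) on average: one hash lookup, then direct access by position.
    // The returned pointer is valid until the next call to add().
    const StyleSheetEntry* findIndexed(const std::string& id) const
    {
        const auto it = m_indexById.find(id);
        return it == m_indexById.end() ? nullptr : &m_entries[it->second];
    }

private:
    std::vector<StyleSheetEntry> m_entries;
    std::unordered_map<std::string, std::size_t> m_indexById;
};
```

With n styles and a lookup per style reference, the scan costs on the order of n comparisons per lookup, while the indexed variant stays roughly constant.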

Comment 33 Commit Notification 2021-06-26 16:03:09 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/5eee5524754ab2ff62d45c1c70b025834d3a1d15

tdf#135316 remove OTempFileService pessimisation

It will be available in 7.2.0.0.beta2.

Comment 34 Commit Notification 2021-06-27 14:39:59 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/770f94f4c5bedede8ee70e1f3bc1303dbace62ca

Revert "tdf#135316 bypass 'existing style' check when importing"

It will be available in 7.3.0.

Comment 35 Commit Notification 2021-06-29 09:30:50 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/97123add76b743013fc5c222387feb4b9c13daf2

tdf#135316 share themePtr and ShapeFilterBase across all shapes

It will be available in 7.3.0.
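
The idea named in the commit title, sharing one object across all shapes instead of building one per shape, can be sketched as follows. This is an assumption-laden illustration: `Theme`, `Shape`, and `importShapes` are hypothetical stand-ins, not the oox classes touched by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive-to-build, read-only theme.
struct Theme
{
    std::string name;
    std::vector<std::string> colors;
};

// Hypothetical shape that only needs read access to the theme.
class Shape
{
public:
    explicit Shape(std::shared_ptr<const Theme> theme)
        : m_theme(std::move(theme))
    {
    }

    const Theme& theme() const { return *m_theme; }

private:
    std::shared_ptr<const Theme> m_theme; // shared, not owned exclusively
};

// Build the theme once and hand the same instance to every shape.
std::vector<Shape> importShapes(std::size_t count)
{
    const auto theme = std::make_shared<const Theme>(Theme{"Office", {"dk1", "lt1"}});

    std::vector<Shape> shapes;
    shapes.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        shapes.emplace_back(theme); // copies the pointer, not the Theme
    return shapes;
}
```

For a document with hundreds of images, such as the 880 mentioned earlier in this report, every shape then refers to the same `Theme` instance rather than to its own copy.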

Comment 36 Commit Notification 2021-06-30 12:34:24 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5ba64bba76ca1d23191300d1b5080cc091d432de

tdf#135316 make regex object static const

It will be available in 7.3.0.
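
A minimal sketch of the pattern named in the commit title, assuming a function that is called very often and matches against a fixed pattern. The function names and the pattern are made up for illustration; the regex actually touched by the commit is not shown in this report.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Slow variant: the std::regex is constructed (compiled) on every call.
bool isListLabelNameSlow(const std::string& name)
{
    const std::regex pattern("ListLabel [0-9]+");
    return std::regex_match(name, pattern);
}

// Faster variant: a function-local static is constructed once, on first
// use; since C++11 that initialisation is thread-safe. Matching against a
// const std::regex does not modify it.
bool isListLabelName(const std::string& name)
{
    static const std::regex pattern("ListLabel [0-9]+");
    return std::regex_match(name, pattern);
}
```

Both functions return the same results; only the cost of repeatedly compiling the pattern differs.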

Comment 37 Commit Notification 2021-06-30 12:35:39 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/05992ce5d03aeb2db8d4fc7a68053ebd9a9aa511

tdf#135316 cache propertysetinfo in SwXShape

It will be available in 7.3.0.

Comment 38 Commit Notification 2021-06-30 15:54:17 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/cf15c4dad74e31a035c0d1ca899dfbef4da90ad2

tdf#135316 optimise SwCharFormats::FindFormatByName

It will be available in 7.3.0.

Comment 39 Noel Grandin 2021-06-30 16:02:14 UTC
I am considering this done, and won't be doing any more work on this
Comment 40 Telesto 2021-06-30 16:17:24 UTC
(In reply to Noel Grandin from comment #39)
> I am considering this done, and won't be doing any more work on this

FIXED seems appropriate here.. 

Note: I wondered already if the flow of commits would ever stop ;-)
Thanks for all the work!
Comment 41 Commit Notification 2021-06-30 18:58:02 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/e2173d675b55b14081e9ae3d5b188cde65ad1fae

tdf#135316 cache propertysetinfo in SwXShape

It will be available in 7.2.0.0.beta2.

Comment 42 Commit Notification 2021-06-30 18:58:18 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/2577db3d6b1e59e14441704dc408200d7ce3e256

tdf#135316 make regex object static const

It will be available in 7.2.0.0.beta2.

Comment 43 Commit Notification 2021-06-30 18:58:30 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/a9920e1fb8e7a1eb8158c8c699c2bf973d95bb32

tdf#135316 store stylesheets in a map

It will be available in 7.2.0.0.beta2.

Comment 44 Xisco Faulí 2021-07-07 13:41:31 UTC
in

Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: eac5977bfc11797eda356560a5e45c51108ef5a1
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

it takes

real	0m14,250s
user	0m13,164s
sys	0m0,714s

while in

Version: 7.2.0.0.alpha1+ / LibreOffice Community
Build ID: ff2ba77f22b2e96f96f5537aec1705956b47583d
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

it takes

real	1m4,663s
user	1m1,795s
sys	0m1,353s

Nice improvement.

@Noel, thanks for fixing this issue. Closing as VERIFIED FIXED