Bug 116400 - Very time lengthy PDF-generation nowhere near OO speed ( see comment 9 and 19 and 31 )
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 6.0.2.1 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.4.0
Keywords: bibisected, bisected, filter:pdf, perf
Duplicates: 126344 134652
Depends on:
Blocks: PDF-Export
Reported: 2018-03-14 13:15 UTC by Daniel Grigoras
Modified: 2022-04-20 13:07 UTC
CC List: 9 users

See Also:
Crash report or crash signature:
Regression By:


Attachments
Flamegraph (132.23 KB, application/x-bzip)
2019-12-18 20:25 UTC, Julien Nabet
Flamegraph (143.03 KB, application/x-bzip)
2022-04-20 13:07 UTC, Julien Nabet

Description Daniel Grigoras 2018-03-14 13:15:15 UTC
Description:
I tested PDF generation of a 3700-page master document (77 subdocuments loaded) with LibreOfficeDev 6.0.0.0.alpha0, and the PDF was produced in about 20 minutes, as opposed to 1 hr 30 min with LibreOffice 6.0.2.1.
Please address this issue.

Steps to Reproduce:
-

Actual Results:  
-

Expected Results:
-


Reproducible: Always


User Profile Reset: No



Additional Info:


User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36 OPR/51.0.2830.55
Comment 1 Buovjaga 2018-03-14 15:54:29 UTC Comment hidden (obsolete)
Comment 2 Daniel Grigoras 2018-03-14 16:34:36 UTC Comment hidden (obsolete)
Comment 3 Buovjaga 2018-03-14 18:44:55 UTC
(In reply to Buovjaga from comment #1)
> Can you share a test case? Or maybe you have some older report with a
> similar masterdocument already that you can point to?
> Note that we have raised the attachment max. size to 30 megabytes now.

Thanks. For speeding up the test, I only inserted subdocs 01 and 02.
In 6.0.2 the exporting took 1 min 10 seconds.
In 6.1 (master build), it only took 7 seconds!

So good news for you, I think.

You could test (with a minimal setup like myself) with a master build and see how it goes: https://dev-builds.libreoffice.org/daily/master/Win-x86_64@42/current/

Arch Linux 64-bit
Version: 6.1.0.0.alpha0+
Build ID: 6a9326803c01f4c9bc7da855053ce4e80646fad8
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on March 14th 2018

Arch Linux 64-bit
Version: 6.0.2.1.0+
Build ID: 6.0.2-1
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Comment 4 Daniel Grigoras 2018-03-16 10:28:45 UTC
(In reply to Buovjaga from comment #3)

Indeed, PDF-generation is very fast in LibreOfficeDev 6.1.
Looking forward to having this improvement integrated in the official release.
Comment 5 Buovjaga 2018-03-16 12:40:52 UTC
I tried with the upcoming 6.0.3 on Windows and it was fairly quick. I was actually going to bisect the issue as you said it was fast in 6.0.0.0 alpha0.
Let's close.
Comment 6 Daniel Grigoras 2018-03-16 13:46:10 UTC
(In reply to Buovjaga from comment #5)

6.1 alpha generates the PDF faster than 6.0 alpha.
Comment 7 Daniel Grigoras 2018-08-30 10:17:41 UTC
This issue has resurfaced in LibreOffice 6.1.0.3.
Comment 8 Telesto 2018-08-30 10:30:32 UTC
(In reply to Daniel Grigoras from comment #7)
> This issue has resurfaced in LibreOffice 6.1.0.3

Didn't check, but I would suspect the same cause as for bug 119340 or bug 119173 (HFONT fallback handling / lifecycle).
Comment 9 Xisco Faulí 2019-12-18 14:52:52 UTC
Dropdop's link is no longer available. Instead we can use this file

- attachment 140185 [details] (bug 116068)

it takes

real	6m48,663s
user	6m41,552s
sys	0m2,257s

in

Version: 6.5.0.0.alpha0+
Build ID: fb1eac64df88baae9f211d052793773686c0e180
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); UI-Language: en-US
Calc: threaded

while in

Version: 4.3.0.0.alpha1+
Build ID: c15927f20d4727c3b8de68497b6949e72f9e6e9e

it takes

real	20m52,649s
user	19m44,290s
sys	0m31,152s

So there has been real improvement over the years.
Comment 10 Xisco Faulí 2019-12-18 14:53:33 UTC Comment hidden (obsolete)
Comment 11 Julien Nabet 2019-12-18 15:03:01 UTC Comment hidden (obsolete)
Comment 12 Julien Nabet 2019-12-18 20:25:03 UTC
Created attachment 156654 [details]
Flamegraph

Here's a Flamegraph retrieved on a Debian x86-64 PC with master sources updated today.
Comment 13 Xisco Faulí 2019-12-19 10:22:37 UTC
(In reply to Julien Nabet from comment #12)
> Created attachment 156654 [details]
> Flamegraph
> 
> Here's a Flamegraph retrieved on a Debian x86-64 PC with master sources
> updated today.

Hi Noel,
I thought you might be interested in this issue. Maybe there is something else here that we can benefit from, compared to bug 112989.
Comment 14 Noel Grandin 2019-12-19 11:05:32 UTC
This looks exactly like the PDF stuff I improved this week. If that's not good enough, there is nothing more I can do.
Comment 15 Xisco Faulí 2019-12-19 11:10:49 UTC Comment hidden (obsolete)
Comment 16 Xisco Faulí 2022-02-08 11:27:44 UTC
export time is

real	1m8,366s
user	1m8,177s
sys	0m0,326s

in

Version: 7.4.0.0.alpha0+ / LibreOffice Community
Build ID: 9d02b1edafd44b75a8996a97c329fdd4967e8f54
CPU threads: 8; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: es-ES (es_ES.UTF-8); UI: en-US
Calc: threaded
Comment 17 Timur 2022-02-08 14:36:57 UTC
Tested on the same system (this is the Zsh display of time):
4.3m:  261,40s user  9,92s system 98% cpu 4:35,32 total
6.0m:   63,17s user 10,55s system 89% cpu 1:22,07 total
6.2o:   62,09s user 10,33s system 90% cpu 1:19,84 total
6.2m:  100,29s user  2,70s system 99% cpu 1:43,14 total
6.4m:  109,76s user  0,80s system 88% cpu 2:05,03 total
7.0m:  116,71s user  0,88s system 96% cpu 2:01,63 total
7.4+:   96,33s user  0,66s system 99% cpu 1:37,30 total

I don't see the reported problem with 6.0; maybe it was something temporary.
While export time has generally improved, it also regressed in 6.2.
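
To make the comparison above easier to follow, here is a minimal sketch that parses the Zsh timing lines and prints the change in user time between consecutive builds. The numbers are copied from this comment; the function names and the regular expression are illustrative assumptions, not part of any LibreOffice tooling.

```python
import re

# Timings copied from the comment above (zsh `time` output, comma as decimal separator).
RAW = """\
4.3m: 261,40s user 9,92s system 98% cpu 4:35,32 total
6.0m: 63,17s user 10,55s system 89% cpu 1:22,07 total
6.2o: 62,09s user 10,33s system 90% cpu 1:19,84 total
6.2m: 100,29s user 2,70s system 99% cpu 1:43,14 total
6.4m: 109,76s user 0,80s system 88% cpu 2:05,03 total
7.0m: 116,71s user 0,88s system 96% cpu 2:01,63 total
7.4+: 96,33s user 0,66s system 99% cpu 1:37,30 total
"""

LINE = re.compile(
    r"^(\S+):\s+([\d,]+)s user\s+([\d,]+)s system\s+\d+% cpu\s+(\d+):([\d,]+) total$"
)


def parse(raw):
    """Return a list of (build label, user seconds, total wall-clock seconds)."""
    rows = []
    for line in raw.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a timing line
        build, user, _system, minutes, seconds = m.groups()
        total = int(minutes) * 60 + float(seconds.replace(",", "."))
        rows.append((build, float(user.replace(",", ".")), total))
    return rows


def user_time_changes(rows):
    """Return (previous build, build, change in user seconds) for consecutive rows."""
    return [(a[0], b[0], round(b[1] - a[1], 2)) for a, b in zip(rows, rows[1:])]


rows = parse(RAW)
changes = user_time_changes(rows)
for prev, cur, delta in changes:
    print(f"{prev} -> {cur}: {delta:+.2f}s user")
```

With these figures, the largest increase in user time falls between the 6.2o and 6.2m rows, which matches the regression in 6.2 noted above.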
Comment 18 Xisco Faulí 2022-02-08 14:48:25 UTC
Not a regression per se, see https://bugs.documentfoundation.org/show_bug.cgi?id=112989#c24
Comment 19 Timur 2022-02-10 08:15:31 UTC
6.2 Linux shows where the slowdown of 30 seconds is:
commit 26e935342ec111771a431e6494a2ada5b67b26fb
Date:   Mon Jul 16 18:31:32 2018 +0200
    source sha:86dfa34c6d83b70923d462fecad316dafd9a1fc4
    pre sha:f543b6a0ac6cf30922c1a1ae9bfce1d605f1d4f1
author	Eike Rathke <erack@redhat.com>	2018-07-16 
Upgrade to ICU 62.1

Looks like a regression to me; I don't see a connection to Bug 112989.
Other bugs where the same commit is the culprit: bug 126344 and bug 134652.
So this is a problem (High, since it affects multiple bugs), but what is ICU, and where should it be reported?
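
For readers unfamiliar with how a single culprit commit is located, the idea behind bisecting can be sketched as a binary search over an ordered list of builds. This is only an illustration: the build labels and export times below are made up, and `first_slow_build` is a hypothetical helper, not a LibreOffice or git command.

```python
def first_slow_build(builds, is_slow):
    """Binary search for the first build where is_slow(build) is True.

    Assumes builds are ordered oldest to newest, the first one is fast,
    the last one is slow, and the slowdown persists once introduced.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] is fast, builds[hi] is slow
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_slow(builds[mid]):
            hi = mid  # the slowdown was introduced at mid or earlier
        else:
            lo = mid  # the slowdown was introduced after mid
    return builds[hi]


# Hypothetical data: build labels and made-up export times in seconds.
export_seconds = {"b0": 62, "b1": 61, "b2": 63, "b3": 99, "b4": 100, "b5": 101}
builds = list(export_seconds)
culprit = first_slow_build(builds, lambda b: export_seconds[b] > 80)
print(culprit)
```

Each step halves the remaining range, so even a long history needs only a handful of timed exports.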
Comment 20 Timur 2022-02-10 08:20:01 UTC
*** Bug 134652 has been marked as a duplicate of this bug. ***
Comment 21 Timur 2022-02-10 08:21:23 UTC
Bug 134652 is the same file, same commit.
Open attachment 140185 [details]. Monitor the waiting time until CPU drops to around 0%.
Actual Results:    90 seconds
Expected Results:  12 seconds with 6.1
Comment 22 Timur 2022-02-10 08:22:24 UTC
*** Bug 126344 has been marked as a duplicate of this bug. ***
Comment 23 Timur 2022-02-10 08:23:42 UTC
Bug 126344 is the same file, same commit.
Steps to Reproduce:
1. Open attachment 140185 [details]
2. Press F5
3. Go to page 200
4. Set the cursor at the beginning of page 200
5. Hold Shift, scroll to the bottom, and select the end of the sentence (i.e. select everything from page 200 to the end)
6. Press Delete (wait until the CPU usage is down to nearly zero)
7. Press Undo (Ctrl+Z)

Actual Results:
30 seconds CPU spike after pressing Undo. It's slower and hogs more CPU.

Expected Results:
12 seconds spike in LibO 4.4.7.2 (the comparison is a bit off: previously the scheduling had a more idle way of doing things, but it's too slow for sure)
Comment 24 Buovjaga 2022-02-10 08:48:59 UTC
(In reply to Timur from comment #19)
> So this is a problem (High for multiple bugs), but what is ICU, where to
> report?

https://icu.unicode.org/
Comment 25 Telesto 2022-02-10 09:17:35 UTC
(In reply to Timur from comment #19)
> So this is a problem (High for multiple bugs), but what is ICU, where to
> report?

Well, the big question is whether it is an implementation error on the LibreOffice side or an ICU bug. The performance degradation is, IIRC, caused by the Break Iterator.

Changelog of 62 (https://icu.unicode.org/download/62#TOC-ICU4C-Download) mentions a change here:
"Break Iterator Rules: "Safe" rules are no longer required for correct break iterator operation. For back compatibility, existing rule sets containing safe rules will continue to work, with the safe rules they contain being ignored. The Break Iterator binary data format has been updated to reflect this change."

I admit I have trouble deciphering the changelog message, but I interpret it to mean that, since 62, ICU might ignore certain defined (and previously required) LibO rules for searching "boundary positions".

So the ICU break iterator might be doing more work than before (by design), which might be redundant in the case of LibreOffice. It may be that the upgrade needed some additional tweaking to adapt to the context of LibreOffice in order to improve performance.
Comment 26 Julien Nabet 2022-02-10 09:25:40 UTC
Eike: any thoughts here (since you did the last ICU updates)?
Comment 27 Commit Notification 2022-03-15 13:37:23 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/b2f0da51e36ae65d304881967605700ecee59575

make CreateTextLayoutCache() cached (tdf#116400)

It will be available in 7.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
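
The commit title suggests that the result of a text layout computation is now reused instead of being recomputed. The general idea of caching an expensive per-string computation can be sketched as follows. This is a generic illustration, not LibreOffice's code: `analyze_text` is a hypothetical stand-in and the cache size is an arbitrary assumption.

```python
from functools import lru_cache

calls = 0  # counts how often the expensive analysis actually runs


@lru_cache(maxsize=1000)
def analyze_text(text):
    """Stand-in for an expensive per-string text analysis (hypothetical)."""
    global calls
    calls += 1
    # Placeholder work: split into words and record their lengths.
    return tuple(len(word) for word in text.split())


# Repeated strings are common in long documents (headers, footers, boilerplate).
paragraphs = ["Chapter one", "Lorem ipsum dolor", "Chapter one", "Lorem ipsum dolor"]
results = [analyze_text(p) for p in paragraphs]
print(calls, analyze_text.cache_info().hits)
```

Here the analysis runs only for the two distinct strings; the two repeats are served from the cache.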
Comment 28 Commit Notification 2022-03-15 13:37:39 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/47fe67de351dd4adbfe69247e0506d0766991813

use CreateTextLayoutCache() in PDF export (tdf#116400)

It will be available in 7.4.0.

Comment 29 Commit Notification 2022-03-15 13:37:54 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/64be4ee4dae524fc8dc9b49b618fcd55bb2d6104

use CreateTextLayoutCache() in writer (tdf#116400)

It will be available in 7.4.0.

Comment 30 Commit Notification 2022-03-15 13:39:09 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/a514e402d67aee558346850ffb023e81f89224c2

use GetVclCache() in one more place in SwFntObj (tdf#116400)

It will be available in 7.4.0.

Comment 31 Timur 2022-03-16 16:15:21 UTC
Testing the headless conversion of the 399-page ODT attachment 140185 [details] to PDF again, as in comment 17: with those fixes the time went down from 99/89 to 55/40 seconds (1st/2nd run).
So it's the best time since the awful 4.3 master, comparable to the 4.1 time.
41max and 42max need 30-40 seconds.

In 43max the time went up from 30 to 261 seconds with a single commit:
    source-hash-1615b7f1d078b2bdf22a856066346e701f816b72
    author	Khaled Hosny <khaledhosny@eglug.org>	2014-01-17 
    Do proper script itemization with HarfBuzz

OO is still faster by just a few seconds, as is the oldest build in 43all.
The slowdown from 8 to 20 seconds happened in this (rather large) range:
https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=15af925c254f27046427de70a59011e2ac3d6bdb..836822522a2e9f009c0870cbbcd48d45bbd3c622
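
A headless conversion like the one timed in this comment can be scripted. The sketch below uses the standard `--headless`, `--convert-to` and `--outdir` command-line options; the binary name `soffice`, the file names, and the helper functions are assumptions for illustration and may need adjusting for a particular installation.

```python
import subprocess
import time


def build_command(document, outdir, soffice="soffice"):
    """Build the command line for a headless conversion of a document to PDF."""
    return [soffice, "--headless", "--convert-to", "pdf", "--outdir", outdir, document]


def time_export(document, outdir, soffice="soffice"):
    """Run the conversion once and return the wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(build_command(document, outdir, soffice), check=True)
    return time.perf_counter() - start


# Only print the command here; call time_export() to actually run it.
# "test.odt" and "/tmp/out" are placeholder names.
print(" ".join(build_command("test.odt", "/tmp/out")))
```

Running `time_export` twice in a row gives first-run and second-run figures like the 1st/2nd pairs quoted above.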
Comment 32 Commit Notification 2022-04-07 21:15:43 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/c5199abfc1a468b28c680faf41e2f11bc1729c47

work around ICU performance problem in text breaking (tdf#116400)

It will be available in 7.4.0.

Comment 33 Luboš Luňák 2022-04-09 14:07:08 UTC
The upstream ticket is https://unicode-org.atlassian.net/browse/ICU-21946 ; they've acknowledged the problem, but there is no upstream solution so far. I've pushed my workaround to external/, so let's consider this handled.
Comment 34 Xisco Faulí 2022-04-20 09:38:16 UTC
(In reply to Xisco Faulí from comment #9)
> Dropdop's link is no longer available. Instead we can use this file
> 
> - attachment 140185 [details] (bug 116068)
> 

it takes

real	1m59,721s
user	1m59,233s
sys	0m0,982s

in

Version: 7.3.0.0.alpha1+ / LibreOffice Community
Build ID: 229123ccc6f90ebf66b3e659bebbd53f8a9bdd3a
CPU threads: 8; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: es-ES (es_ES.UTF-8); UI: en-US
Calc: threaded

while in

Version: 7.4.0.0.alpha0+ / LibreOffice Community
Build ID: 8e4453c2117b6c3bb15be6b949a0a8a43df66647
CPU threads: 8; OS: Linux 5.10; UI render: default; VCL: gtk3
Locale: es-ES (es_ES.UTF-8); UI: en-US
Calc: threaded

it takes

real	0m27,724s
user	0m26,971s
sys	0m1,013s

nice improvement
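
To put a number on the improvement, the two `real` values above can be converted to seconds and divided. A minimal sketch, assuming the bash `time` format with a comma as the decimal separator; `to_seconds` is a hypothetical helper.

```python
import re


def to_seconds(stamp):
    """Convert a bash `time` value such as '1m59,721s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.,]+)s", stamp)
    if m is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2).replace(",", "."))


before = to_seconds("1m59,721s")  # 7.3.0.0.alpha1+ (real)
after = to_seconds("0m27,724s")   # 7.4.0.0.alpha0+ (real)
speedup = before / after
print(f"{before:.3f}s -> {after:.3f}s, about {speedup:.1f}x faster")
```

For these two measurements the ratio comes out at a little over 4.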
Comment 35 Julien Nabet 2022-04-20 13:07:40 UTC
Created attachment 179682 [details]
Flamegraph

I don't know if it can be made even better, but just in case, I retrieved a new Flamegraph on a Debian x86-64 PC with master sources updated today, exporting Xisco's file to PDF.