Bug 133976 - Export to EPUB takes a long time
Summary: Export to EPUB takes a long time
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.0.0.0.beta1+
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: haveBacktrace, perf
Depends on:
Blocks: EPUB-Export Performance
Reported: 2020-06-14 08:40 UTC by lenochod
Modified: 2025-01-06 09:08 UTC (History)

See Also:
Crash report or crash signature:


Attachments
ODT file for Export (2.00 MB, application/vnd.oasis.opendocument.text)
2020-06-14 08:40 UTC, lenochod
Perf flamegraph (630.82 KB, image/svg+xml)
2020-06-14 12:27 UTC, Buovjaga
LibreOffice stops responding (210.55 KB, image/jpeg)
2023-07-10 09:18 UTC, Sophie Sipasseuth
Growth of export times (102.04 KB, image/png)
2024-12-31 13:07 UTC, Alexandre Sena Coelho
f23af7a56dda5aabe8ba3616566fbfe75805759f (2.75 MB, image/svg+xml)
2025-01-05 21:40 UTC, Alexandre Sena Coelho

Description lenochod 2020-06-14 08:40:54 UTC
Created attachment 161979 [details]
ODT file for Export

Steps to reproduce:

1. Open the attached document Nahrubo.odt
2. File -> Export As -> Export As EPUB...
3. In the dialog that opens, set Layout Method to Fixed
4. Click OK
5. Writer freezes

Tested in:
- Windows 10 (1909) (64bit) with LibreOffice 7.0.0 beta1 -> Result: Freeze
- Fedora 32 (64 bit) with LibreOffice 6.4.4.2 -> Result: Freeze
Comment 1 Telesto 2020-06-14 12:10:03 UTC
It is not really freezing:
* There is no progress bar
* It is slow on Windows by default

However, there might be something else going on too: it really does take a long time, more than 3 minutes. This needs to be checked against Linux, which does a better job creating EPUBs in general.
Comment 2 Buovjaga 2020-06-14 12:27:45 UTC
Created attachment 161982 [details]
Perf flamegraph

Arch Linux 64-bit
Version: 7.1.0.0.alpha0+
Build ID: e43d8d0f91b6949fa3914d034f9b9c166740afcf
CPU threads: 8; OS: Linux 5.7; UI render: default; VCL: kf5
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: threaded
Built on 14 June 2020
Comment 3 Buovjaga 2020-06-14 12:29:05 UTC
I aborted the perf run after a while, so it does not cover the full export time.
Comment 4 lenochod 2020-06-16 11:42:51 UTC
I waited 25 minutes, but the EPUB file was not created.
There are only temporary files (*.tmp and .~lock.Nahubo.epub#) on the hard disk.
Linux displays a window offering to either kill LibreOffice or wait.
Comment 5 BogdanB 2021-07-31 06:34:09 UTC
Also in
Version: 7.1.5.2 / LibreOffice Community
Build ID: 85f04e9f809797b8199d13c421bd8a2b025d52b5
CPU threads: 4; OS: Linux 5.8; UI render: default; VCL: gtk3
Locale: ro-RO (ro_RO.UTF-8); UI: en-US
Calc: threaded
Comment 6 Sophie Sipasseuth 2023-07-10 09:18:39 UTC
Created attachment 188292 [details]
LibreOffice stops responding

LibreOffice stops responding and the busy cursor appears.
Moreover, my computer makes a lot of noise.
After 12 minutes, it is still running.

Version: 7.2.0.0.alpha1+ (x64) / LibreOffice Community
Build ID: ff2ba77f22b2e96f96f5537aec1705956b47583d
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: fr-FR (fr_FR); UI: en-US
Calc: CL
Comment 7 Sophie Sipasseuth 2023-07-10 09:51:48 UTC
Same for

Version: 7.3.8.0.0+ (x64) / LibreOffice Community
Build ID: e1ad83ddb2f39419fb5d7c69eba51e2b9f49c788
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: fr-FR (fr_FR); UI: en-US
Calc: CL

Version: 7.4.0.0.alpha1+ (x64) / LibreOffice Community
Build ID: c94961c6869c34b3874d21cfaa5ec1488609acfe
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: fr-FR (fr_FR); UI: en-US
Calc: CL

Version: 7.5.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: 1c629ca0048670db4bed5e7d8d76bcf8e81f2158
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: fr-FR (fr_FR); UI: en-US
Calc: CL threaded

Version: 7.6.0.0.beta1+ (X86_64) / LibreOffice Community
Build ID: 1b5cee822e0bc15ddbdfc86926678ca35ab3e082
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: fr-FR (fr_FR); UI: en-US
Calc: CL threaded
Comment 8 Sophie Sipasseuth 2023-07-10 10:31:46 UTC
I will try to do a bibisection.
I guess it is necessary here.
Comment 9 Buovjaga 2023-07-10 11:04:17 UTC
(In reply to Sophie Sipasseuth from comment #8)
> I will try to do a bibisection.
> I guess it is necessary here.

EPUB export was added in 6.0: https://wiki.documentfoundation.org/EPUB#Converting_ODF_to_EPUB
So I think it would make sense to first check with 6.0. If the problem is already present there, there is no need for bibisecting.
Comment 10 Sophie Sipasseuth 2023-07-10 11:13:35 UTC
Yes, you are right.
I was thinking the same.
Comment 11 Sophie Sipasseuth 2023-07-10 11:14:50 UTC
But I do not have enough space on my computer to download this repository.
Comment 12 Buovjaga 2024-12-29 15:21:04 UTC
Still slow, killed after a couple of mins.

Arch Linux 64-bit
Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 960db01dbfe2f916b91782da03532fae1f836445
CPU threads: 8; OS: Linux 6.12; UI render: default; VCL: kf6 (cairo+wayland)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: CL threaded
Built on 29 December 2024
Comment 13 Alexandre Sena Coelho 2024-12-31 13:03:49 UTC
Version: 24.8.3.2 (X86_64) / LibreOffice Community
Build ID: 480(Build:2)
CPU threads: 4; OS: Linux 6.11; UI render: default; VCL: gtk3
Locale: pt-BR (pt_BR.UTF-8); UI: pt-BR
Ubuntu package version: 4:24.8.3-0ubuntu0.24.10.1
Calc: threaded

I have been working to better understand what is happening.
I conducted an analysis of LibreOffice's behavior when exporting to the EPUB format with the Layout Method set to "Fixed".
My analysis involved observing the export times according to the document size.
I used the original document and divided it into several parts, incrementing the number of pages by 100 each time.
This allowed me to create 10 documents: one with 100 pages, another with 200 pages, and so on.
Here are the results:

Pages	Time (seconds)
100	19.16
200	63.92
300	163.04
400	367.91
500	641.14
600	1004.44
700	1534.58
800	2251.39
900	3243.18
1013	4389.13

After analyzing the data, I determined that the export time growth approximates a cubic growth pattern.
Here is the graph (export-time.png) comparing the real measured data (red points) and the cubic fit (blue dashed line).
The cubic growth model closely follows the real data, especially for larger documents.
This alignment confirms that the export time increases according to a cubic relationship with the number of pages.
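
One way to cross-check the growth rate from the table above is a least-squares fit on log-log axes, which estimates the exponent k in time ≈ c * pages^k. This is only a minimal sketch and not the method used for the attached graph: the name fit_power_law is made up for illustration, and a single power law is a different model from a cubic polynomial with lower-order terms, so the estimated exponent does not have to come out as exactly 3.

```python
import math

# Measurements copied from the table in this comment: (pages, seconds).
MEASUREMENTS = [
    (100, 19.16), (200, 63.92), (300, 163.04), (400, 367.91),
    (500, 641.14), (600, 1004.44), (700, 1534.58), (800, 2251.39),
    (900, 3243.18), (1013, 4389.13),
]


def fit_power_law(points):
    """Least-squares fit of log(t) = log(c) + k * log(n); returns (c, k)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(t) for _, t in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope of the regression line in log-log space is the exponent k.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    k = sxy / sxx
    c = math.exp(mean_y - k * mean_x)
    return c, k


if __name__ == "__main__":
    c, k = fit_power_law(MEASUREMENTS)
    print(f"time ~ {c:.3g} * pages^{k:.2f}")
```

An exponent well above 1 would be consistent with the superlinear growth reported here, whichever model describes it best.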

This analysis leads to the hypothesis that there may be an optimization issue during export when using the "Fixed" Layout Method.
The next step would be to analyze the EPUB export routine with the Fixed Layout Method to identify possible optimization issues.
Comment 14 Alexandre Sena Coelho 2024-12-31 13:07:02 UTC
Created attachment 198340 [details]
Growth of export times
Comment 15 Buovjaga 2024-12-31 15:10:59 UTC
(In reply to Alexandre Sena Coelho from comment #13)
> This analysis leads to the hypothesis that there may be an optimization
> issue during export when using the "Fixed" Layout Method.
> The next step would be to analyze the EPUB export routine with the Fixed
> Layout Method to identify possible optimization issues.

Ok, note that performance issues should normally be investigated with a build that does not have debug features enabled. For performance issues I have another build that only has the --enable-symbols option, so I can still get perf traces: https://wiki.documentfoundation.org/Development/How_to_debug#Performance_debugging_(perf)

My flamegraph in comment 2 is from 2020, so probably a good first step would be to get a fresh one and look at where the time is spent now.
Comment 16 Alexandre Sena Coelho 2025-01-05 21:40:29 UTC
Created attachment 198386 [details] f23af7a56dda5aabe8ba3616566fbfe75805759f
Comment 17 Alexandre Sena Coelho 2025-01-05 21:42:47 UTC
(In reply to Buovjaga from comment #15)
> (In reply to Alexandre Sena Coelho from comment #13)
> > This analysis leads to the hypothesis that there may be an optimization
> > issue during export when using the "Fixed" Layout Method.
> > The next step would be to analyze the EPUB export routine with the Fixed
> > Layout Method to identify possible optimization issues.
> 
> Ok, note that performance issues should normally be investigated with a build
> that does not have debug features enabled. For performance issues I have
> another build that only has the --enable-symbols option, so I can still get
> perf traces:
> https://wiki.documentfoundation.org/Development/How_to_debug#Performance_debugging_(perf)
> 
> My flamegraph in comment 2 is from 2020, so probably a good first step would
> be to get a fresh one and look at where the time is spent now.

Thank you for your response, Buovjaga.
I checked the information in the link you provided, and these are indeed essential tools for this type of analysis.
I also generated a new FlameGraph (flamegraph20250105.svg) using the latest available version (Build ID: f23af7a56dda5aabe8ba3616566fbfe75805759f).

This issue seems to be more complex than what my current beginner-level skills with the project allow me to handle.
I suspect that the problem might be related to recalculating the layout of all previous pages whenever a new page is processed — but this is just a hypothesis.
To work on this directly, I would need to better understand how rendering, pagination calculation, cursors, and other technical details work.
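
To illustrate why the pattern suspected above would be costly, here is a toy cost model. It is not LibreOffice code. It assumes, purely hypothetically, that adding page k triggers a new layout pass over pages 1..k, and optionally that laying out page j costs j units. The names relayout_cost and page_cost are invented for this sketch.

```python
def relayout_cost(total_pages, page_cost=lambda j: 1):
    """Total work if every newly added page re-lays out all pages so far.

    page_cost(j) is the hypothetical cost of laying out page j once.
    """
    work = 0
    for k in range(1, total_pages + 1):      # page k is being added
        for j in range(1, k + 1):            # pages 1..k are laid out again
            work += page_cost(j)
    return work


if __name__ == "__main__":
    for n in (100, 200, 400):
        flat = relayout_cost(n)                    # constant cost per page
        growing = relayout_cost(n, lambda j: j)    # cost grows with position
        print(n, flat, growing)
```

With a constant per-page cost the total is n(n+1)/2, which is quadratic. With a per-page cost proportional to j it is n(n+1)(n+2)/6, which is cubic. Which of these, if any, matches the real export path would have to come from profiling.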

One thing I noticed — and I’m not sure if it’s relevant — is that, between builds a9966e81381059a3a9d8fc4d391ba17d99385fee and f23af7a56dda5aabe8ba3616566fbfe75805759f, the processing time increased significantly.

I could try to set up a test suite to compare the performance of EPUB export with the Fixed Layout across different builds, using a diverse set of documents as samples. Would that be useful? I’m not sure if something like this already exists.
Comment 18 Buovjaga 2025-01-06 09:08:35 UTC
(In reply to Alexandre Sena Coelho from comment #17)
> One thing I noticed — and I’m not sure if it’s relevant — is that, between
> builds a9966e81381059a3a9d8fc4d391ba17d99385fee and
> f23af7a56dda5aabe8ba3616566fbfe75805759f, the processing time increased
> significantly.

For investigating regressions, we have repositories of binaries that can be used with git bisect:

https://wiki.documentfoundation.org/QA/Bibisect
https://wiki.documentfoundation.org/QA/Bibisect/Linux

The commits you reference are quite fresh, so probably they are not yet included in the bibisect-linux-64-25.8 repository.

It is even possible to automate bisection, which can be nice for cases where something takes a long time. Here is an example for conversion time: https://wiki.documentfoundation.org/QA/Bibisect/Automation#Measuring_conversion_time

However, I've found the results from such runs can be unreliable, so one should always manually test the blamed commit vs. the previous commit.
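
A timing check of the kind mentioned above could look roughly like the following sketch. It is not the script from the wiki page. It assumes a bibisect checkout with the binary at instdir/program/soffice, a 60-second threshold, a 600-second hard limit, and a sample file named sample.odt, all of which are placeholders. Note that a plain --convert-to epub call uses the filter's default settings; selecting the Fixed layout method would need additional filter options that are not shown here. The exit codes follow the git bisect run convention: 0 for good, 1 for bad, 125 for skip.

```python
import subprocess
import sys
import tempfile
import time

GOOD, BAD, SKIP = 0, 1, 125   # exit codes understood by `git bisect run`


def classify(cmd, threshold, timeout):
    """Run cmd once and map its wall-clock time to a bisect verdict."""
    start = time.monotonic()
    try:
        subprocess.run(cmd, check=True, timeout=timeout,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return BAD            # slower than the hard limit
    except (subprocess.CalledProcessError, OSError):
        return SKIP           # this build could not be tested
    elapsed = time.monotonic() - start
    return GOOD if elapsed <= threshold else BAD


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as outdir:
        # Placeholder paths: adjust to the checkout and test document in use.
        cmd = ["instdir/program/soffice", "--headless",
               "--convert-to", "epub", "--outdir", outdir, "sample.odt"]
        sys.exit(classify(cmd, threshold=60.0, timeout=600.0))
```

Saved under a hypothetical name such as time_export.py, it would be started with `git bisect run python3 time_export.py`. The blamed commit should still be checked by hand against its predecessor.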

The tutorial can be a kind of cheatsheet for the steps: https://wiki.documentfoundation.org/QA/Bibisect/Bibisecting_tutorial

> I could try to set up a test suite to compare the performance of EPUB export
> with the Fixed Layout across different builds, using a diverse set of
> documents as samples. Would that be useful? I’m not sure if something like
> this already exists.

We do have this automated setup running callgrind: https://perf.libreoffice.org/ but I don't know much about it. The Jenkins job is https://ci.libreoffice.org/job/lo_callgrind_linux/