Bug 122792 - Writer repeatedly rewrites screen during document load
Summary: Writer repeatedly rewrites screen during document load
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 3.5.0 release
Hardware: All; OS: All
Importance: medium normal
Assignee: Not Assigned
Keywords: perf, preBibisect, regression
Reported: 2019-01-17 17:32 UTC by Christian Lehmann
Modified: 2021-09-22 08:29 UTC



Attachments
LO Writer takes ages to load a document, repeatedly rewriting the screen (14.76 MB, video/mp4)
2019-01-17 17:36 UTC, Christian Lehmann
Example ODT sanitised, comments removed (438.97 KB, application/vnd.oasis.opendocument.text)
2019-01-31 12:10 UTC, Buovjaga
Callgrind output from master (8.80 MB, application/x-xz)
2019-01-31 12:29 UTC, Buovjaga
odt file of 1.1 MB with many comments to test load time and LO freezing (1.07 MB, application/vnd.oasis.opendocument.text)
2020-09-03 14:17 UTC, Christian Lehmann

Description Christian Lehmann 2019-01-17 17:32:34 UTC
Description:
While loading an ODT document, LO Writer repositions the cursor and rewrites the screen up to six times. This unnecessarily extends loading time.

Steps to Reproduce:
1. I open LO.
2. I click on an ODT document shown on the start screen.
3. I wait until the document becomes available for editing.

Actual Results:
After the document has been loaded, it is displayed on the screen for the first time. Then the cursor hops to different locations in the document six times, rewriting the screen each time, until it finally rests on one position and the document becomes available for editing.

Expected Results:
1) It is not clear why the cursor has to go to different positions upon loading a document. One would assume that it goes directly to that position where it was when the document was last closed.
2) It is bad programming to show the document on the screen while the loading algorithm is not yet done with it. The document must be displayed only after it is available for editing.


Reproducible: Always


User Profile Reset: No



Additional Info:
I enclose a video showing the process. The file is almost 1 MB (Durga Priyanka and Ilmari have it for testing), and Windows 7 is running on an old machine with an Intel Core i5-3210M CPU @ 2.50 GHz. This gives us the opportunity to observe on the screen each of the steps that the loading algorithm takes after first displaying the document. (We cannot see what it does in the long time preceding this moment.)
When I load the Word version of this document into MS Word, it becomes available for editing after a few seconds, even though Word is counting the pages through to the end in the background.
Comment 1 Christian Lehmann 2019-01-17 17:36:54 UTC
Created attachment 148406
LO Writer takes ages to load a document, repeatedly rewriting the screen

You have to have patience to watch the video: this is just the time that Writer takes to load the document.
Comment 2 Christian Lehmann 2019-01-18 10:29:44 UTC
Upon loading the same file today, it now seems to me that the cursor does not leap to different locations, as I thought; instead, the page on which it rests is rewritten in six different views during the process of formatting and reformatting. Anyway, it remains true that this iterative approximation to the final view does not need to be displayed.
Comment 3 Durgapriyanka 2019-01-18 16:52:28 UTC
Thank you for reporting the bug. I can reproduce the bug.

It takes approximately 90 seconds to load the full document. The cursor doesn't move anywhere, but a section of the page on which the cursor is present (1st page) flickers about 5-6 times once the file is loaded.

Version: 6.3.0.0.alpha0+
Build ID: 3c964980da07892a02d5ac721d80558c459532d0
CPU threads: 2; OS: Windows 6.1; UI render: default; VCL: win; 
TinderBox: Win-x86@42, Branch:master, Time: 2018-12-12_02:07:45
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded
Comment 4 Christian Lehmann 2019-01-18 16:55:58 UTC
If it is the first page, the rewriting reduces to a flickering of the page, since there is no page break to be redone. If instead the cursor is on p. 300 or so, you will see Writer displaying six different page breaks.
Comment 5 Buovjaga 2019-01-27 18:06:44 UTC
(In reply to Christian Lehmann from comment #2)
> Upon loading the same file today, it now seems to me that the cursor does
> not leap to different locations, as I thought, but instead the page on which
> it rests is rewritten in six different views during the process of formatting
> and reformatting. Anyway, it remains true that this iterative approximation
> to the final view does not need to be displayed.

I have anonymised the document so all the content is replaced with x characters. It seems the flickering is in part tied to the comments. Would you be OK with having a version in public with the comments? Unfortunately, Replace All does not work with comments.
Comment 6 Christian Lehmann 2019-01-28 09:16:45 UTC
LO developers should know that they are free to forward the file to other LO developers. What is not possible is to make it available to everybody on the website, as this would seriously jeopardize the authors' chance of getting it printed in the future.

I trust that it may be helpful for developers to have a big file available for testing that is a real working document making full use of LO features instead of a repetition of Lorem ipsum.

Some observations on the problem that I reported:
1) You need to check loading behavior for different positions of the cursor in the (saved) document. It seems that sometimes it does leap to different locations during the loading process and other times it remains on the same spot, but only the display shifts up and down.

2) Part of the problem is apparently in the pagination: While LO Writer analyzes the document it is loading, it apparently repaginates it repeatedly. This should be streamlined.

3) Another aspect of the problem is that the successive phases of the process of loading and analyzing the document are displayed on the screen. This is obviously a programming mistake.

4) Yet another aspect is the time that LO Writer needs to load the document. 90 seconds for a process for which MS Word needs only four seconds is simply grossly inadequate.
Comment 7 Buovjaga 2019-01-28 11:04:41 UTC
(In reply to Christian Lehmann from comment #6)
> LO developers should know that they are free to forward the file to other LO
> developers. What is not possible is to make it available to everybody on the
> website, as this would seriously jeopardize the authors' chance of getting
> it printed in the future.
> 
> I trust that it may be helpful for developers to have a big file available
> for testing that is a real working document making full use of LO features
> instead of a repetition of Lorem ipsum.

The file needs to be such that it can be shared in public. The only exception would be a file demonstrating a security issue.

If no example file can be shared, this report will have to be closed as pointless.
Comment 8 Buovjaga 2019-01-30 13:53:41 UTC
Set to NEEDINFO.
Change back to UNCONFIRMED after you have attached an example document.
Comment 9 Buovjaga 2019-01-31 12:10:12 UTC
Created attachment 148801
Example ODT sanitised, comments removed

Attaching a sanitised version with the comments removed. Unfortunately, profiling the performance of this will likely produce a different result than if we used the version with comments. I am currently getting a callgrind profile of the file opening.

However, there are existing perf reports about large numbers of comments, such as bug 60418, so I think there is sufficient data for any dev wanting to tackle these issues.
Comment 10 Buovjaga 2019-01-31 12:29:35 UTC
Created attachment 148802
Callgrind output from master

Arch Linux 64-bit
Version: 6.3.0.0.alpha0+
Build ID: 8b01361979a8e9c0f59716e2b3de65daad7c25a7
CPU threads: 8; OS: Linux 4.20; UI render: default; VCL: gtk3; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 29 January 2019
Comment 11 Buovjaga 2019-01-31 12:40:35 UTC
Using total CPU hammering time as a yardstick, I see 3.5.0 already spends 30 secs beating up the CPU, while 3.3.0 only does it for 7 secs (basically the time the progress bar is shown, no more).
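
The "total CPU time" yardstick used in this comment can be approximated with a short script. This is only a minimal sketch, assuming a Unix-like system: the helper name `child_cpu_seconds` is hypothetical, and the workload shown is a stand-in Python loop, not a LibreOffice invocation (the report does not say how the timings were taken).

```python
import os
import subprocess
import sys


def child_cpu_seconds(cmd):
    """Run cmd to completion and return the CPU seconds (user + system) it used.

    Hypothetical helper: relies on os.times(), whose children_* fields are
    only meaningful on Unix-like systems (on Windows they are always 0).
    """
    before = os.times()
    subprocess.run(cmd, check=True)  # blocks until the child has exited
    after = os.times()
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return user + system


if __name__ == "__main__":
    # Stand-in workload: a short CPU-bound Python loop, not LibreOffice.
    busy = [sys.executable, "-c", "sum(i * i for i in range(2_000_000))"]
    print(f"CPU seconds: {child_cpu_seconds(busy):.2f}")
```

Unlike wall-clock time, this figure ignores time spent waiting on disk or on the user, which is why it suits a comparison of how hard two versions work the CPU for the same document.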
Comment 12 Telesto 2020-06-26 20:29:54 UTC
No issue as far as I can tell
Version: 7.1.0.0.alpha0+ (x64)
Build ID: 006c65bbd472cb1d7d44e095714e28190b76be0d
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: nl-NL (nl_NL); UI: nl-NL
Calc: CL
Comment 13 Christian Lehmann 2020-09-03 07:41:29 UTC
1) Finding no issue with a document load time of 90 seconds for which MS Word takes five seconds implies that LO Writer is not intended as a serious alternative to MS Word. A pity.
2) If you cannot make the document available for editing before pagination has finished, then don't display it on the screen before this point in time, since this only prolongs the process.
3) The obvious conclusion seems to be that much of the work to be done on loading a document should run in background threads in order not to keep the user waiting.
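
Point 2 can be illustrated with a toy model. This is a sketch only and not LibreOffice code: the class `ToyView` and the function `load` are hypothetical names, and the model simply assumes that each layout pass during loading triggers one screen update unless updates are deferred.

```python
class ToyView:
    """Hypothetical stand-in for a document view; counts how often it repaints."""

    def __init__(self):
        self.paint_count = 0
        self.updates_locked = False
        self.dirty = False

    def invalidate(self):
        # A layout pass changed something that is visible on screen.
        if self.updates_locked:
            self.dirty = True  # remember the change, but do not paint yet
        else:
            self.paint()

    def paint(self):
        self.paint_count += 1
        self.dirty = False

    def lock_updates(self):
        self.updates_locked = True

    def unlock_updates(self):
        self.updates_locked = False
        if self.dirty:
            self.paint()  # a single repaint covers all deferred changes


def load(view, layout_passes, defer_painting):
    """Simulate loading: each layout pass invalidates the view once."""
    if defer_painting:
        view.lock_updates()
    for _ in range(layout_passes):
        view.invalidate()
    if defer_painting:
        view.unlock_updates()
    return view.paint_count
```

With six layout passes, `load(ToyView(), 6, defer_painting=False)` returns 6 repaints, while `load(ToyView(), 6, defer_painting=True)` returns 1. The model says nothing about how long the layout passes themselves take; it only shows that intermediate states need not reach the screen.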
Comment 14 Buovjaga 2020-09-03 13:12:04 UTC
(In reply to Christian Lehmann from comment #13)
> 1) Finding no issue with a document load time of 90 seconds for which MS
> Word takes five seconds implies that LO Writer is not intended as a serious
> alternative to MS Word. A pity.

You gravely misunderstood: Telesto meant that he was not seeing a performance problem.

Like I mentioned in comment 9, the performance characteristics of the sanitised document might not match your original one. How does the sanitised attachment 148801 perform for you? I only see 7 seconds of CPU maxing out.

Arch Linux 64-bit
Version: 7.1.0.0.alpha0+
Build ID: 2db30aa0206ca3d9d5a665d550820d8fcbcff4b9
CPU threads: 8; OS: Linux 5.8; UI render: default; VCL: kf5
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: threaded
Built on 3 September 2020
Comment 15 Christian Lehmann 2020-09-03 14:15:31 UTC
Last year's sanitized document without comments takes less than 10 seconds to load in my installation. It is therefore confirmed that large numbers of comments are part of the problem.
Again, there is no reason why a document that is being loaded can be presented for editing only after all the comments have been processed. This can easily be done in the background afterwards.
I attach the latest version of the same document by the name 'large_test_file.odt'. I left the comments in it since these are part of the problem. You may delete them after having confirmed the problem. I assume there are no malicious people among LO developers; but I cannot guess what somebody may do with these comments.
Comment 16 Christian Lehmann 2020-09-03 14:17:25 UTC
Created attachment 165096
odt file of 1.1 MB with many comments to test load time and LO freezing

This is an up-to-date version of the earlier attachment.
Comment 17 Telesto 2020-09-03 18:40:41 UTC
Memory usage is creeping up, for example in cases like:
CTRL+A
Backspace
CTRL+Z

and when pressing CTRL+ENTER every 5 pages or so.

This assumes Automatic Spell Checking is enabled.


So if you're editing this file for a while, it could get pretty bad.
It reached 800 MB pretty quickly. With 6.1 it was only 390 MB.
Memory-wise, 6.1 performs far better than 7.1. However, I'm comparing x32 with x64 builds, and 7.1/7.0 has, as far as I can see, a (known) fontbox memory leak. So I have no clue whether there is a single issue or several masking each other.

About perf in general: the comments appear to have the most impact, but that's a long-standing known issue :-(. Combined with a large file, which needs resources by itself, it becomes less than optimal.

The increasing memory use will surely decrease performance over time (likely within a few hours).

One solution, I think, is splitting the file up and using a Master Document. LibreOffice can take 300-350 pages as a single document; beyond that it often gets problematic (crashes, strange behavior, slowdowns, etc.). But it is also a matter of content, document complexity, features used, etc. Comments surely have a negative impact (as said before).
Comment 18 Christian Lehmann 2020-09-03 20:00:27 UTC
I did check the option of a master document. It is useful if you have a volume whose chapters are mutually independent, like a collective volume composed of independent articles or a novel where you can write chapter 6 when you don't need to touch chapter 5 any more. My file which I uploaded is an intertwined text where a passage of chapter 11 may have to refer back to just any of the earlier chapters, and vice versa. Composing it presupposes that all the parts of the master document are loaded into it; but then you don't need the master document device in the first place.
Comment 19 Christian Lehmann 2021-09-22 08:07:26 UTC
The original issue (repeated screen rewriting while loading a document) is resolved in version 7.1.5.2.
A related issue was that loading a large document paused after counting up to page number 498 or so and only after a pause continued counting up to the end. This issue is now resolved, too. The document is now first loaded, the total page number shown and then made available for editing.
Resolved.