Bug 132212 - Text flow not correct after loading a large document
Summary: Text flow not correct after loading a large document
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.0.0.5 release
Hardware: All All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: regression
Depends on:
Blocks: Anchor-and-Text-Wrap VCL-Scheduler
Reported: 2020-04-18 09:11 UTC by Thomas Neumann
Modified: 2020-06-05 23:24 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
image affected with image pop-up open showing the selected text - flow (39.02 KB, image/png)
2020-04-18 09:11 UTC, Thomas Neumann
The file which creates the most problems (6.14 MB, application/vnd.oasis.opendocument.text)
2020-05-07 12:58 UTC, Thomas Neumann
Bibisect log (2.44 KB, text/plain)
2020-05-10 15:57 UTC, Telesto
Follow up Bibisect (2.49 KB, text/plain)
2020-05-10 16:00 UTC, Telesto
Bibisect log (5.18 KB, text/plain)
2020-05-11 08:30 UTC, Telesto

Description Thomas Neumann 2020-04-18 09:11:26 UTC
Created attachment 159674
image affected with image pop-up open showing the selected text - flow

When I load larger documents, some images will cover the text. The text flow is set either to 'flow around' or 'no text flow'. The wrong flow also occurs in a PDF created from that document (so it's not a display problem).
Opening the picture's pop-up menu shows the text-flow setting set to the correct value. Clicking it again will fix the problem for this picture. A fixed problem will not recur as long as the document is not reloaded.

After saving the document, closing and restarting LibreOffice, and reloading the document, the error will occur again, but in most cases other images will be affected.
All images are PNG.

The number of images affected is below 1%, but the problem started occurring a long time ago (not sure in which version).

I attached a screenshot showing the problem (with German labels).

Tomy
Comment 1 Dieter 2020-05-07 11:33:19 UTC
Thomas, Thank you for reporting the bug. Could you please try to reproduce it with the latest version of LibreOffice from https://www.libreoffice.org/download/libreoffice-fresh/ ? I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' if the bug is still present in the latest version. Change to RESOLVED WORKSFORME, if the problem went away.

If the problem is still present in LO fresh, please also attach a sample document, as this makes it easier for us to verify the bug. Thanks.
Comment 2 Dieter 2020-05-07 12:34:46 UTC
Thomas, you've changed the status back to UNCONFIRMED without any explanation. Please add at least the information from Help > About LibreOffice. And, as I wrote before, add a sample document if possible or add some steps to reproduce.

=> NEEDINFO
Comment 3 Telesto 2020-05-07 12:48:03 UTC
A sample document would be really helpful.
Comment 4 Thomas Neumann 2020-05-07 12:58:52 UTC
Created attachment 160498
The file which creates the most problems
Comment 5 Thomas Neumann 2020-05-07 13:00:00 UTC
Version: 6.4.3.2 (x64)
Build ID: 747b5d0ebf89f41c860ec2a39efd7cb15b54f2d8
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: default; VCL: win;
Locale: de-DE (de_DE); UI-Language: de-DE
Calc: threaded

Well, I tried to find a pattern in the occurrence of the problem, but I was not able to find one.
Here is the information I was able to gather:

- The problem occurs on random pictures in the document.

- It occurs with both settings used for text flow (parallel and none).

- The menu indicates the correct text-flow setting.

- In about 80% of cases, simply clicking on the text-flow entry in the menu will fix the problem.

- If this does not work, moving the image a little fixes the text flow.

- Both ways fix the problem for all images on a page, although not necessarily all images on a page are affected.

- Deleting a whole page of the document fixed all text-flow problems, at least on the pages after the removed page.

- The problem occurs both in Writer and in the document exported as PDF.

- I attach the document that shows the most issues. As mentioned before, the problem occurs randomly. Last time I found a problem on page 207 ...
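
To cross-check the observation that the menu shows the correct text-flow setting, the wrap values stored in the file itself can be listed. The following is a minimal sketch, not part of the original report: it assumes the wrap mode is saved as the ODF `style:wrap` attribute on `style:graphic-properties` elements in `content.xml` or `styles.xml`. The function name `wrap_settings` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# ODF style namespace (from the OpenDocument specification).
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def wrap_settings(odt_source):
    """Count the style:wrap values stored in the graphic styles of an ODT file.

    odt_source may be a file path or a binary file-like object.
    """
    counts = Counter()
    with zipfile.ZipFile(odt_source) as odt:
        names = set(odt.namelist())
        for member in ("content.xml", "styles.xml"):
            if member not in names:
                continue
            root = ET.fromstring(odt.read(member))
            # Every graphic style carries its wrap mode as an attribute.
            for props in root.iter(f"{{{STYLE_NS}}}graphic-properties"):
                wrap = props.get(f"{{{STYLE_NS}}}wrap")
                if wrap is not None:
                    counts[wrap] += 1
    return counts
```

Calling `wrap_settings("document.odt")` (the file name is a placeholder) returns a counter of wrap values. If the stored values match what the menu shows, the file content itself is probably fine and the problem lies in how the layout is computed on load, which is consistent with re-applying the setting fixing it.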
Comment 6 Telesto 2020-05-08 07:39:47 UTC
Confirming the issue on page 207
Version: 6.4.3.2
Build ID: 747b5d0ebf89f41c860ec2a39efd7cb15b54f2d8
CPU threads: 4; OS: Mac OS X 10.12.6; UI render: default; VCL: osx; 
Locale: nl-NL (nl_NL.UTF-8); UI-Language: en-US
Calc: threaded
Comment 7 Telesto 2020-05-08 08:33:00 UTC
also in
Version: 6.1.6.3
Build ID: 5896ab1714085361c45cf540f76f60673dd96a72
CPU threads: 4; OS: Windows 6.3; UI render: default;
Locale: nl-NL (nl_NL); Calc: group threaded

and in
Version: 5.4.0.3
Build ID: 7556cbc6811c9d992f4064ab9287069087d7f62c
CPU threads: 4; OS: Windows 6.2; UI render: default;
Locale: nl-NL (nl_NL); Calc: group

and in
Version: 5.3.0.3
Build ID: 7074905676c47b82bbcfbea1aeefc84afe1c50e1
CPU Threads: 4; OS Version: Windows 6.2; UI Render: default; Layout Engine: old; 
Locale: nl-NL (nl_NL); Calc: CL

also in
Version: 5.0.2.2
Build ID: 37b43f919e4de5eeaca9b9755ed688758a8251fe-GL
Locale: en-US (nl_NL)

Appears to be good in
Version: 4.4.7.2
Build ID: f3153a8b245191196a4b6b9abd1d0da16eead600
Locale: nl_NL
Comment 8 Aron Budea 2020-05-08 21:06:33 UTC
Already buggy in oldest commit of bibisect-linux-64-5.2, but fine in latest commit of bibisect-50max. Needs to be bibisected in Windows.
Comment 9 Telesto 2020-05-08 21:25:20 UTC
(In reply to Aron Budea from comment #8)
> Already buggy in oldest commit of bibisect-linux-64-5.2, but fine in latest
> commit of bibisect-50max. Needs to be bibisected in Windows.

How did you work it out? It's quite unpredictable.. I could confirm it in the morning, but I'm not able to reproduce it anymore with the same 7.0 build.
Comment 10 Aron Budea 2020-05-09 07:43:52 UTC
(In reply to Telesto from comment #9)
> How did you work it out? It's quite unpredictable.. I could confirm it in the
> morning, but I'm not able to reproduce it anymore with the same 7.0 build.
I just checked the versions at hand around the ones you identified as first bad / last good (and they aren't 7.0 builds).
Comment 11 Telesto 2020-05-10 15:57:57 UTC
Created attachment 160607
Bibisect log

Bibisect first round, not the expected result
Comment 12 Telesto 2020-05-10 16:00:04 UTC
Created attachment 160608
Follow up Bibisect

I skipped one commit.. but this sounds very very plausible.. 

Bisected to
author    Jan-Marek Glogowski <glogow@fbihome.de>  2015-03-06 19:11:31 +0100
committer Jan-Marek Glogowski <glogow@fbihome.de>  2015-03-06 20:00:34 +0100
commit    9b4abcd1c45a646a1ac9120fe1c489ba6bb44e95
tree      049271c64e942242bee5ffb325497a6a8127780a
parent    acaafc03e623ac25d4408605f34d50618926c5d0
Little build fix to Windows ScRefreshTimer
For whatever reason "objdump.exe -t xicontent.o" on Windows now
lists Start@ScRefreshTimer@@EAEXXZ (probably because it's now
a virtual function). Also the reference to SetRefreshDelay
vanished after dropping the virtual keyword from SetRefreshDelay.

The linux object file doesn't even refer to a ScRefreshTimer
function. Start() can stay private, as the timer is handled via
SetRefreshDelay.

Probably Stop() should also become private in this case.

This also reverts 2c0189a8a3aeb3668bf6de1ea1958ba475b80a38
Comment 13 Telesto 2020-05-10 16:06:01 UTC
Adding CC: to Jan-Marek Glogowski

Can't decide between high or highest.. let's go for highest, critical
* This has a large impact on the layout (number of pages) 366 vs 477. 
* Text hidden behind shapes.. not only on screen but also on export
* Randomness of the issue
* It's a regression
Comment 15 Aron Budea 2020-05-11 01:37:32 UTC
You're right, this happens much more randomly than I thought at first, and it's indeed in 5.0.0.5 already.

(In reply to Telesto from comment #14)
> https://cgit.freedesktop.org/libreoffice/core/commit/
> ?id=9b4abcd1c45a646a1ac9120fe1c489ba6bb44e95
However, this can't be the right commit: the change is in sc, which is Calc-specific. Let's leave bibisectRequest up.

I've also reduced priority/severity, the layout is certainly bad, but if the bug hasn't been found for 5 years, it must be some rare corner case.
Comment 16 Telesto 2020-05-11 07:34:52 UTC
(In reply to Aron Budea from comment #15)
> I've also reduced priority/severity, the layout is certainly bad, but if the
> bug hasn't been found for 5 years, it must be some rare corner case.

Not so sure..

1. It's not only the images; it would also explain the page count difference on file open seen in this document. There are more documents like this, and I didn't understand why it happened.

I didn't bibisect on a single image, but had to look for any occurrence of the issue, so "bad" results are certain while "good" results are assumptions.

2. Page flow is an essential and tricky aspect.

3. Observe the document having a totally different layout between 4.4.7.2 and 7.0.

4. And the randomness.. Try to find a bug doc demonstrating this type of issue consistently enough to find a pattern.

A little downgrade until the commit is identified is OK, but surely this deserves more than medium major. Being around for 5 years is a rather horrific idea, especially if I'm right about the possible scope: that would mean 5 years of documents broken in an unpredictable, quirky way.
Comment 17 Telesto 2020-05-11 08:30:37 UTC
Created attachment 160648
Bibisect log

Tentative bibisect, based only on page numbers, not on overlapping text. However, the 366-page document shows this issue quite often.

And surprise surprise.. Still scheduler/timers.. Tobias Madl
Comment 18 Telesto 2020-05-11 08:35:57 UTC
@Jan-Marek. 

Adding again.. this is related to timer/scheduler anyhow.
* The bug doc has 366 pages after bunch of Timer/scheduler changes by Tobias Madl. Before 469/471
* There are cases of text behind images.. which likely correlate with number of pages (however didn't notice the issue in some cases; or didn't look good enough for them)..
Comment 19 Telesto 2020-05-11 08:38:16 UTC
Bumping priority slightly.. would like the timer/scheduler expert himself to take a look ;-)
Comment 20 Jan-Marek Glogowski 2020-05-12 18:14:42 UTC
(In reply to Telesto from comment #18)
> @Jan-Marek. 
> 
> Adding again.. this is related to timer/scheduler anyhow.

Maybe, but more likely Writer's idle layouter is to blame. It's probably not a problem of the scheduler per se.

> * The bug doc has 366 pages after bunch of Timer/scheduler changes by Tobias
> Madl. Before 469/471
> * There are cases of text behind images.. which likely correlate with number
> of pages (however didn't notice the issue in some cases; or didn't look good
> enough for them)..

So I opened the attached document. What I see is:

* Generally, on open, the TOC has the last entry as "Release information - 363".
* The TOC changes on TOC update (context menu -> "Update Index/Table").
* LO 4.4.7:
  * On open, LO claims the document has 467 pages
  * After TOC update: 371 pages; last TOC entry is "Release information - 366"
  * The layout will sometimes be updated by scrolling with PgUp / PgDn, if LO detects a wrong layout
* LO 7.0 / master a few days old:
  * On open, the document instantly shows as 367 pages
  * After TOC update: still 367 pages; last TOC entry is "Release information - 362"

So we have 371 vs 367, which is IMHO still a lot, but currently I would just attribute it to bug fixes and better layout, possibly from the switch to HarfBuzz text rendering or our own fixes.

Some observations:
* A particularly bad page here is 23. In 4.4 it contains just a single line, which moves to page 22 in 7.0.
* I also saw some overlapping image problems in 4.4, even after triggering the TOC update. But generally they are hard to spot here.

So currently my conclusion is that 4.4 is more buggy than current master w.r.t. this document. The initial scheduler rewrite might even have fixed some stuff. Actually I don't see a problem here with master.
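
When comparing page counts like these across versions, it can help to also look at the count the file itself carries. The sketch below is an illustration only: it reads the `meta:page-count` attribute of `meta:document-statistic` in `meta.xml`, which is the value recorded at save time. Whether the number shown directly on open comes from this stored value is an assumption, not something confirmed in this report. The function name `stored_page_count` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF metadata namespace (from the OpenDocument specification).
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def stored_page_count(odt_source):
    """Return the page count recorded in meta.xml at save time, or None.

    odt_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(odt_source) as odt:
        root = ET.fromstring(odt.read("meta.xml"))
    stat = root.find(f".//{{{META_NS}}}document-statistic")
    if stat is None:
        return None
    value = stat.get(f"{{{META_NS}}}page-count")
    return int(value) if value is not None else None
```

A call such as `stored_page_count("document.odt")` (placeholder file name) gives the saved count, which can then be compared with the status bar value on open and again after a TOC update.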

Possibly related bug: there was bug 123257 / bug 119748 / bug 131707, which is hopefully finally fixed in master since 2020-04-03. It originally looked like a scheduler bug, but turned out to be a Writer layouter bug. The exact triggering condition was indeed the flow setting of the images, which was ignored in some obscure circumstances that I still don't fully understand. The default setting in the code is flow-through, so that's why the bug manifested (actually that is good, as these bugs would go unnoticed otherwise, I guess).

BTW: the fix is small, but since 6.4 is almost still and layout changes tend to regress, no backport was done. 

P.S. it still might be a scheduler bug on Windows, since I've done all this on Linux.
Comment 21 Telesto 2020-05-12 19:25:38 UTC
@Jan-Marek
Thanks for all the insights! 

1. How to get final layout with 4.4.7.2. Didn't think of updating the TOC
2. The default setting in the code is flow-through, so that's why the bug manifested
-> Ah, that's a really helpful insight.. I even have a bug doc set up demonstrating this behavior ;-) Bug 132415 (unconfirmed) appears to be caused by anchoring to character [the default for images nowadays]. Wrap optimal/left won't work. Also found in 6.4.3, in 4.4.7.2 and in 3.5.7.2 [only a subtle hint; feel free to ignore :-)]

And for the record, I looked at the bug doc on master again.. can't find any issue using 7.0 alpha 1. So - assuming you did this - thanks for solving the issue ;-)
Comment 22 Xisco Faulí 2020-05-14 08:10:12 UTC
(In reply to Telesto from comment #21)
> And for the record, I looked at the bug doc on master again.. can't find any
> issue using 7.0 alpha 1. So - assuming you did this - thanks for solving
> the issue ;-)

Let's close it as RESOLVED WORKSFORME then