Bug 155326 - Calc with gtk3 VCL is unusable with large spreadsheets: huge lag, high CPU usage, etc.
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 7.1.0.0.alpha1+
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, perf
Depends on:
Blocks: Performance
Reported: 2023-05-15 13:52 UTC by q_user
Modified: 2023-07-17 13:16 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Example file with lag. (175.56 KB, application/vnd.oasis.opendocument.spreadsheet)
2023-06-07 14:39 UTC, q_user
recorded perf data with example calc file (250.78 KB, application/gzip)
2023-06-25 04:32 UTC, q_user

Description q_user 2023-05-15 13:52:51 UTC
Description:
Working on large (but not "crazy large") spreadsheets is *impossible* with the gtk3 VCL plugin. Even on recent hardware, selecting cells takes up to a few seconds and the refresh rate when scrolling is abysmal (e.g. 1 row per second): rows continue to flash up and down for up to a few seconds after I have stopped scrolling.

Neither 'gen/x11' nor 'kf5' have that issue so there's clearly an issue with gtk3.

Installing successive LO versions from the LO download archive shows that the bug was introduced in 7.1.0.0.alpha1 (7.0.6.2 is the last version that worked well).

Steps to Reproduce:
1. open a large spreadsheet with Calc and the gtk3 VCL
2. scroll (mouse wheel, keys, ...)
3. drink a coffee while rows continue to flash up/down long after the user has stopped interacting with Calc :) I'm kidding, but it really takes up to a few seconds to update.

Actual Results:
Lag.

Expected Results:
No lag.


Reproducible: Always


User Profile Reset: Yes

Additional Info:
OS: Qubes OS R4.1 [1]: XEN virtual machines with non-accelerated graphics (llvmpipe) on Xorg.

Issue reported by users of the Qubes OS forum [2]

LO 7.0.6.2 works well, 7.1.0.0.alpha1 doesn't.

no luck with safe mode, turning acceleration on/off, resetting the user profile, etc.

neither 'gen' nor 'kf5' vcl plugins have that issue (so a workaround is to use them instead of gtk3 but gen is ugly on my hw and kf5 has a few glitches).
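
For anyone who wants to compare plugins or try the workaround, the VCL plugin can be chosen per launch through the `SAL_USE_VCLPLUGIN` environment variable. A minimal sketch, assuming `soffice` is on `PATH`; the `launch_calc` helper and the `SOFFICE` override variable are hypothetical names used only for illustration:

```shell
# Launch Calc with a specific VCL plugin (gtk3, kf5 or gen).
# SOFFICE is a hypothetical override for the binary to run;
# it defaults to whatever "soffice" resolves to on PATH.
launch_calc() {
    plugin="$1"
    file="$2"
    case "$plugin" in
        gtk3|kf5|gen) ;;
        *) echo "unsupported VCL plugin: $plugin" >&2; return 2 ;;
    esac
    # The variable assignment applies to this single invocation only.
    SAL_USE_VCLPLUGIN="$plugin" "${SOFFICE:-soffice}" --calc "$file"
}

# Example: launch_calc kf5 large.ods
```

Running the same file once with `gtk3` and once with `kf5` or `gen` should make the difference in scrolling behaviour easy to compare.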

Similar bug reports:

- https://bugs.documentfoundation.org/show_bug.cgi?id=152657
- https://bugs.documentfoundation.org/show_bug.cgi?id=144033
- https://bugs.documentfoundation.org/show_bug.cgi?id=145631

I'm filing this bug because the bugs above specifically mention nvidia as the likely culprit, but they could actually be duplicates. Either way it seems to be an issue with graphics acceleration - or lack thereof. I initially thought the issue could be because of a Mesa update but testing with version 20.x up to 23.x on fedora 36, 37, debian stable, unstable, showed it wasn't the case [3].


[1] https://www.qubes-os.org/
[2] https://forum.qubes-os.org/t/100-cpu-with-every-scroll-in-libreoffice/8027/
[3] https://forum.qubes-os.org/t/100-cpu-with-every-scroll-in-libreoffice/8027/58 (toggled content)
Comment 1 Stéphane Guillou (stragu) 2023-05-15 14:19:43 UTC
Thank you for the report.

The first version affected makes it sound similar to bug 152657.
Can you please test with the formula bar hidden, and see if it changes anything? View > Formula bar.
Comment 2 q_user 2023-05-15 14:52:42 UTC
Hmm, I somehow overlooked your post mentioning the formula bar in #152657 - sorry about that.

Hiding the bar produces mixed results:

- on fedora 37, LO 7.4.6.2 (shipped by fedora) - it's worse: scrolling totally freezes Calc for a few seconds (Xorg @100% cpu usage - on a 2 VCPU virtual machine)

- on fedora 37, LO 7.1.0.0-alpha1 (from LO dl archive) - OK (not super smooth but much better).
Comment 3 Julien Nabet 2023-05-15 16:01:06 UTC
First I thought about Skia (see https://wiki.documentfoundation.org/QA/FirstSteps#Graphics-related_issues_(_Skia_)) but then you're talking about gtk, so may be related to accessibility.
Do you use any accessibility tool?

In your last comment, you talked about 7.1.0.0-alpha1, did you mean 7.6.0.0 since it's the last dev version?
Comment 4 q_user 2023-05-15 17:18:33 UTC
(In reply to Julien Nabet from comment #3)
> First I thought about Skia (see
> https://wiki.documentfoundation.org/QA/FirstSteps#Graphics-
> related_issues_(_Skia_)) but then you're talking about gtk, so may be
> related to accessibility.
> Do you use any accessibility tool?

No, the VM template is really streamlined - just enough packages/functionality to run LO, so no accessibility stuff.

> In your last comment, you talked about 7.1.0.0-alpha1, did you mean 7.6.0.0
> since it's the last dev version?

No, I really meant 7.1.0.0-alpha1. 7.0.6.2 was the latest version that worked well. In the download archive [1] 7.1.0.0-alpha1 is the next version, where the lag with gtk3 became noticeable.


[1] https://downloadarchive.documentfoundation.org/libreoffice/old/
Comment 5 q_user 2023-05-15 17:44:15 UTC
FWIW Calc is even slower in 7.6.0.0.alpha1: it takes 50 seconds (!) to "stabilize" after the down arrow key was pressed for 5 seconds.
Comment 6 Julien Nabet 2023-05-15 17:50:55 UTC
Just to be sure it's not Skia related, could you test https://wiki.documentfoundation.org/QA/FirstSteps#Graphics-related_issues_(_Skia_) ?
Comment 7 q_user 2023-05-16 04:20:17 UTC
I've tested it - no change.

(UI render = default in help->about, with or without SAL_SKIA=raster)
Comment 8 Julien Nabet 2023-05-16 17:10:46 UTC
Thank you for your feedback.
I can't help here=>uncc myself.
Comment 9 Telesto 2023-05-18 12:40:27 UTC
@q_user,
Are you using a 4k monitor?
Comment 10 q_user 2023-05-18 12:42:18 UTC
(In reply to Telesto from comment #9)
> @q_user,
> Are you using a 4k monitor?

no, I have a 1900x1200 monitor...
Comment 11 Telesto 2023-05-18 15:07:44 UTC
(In reply to q_user from comment #10)
> (In reply to Telesto from comment #9)
> > @q_user,
> > Are you using a 4k monitor?
> 
> no, I have a 1900x1200 monitor...

Thanks for the quick reply! I assumed bug 154602 might be related. However, 4k is often mentioned in those cases.
Comment 12 q_user 2023-05-18 16:20:29 UTC
(In reply to Telesto from comment #11)
> I assumed, maybe bug 154602 being related.
> However 4k is often mentioned in those cases.

Indeed it seems to be the same issue - but without a high dpi monitor.

It's quite a coincidence though, so those bugs could very well have the same root cause: e.g. while it could take a 4k monitor with proper graphics acceleration for people to notice the lag, the effect could be noticeable with a standard HD monitor and *no* acceleration (my case + the other bugs I've linked to).

When faced with this issue, non-technical people usually either switch OSes and/or office suites (and usually come away with quite a negative overall impression), or say nothing and think that it's normal. User @phl summed it up quite well in [1]...
Anyway, I'd be happy to do whatever tests you can think of to find the root cause.

[1] https://forum.qubes-os.org/t/100-cpu-with-every-scroll-in-libreoffice/8027/51
Comment 13 q_user 2023-06-04 06:48:02 UTC
Hey guys,

I realize it's not very polite to "bump" bug reports - but I'm doing so as it's not just a one-off thing, there's a community of 40,000+ Qubes OS users [1].

I'm obviously not writing on behalf of all those users, but I'm pretty sure they:

- haven't noticed the issue because they don't use Calc, or they use small spreadsheets with recent hardware where the issue isn't too problematic

- or, they have found the instructions for the workaround and fixed the issue by switching to KF5 (/KDE) or gen [2].

- or, they haven't encountered the bug *yet*: Qubes OS users can choose between Fedora (38 at the time of writing) or Debian 11 stable templates. The version of LO in the Debian template is older than the first version that introduced the bug, but eventually the template will be updated to a more recent version of Debian and then everybody will have a "broken" LO version.

- or, they have switched to another distribution after finding out it was impossible to work with LO (no idea about how many users this amounts to).

So - is there any way to help with testing? If LO resources are stretched too thin and there isn't much you can do, I would file an issue in Qubes OS so that the KF5/gen workaround is the default in the official templates (the policy is usually to avoid making changes to the distribution). Again, just to be clear - I'm not complaining nor pushing you - I'm just trying to find a long-term solution - whether that's integrating the workaround in Qubes OS templates, or fixing the issue upstream - in LO.

Thanks !

[1] https://www.qubes-os.org/statistics/ 
[2] https://forum.qubes-os.org/t/100-cpu-with-every-scroll-in-libreoffice/8027
Comment 14 itagure 2023-06-06 19:11:07 UTC
Hallo,
I was able to reproduce the observed behavior with a worksheet of 100 columns and 50000 rows (only static numbers).

When scrolling down with "page down" from the first row, there is a lag of about 1 second between the point in time when you stop pressing "page down" and the moment the scrolling actually stops.
The lag is smaller if the formula bar is not visible, but it is still noticeable.

The lag is noticeably smaller if you try to scroll down from a cell that is near the bottom of the 50k rows or if you want to scroll up starting from a cell near the top of the worksheet (let's say about 5k rows or less in either direction).
Therefore it seems that the lag is related to the number of rows "remaining" in the direction of scrolling.

Scrolling works without noticeable lags when using the scroll-bar.
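
A comparable test sheet can be generated rather than shared. The sketch below writes a CSV of static numbers with the dimensions mentioned above (50000 rows by 100 columns), which Calc can open and save as ODS. The `make_sheet` name and the cell value `row * column` are arbitrary choices for illustration, not what was used for the test in this comment:

```shell
# Print a CSV of static numbers: one line per row, "cols" values per line.
# Defaults match the sheet size described above.
make_sheet() {
    rows="${1:-50000}"
    cols="${2:-100}"
    awk -v rows="$rows" -v cols="$cols" 'BEGIN {
        for (r = 1; r <= rows; r++) {
            for (c = 1; c <= cols; c++)
                printf "%s%d", (c > 1 ? "," : ""), r * c
            printf "\n"
        }
    }'
}

# Example: make_sheet 50000 100 > large.csv
```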



my setup is:

Version: 7.5.3.2 (X86_64)
Build ID: 50(Build:2)
CPU threads: 12; OS: Linux 6.3; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

Monitor: 2560x1440 (100% scale)
Fedora 38
Intel Integrated Graphics
Comment 15 Stéphane Guillou (stragu) 2023-06-07 10:30:52 UTC
q_user, it would be great if you could bisect to a precise commit with the 7.1 bibisect repository. Instructions are here: https://wiki.documentfoundation.org/QA/Bibisect/Linux

Please also provide an example file so we know we are testing in similar conditions.
Comment 16 q_user 2023-06-07 14:04:30 UTC
Hi Stephane,

Thank you for looking into this.

I followed the bibisect instructions. `git checkout oldest` returned `error: pathspec 'oldest' did not match any file(s) known to git`, so I used the first commit instead like so: `git bisect start origin/master 36741205b2e1c9e51d58dff4d0b4ce9022013411`

Here's the result; I didn't understand what you meant with "example file" by the way - just let me know what I have to attach/paste and how I can help further...

 dc9b09a432399053d2e161059784484250f71620 is the first bad commit
commit dc9b09a432399053d2e161059784484250f71620
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Fri Oct 16 17:12:49 2020 +0200

    source e087e25f05e689091cbf1c4f91b6e93878ac17ec

    source e087e25f05e689091cbf1c4f91b6e93878ac17ec

 instdir/program/libeditenglo.so                    | Bin 3214144 -> 3214208 bytes
 instdir/program/libsclo.so                         | Bin 22072480 -> 22040912 bytes
 instdir/program/setuprc                            |   2 +-
 instdir/program/versionrc                          |   2 +-
 instdir/share/config/images_breeze.zip             | Bin 1886356 -> 1886356 bytes
 instdir/share/config/images_breeze_dark.zip        | Bin 1882021 -> 1882021 bytes
 instdir/share/config/images_breeze_dark_svg.zip    | Bin 1565375 -> 1565375 bytes
 instdir/share/config/images_breeze_svg.zip         | Bin 1562898 -> 1562898 bytes
 instdir/share/config/images_colibre.zip            | Bin 2771159 -> 2771159 bytes
 instdir/share/config/images_colibre_svg.zip        | Bin 2864526 -> 2864526 bytes
 instdir/share/config/images_elementary.zip         | Bin 4065209 -> 4065209 bytes
 instdir/share/config/images_elementary_svg.zip     | Bin 5060844 -> 5060844 bytes
 instdir/share/config/images_karasa_jaga.zip        | Bin 4882332 -> 4882332 bytes
 instdir/share/config/images_karasa_jaga_svg.zip    | Bin 19311288 -> 19311288 bytes
 instdir/share/config/images_sifr.zip               | Bin 2102888 -> 2102888 bytes
 instdir/share/config/images_sifr_dark.zip          | Bin 2104764 -> 2104764 bytes
 instdir/share/config/images_sifr_dark_svg.zip      | Bin 1754627 -> 1754627 bytes
 instdir/share/config/images_sifr_svg.zip           | Bin 1750762 -> 1750762 bytes
 instdir/share/config/images_sukapura.zip           | Bin 3041266 -> 3041266 bytes
 instdir/share/config/images_sukapura_svg.zip       | Bin 4346455 -> 4346455 bytes
 .../soffice.cfg/modules/scalc/ui/inputbar.ui       | 122 +++++++++++++++++++++
 21 files changed, 124 insertions(+), 2 deletions(-)
 create mode 100644 instdir/share/config/soffice.cfg/modules/scalc/ui/inputbar.ui
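
For readers who want to repeat the bisection, the procedure used in this comment can be sketched as below. It assumes a local clone of the 7.1 bibisect repository and uses the commit quoted above as the known-good end; the `bibisect_start` and `bibisect_try` helper names and the `large.ods` file name are hypothetical:

```shell
# Start bisecting between the first commit (good) and origin/master (bad).
# The default "good" hash is the one quoted in this comment.
bibisect_start() {
    repo="$1"
    good="${2:-36741205b2e1c9e51d58dff4d0b4ce9022013411}"
    git -C "$repo" bisect start origin/master "$good"
}

# Run the build checked out at the current bisect step with the gtk3 VCL.
# After testing scrolling by hand, report the verdict with
# "git bisect good" or "git bisect bad" inside the repository and repeat
# until git names the first bad commit.
bibisect_try() {
    repo="$1"
    file="$2"
    SAL_USE_VCLPLUGIN=gtk3 "$repo/instdir/program/soffice" --calc "$file"
}

# Example:
#   bibisect_start ~/bibisect-linux-64-7.1
#   bibisect_try ~/bibisect-linux-64-7.1 large.ods
```

The repository path in the example is a placeholder; use wherever the bibisect repository was cloned.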
Comment 17 Stéphane Guillou (stragu) 2023-06-07 14:30:34 UTC
Thank you. So you've bibisected the issue back to the same commit as for bug 152657:

commit e087e25f05e689091cbf1c4f91b6e93878ac17ec
author	Caolán McNamara <caolanm@redhat.com>	Mon Oct 05 14:19:05 2020 +0100
committer	Caolán McNamara <caolanm@redhat.com>	Fri Oct 16 12:54:14 2020 +0200
weld InputBar
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/104037

@Caolán: apart from testing a build with the CairoCommon.cxx line commented out as requested in bug 152657 comment 15, is there anything else we could provide to help figure out what is going on?

@q_user, regarding your comment:

(In reply to q_user from comment #2)
> Hiding the bar produces mixed results:
> 
> - on fedora 37, LO 7.4.6.2 (shipped by fedora) - it's worse: scrolling
> totally freezes Calc for a few seconds (Xorg @100% cpu usage - on a 2 VCPU
> virtual machine)
> 
> - on fedora 37, LO 7.1.0.0-alpha1 (from LO dl archive) - OK (not super
> smooth but much better).

... there might be more than that 7.1 regression to it.

Regarding the "example file" I requested, I was asking for an example ODS file in which the scrolling issue is evident (or the one you used when bibisecting).
You can attach it to this report with the "add an attachment" link.
Comment 18 q_user 2023-06-07 14:39:28 UTC
Created attachment 187766
Example file with lag.

(In reply to Stéphane Guillou (stragu) from comment #17)
> Thank you. So you've bibisected the issue back to the same commit as for bug
> 152657:

Yes - I saw it was the "weld inputbar" thing that IIRC you had mentioned.

> Regarding the "example file" I requested, I was asking for an example ODS
> file in which the scrolling issue is evident (or the one you used when
> bibisecting).
> You can attach it to this report with the "add an attachment" link.

Ah, got it. Attaching...
Comment 19 Caolán McNamara 2023-06-24 15:42:02 UTC
It will need someone who can reproduce it to debug it, the https://bugs.documentfoundation.org/show_bug.cgi?id=152657#c15 thing is probably still an option to test.

Following the "Performance debugging (perf)" section of https://wiki.documentfoundation.org/Development/How_to_debug could provide useful information.

Otherwise it's sadly just "it's slow on my computer" and that's not really actionable.
Comment 20 q_user 2023-06-25 04:32:41 UTC
Created attachment 188081
recorded perf data with example calc file

(In reply to Caolán McNamara from comment #19)
> It will need someone who can reproduce it to debug it, the
> https://bugs.documentfoundation.org/show_bug.cgi?id=152657#c15 thing is
> probably still an option to test.

No improvement at all with the line in question commented.

> Follow the "Performance debugging (perf)" of
> https://wiki.documentfoundation.org/Development/How_to_debug could provide
> useful information.

OK, so I've built LO like so:

./autogen.sh --enable-pch --enable-symbols

Hopefully those options are OK: "--enable-pch" was advised in the BuildingOnLinux doc to speed up the build, and the perf doc mentioned that it required only symbols, not a full debug build, hence only "--enable-symbols", without "--enable-dbgutil".

Then:
 
 - started Calc with the test .ods file attached earlier
 - perf record -g --pid=$(pidof soffice.bin)
 - pressed the down arrow key for 5 seconds
 - stopped recording once Calc stabilized/stopped scrolling (25 seconds).

total recording time: ~30s.
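
The recording steps above can be sketched as a small script. This assumes `perf` and `pidof` are installed and that a single `soffice.bin` process is running; the `record_scroll` helper name, the 30-second default and the output file name are illustrative assumptions, and bounding the recording with `sleep` is a substitute for stopping it by hand:

```shell
# Record call graphs of the running soffice.bin for a fixed time.
# Start the function, then hold the down arrow key in Calc.
record_scroll() {
    seconds="${1:-30}"
    out="${2:-perf.data}"
    pid=$(pidof soffice.bin 2>/dev/null) || pid=""
    if [ -z "$pid" ]; then
        echo "soffice.bin is not running" >&2
        return 1
    fi
    # -g records call graphs; "sleep" bounds the recording time.
    perf record -g --pid="$pid" -o "$out" -- sleep "$seconds"
}

# Example: record_scroll 30 perf.data
# Afterwards: perf report -i perf.data --stdio | head -n 50
```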

No problem doing more tests or rebuilding if the recorded data isn't conclusive - just let me know what you'd like me to do.
Comment 21 Stéphane Guillou (stragu) 2023-07-17 13:16:33 UTC
Thanks, q_user.

Caolán, is attachment 188081 useful at all?