Bug 137139 - Crashes in Kubuntu with 6.4.6: ScSelectionTransferObj:com::sun::star::uno::Reference:Qt5MimeData
Status: RESOLVED DUPLICATE of bug 140700
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 6.4.6.2 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: KDE, KF5
Reported: 2020-09-29 19:14 UTC by Heather Ellsworth
Modified: 2021-03-08 09:47 UTC

See Also:
Crash report or crash signature:


Attachments
rhbz#1847031 cleaned crash bt (19.88 KB, text/plain)
2020-09-30 12:33 UTC, Jan-Marek Glogowski
Kubuntu shutdown crash bt (13.50 KB, text/plain)
2020-09-30 13:10 UTC, Jan-Marek Glogowski

Description Heather Ellsworth 2020-09-29 19:14:21 UTC
Description:
In Kubuntu 20.04, the following stacktrace and thread stacktrace (reported anonymously in the background) have shown up occasionally in 6.4.3, 6.4.4, and 6.4.5, though not very often (~40 reports per release).

However, there has been a very large increase in the same error being reported (~17k) since users have been updating to 6.4.6.

Could someone please take a look at the following stacktrace and threadtrace outputs? It's great to get these errors reported anonymously to help find real issues, but the downside is I have no idea what the user was doing when this crash occurred.

Stacktrace: https://paste.ubuntu.com/p/Dgw89Hfpqb/
Thread Stacktrace: https://paste.ubuntu.com/p/nVvmPjxZm2/ 


There is also an associated launchpad bug: https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/1897784

Steps to Reproduce:
Unsure what the user is doing when the crash occurs.

Actual Results:
It looks like LibreOffice crashes because of this at the top of the stacktrace:
#1  0x00007fc9721f1a9a in KCrash::defaultCrashHandler (sig=11) at ./src/kcrash.cpp:582
        crashRecursionCounter = 2

Expected Results:
LibreOffice should not crash.


Reproducible: Always


User Profile Reset: No



Additional Info:
version is: 6.4.6-0ubuntu0.20.04.1 which is just repackaged 6.4.6.2
Comment 1 Julien Nabet 2020-09-29 19:27:29 UTC
Michael: thought you might be interested in this one. Perhaps it's already fixed in master or the 7.0 branch and just needs some backports for 6.4.7. Of course, only if we still have the time, considering https://wiki.documentfoundation.org/ReleasePlan/6.4...
Comment 2 Michael Weghorn 2020-09-30 06:47:11 UTC
The only relatively recent commit related to QMimeData handling (i.e. mostly copy & paste and drag'n'drop) is the fix for tdf#131533, but that fix is in 6.4.4 already.

I'm wondering whether the increase of reported crashes is actually due to some change in LibreOffice qt5/kf5 code or rather the side-effect of something completely different (like more users having installed Kubuntu 20.04 since then, or some completely different commit that changed timing in some place, ...).

@jmux: Any ideas?
Comment 3 Jan-Marek Glogowski 2020-09-30 11:23:27 UTC
No. (In reply to Michael Weghorn from comment #2)
> The only relatively recent commit related to QMimeData handling (i.e. mostly
> copy & paste and drag'n'drop) is the fix for tdf#131533, but that fix is in
> 6.4.4 already.
> 
> @jmux: Any ideas?

We know the fix for tdf#131533 is rather fishy and more of a workaround than a real fix. But if we have the same backtraces before 6.4.4 (and I checked that libreoffice_6.3.3-0ubuntu1.debian.tar.xz doesn't carry that fix), we can - kind of - rule that out. Also, all the backtraces just indicate Calc (ScSelectionTransferObj), not any other LO modules (at least Writer), which is strange. Maybe an active selection in Calc is just more common on shutdown than in other applications? OTOH a copy in Calc keeps the selection, even if you move the "cursor rectangle". There is also nothing new and suspicious on https://crashreport.libreoffice.org/stats/version/6.4.6.2

And the only "fix" in vcl/qt5 between 6.4.2 and 6.4.3 is commit 1000169ebca79478a05b4c23e760d99bd77e739e ("Qt5 unify font attribute conversions"), which I would also rule out as the origin of any bug (famous last words).

There is also no other fix in vcl/ or sc/ since commit cbac26c52ccbe59c51c6631cb8c4b0a314a9848a / 6.4.0 that looks at all suspicious w.r.t. the backtrace at first glance. And the whole lazy clipboard stuff was already in 6.3. And your fix for tdf#129809 was already in 6.4.2.

Still, nothing would explain the current peak I can see on https://errors.ubuntu.com since https://errors.ubuntu.com/?release=Ubuntu%2020.04&package=libreoffice-core&from=2020-08-24&to=2020-09-25, which matches the publishing date from https://launchpad.net/ubuntu/+source/libreoffice/1:6.4.6-0ubuntu0.20.04.1 .

So I checked the diff of the 6.4.6 fix range for sc and interestingly it contains commit 8718d243edc9400b0e0131b096702af8d33df327 (rhbz#1847031 null-deref), which fixes a null-deref in ~ScTransferObj. That should really just fix a bug, not cause one, but possibly it just papers over some real bug in the other fixes, which now hits Qt in some way, because ScTransferObj is part of libsclo, which can't exist with "pScMod == nullptr".

$ git log --pretty=oneline f2e448175cee92fc695413e7281223e9f23e30ee~1..origin/libreoffice-6-4-6 | wc -l
123

Now I really would like to have a reproducer, which should possibly also reproduce on master... and tdf#130559 is the only additional thing in sc that looks strange, but that's just 17 out of the 123 patches.

Someone from Kubuntu could "mine" the crash reports to see if there is anything reproducible in them. And I have to get the RHEL bug report from someone (possibly Caolan).
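
As a rough sketch of what "mining" the crash reports could look like: assuming the backtraces are available as plain text in the usual gdb frame format (like the `#1  0x... in KCrash::defaultCrashHandler (sig=11) at ./src/kcrash.cpp:582` line quoted in the description), one could extract the function name of each frame and count in how many reports each function shows up. The helper names `frame_functions` and `count_functions` are made up for this illustration and are not part of any existing tool.

```python
import re
from collections import Counter

# Matches gdb frame lines such as:
#   #1  0x00007fc9721f1a9a in KCrash::defaultCrashHandler (sig=11) at ./src/kcrash.cpp:582
# Assumption: function names contain no whitespace; templated names that do
# would be cut at the first space.
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+"            # frame number
    r"(?:0x[0-9a-fA-F]+\s+in\s+)?"    # optional address followed by "in"
    r"(?P<func>[^\s(]+)"              # function name up to whitespace or '('
)


def frame_functions(backtrace):
    """Return the function name of every frame line in a gdb backtrace."""
    funcs = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            funcs.append(match.group("func"))
    return funcs


def count_functions(backtraces):
    """Count in how many backtraces each function appears at least once."""
    counts = Counter()
    for bt in backtraces:
        counts.update(set(frame_functions(bt)))
    return counts
```

Sorting the result with `count_functions(reports).most_common()` would then show which LO frames dominate, which might hint at a common user action.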

So while I first suspected something non-LO, it now smells like an LO bug, possibly independent of VCL.
Comment 4 Jan-Marek Glogowski 2020-09-30 12:33:07 UTC
Created attachment 165976
rhbz#1847031 cleaned crash bt

That whole bt looks broken. RH has no reproducer either, so the fix is just a guess. Version was libreoffice-6.4.4.2-2.fc32.x86_64, as you can see in the bt paths.

The bt itself looks "wrong". The user did a right-click on a Calc cell, that would open the context menu, but at that point, when the Gtk event is processed, a lot more stuff is already "gone", like all the aGuard objects have nullptr, so no mutex in them. So I guess this is just for reference, but won't help.
Comment 5 Jan-Marek Glogowski 2020-09-30 13:10:16 UTC
Created attachment 165978
Kubuntu shutdown crash bt

It appears that one can just DL the text with a Launchpad login. So this is just a copy from the BT as reference. For whatever reason the clipboard object is still active, while the module is already gone, which is causing the crash.
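
The failure mode described here can be modelled in a few lines. This is only a toy sketch in Python, not LibreOffice code: `Module`, `TransferObject` and `Clipboard` are hypothetical stand-ins for the Calc module, the transfer object and whatever still holds it at shutdown. It only illustrates why releasing the clipboard content after the module is gone fails, while releasing it first does not.

```python
class Module:
    """Stand-in for a module singleton that owns shared state."""

    def __init__(self):
        self.selection_owner = None


class TransferObject:
    """Stand-in for an object that must talk to its module when released."""

    def __init__(self, registry):
        self.registry = registry  # dict simulating the global module slot
        registry["module"].selection_owner = self

    def release(self):
        module = self.registry["module"]  # None once the module is torn down
        # Roughly analogous to dereferencing a null module pointer in C++.
        module.selection_owner = None


class Clipboard:
    """Stand-in for a system clipboard that keeps its content alive."""

    def __init__(self):
        self.content = None

    def clear(self):
        content, self.content = self.content, None
        if content is not None:
            content.release()


def shutdown(clipboard_first):
    """Tear down module and clipboard in the given order; report the outcome."""
    registry = {"module": Module()}
    clipboard = Clipboard()
    clipboard.content = TransferObject(registry)
    try:
        if clipboard_first:
            clipboard.clear()
            registry["module"] = None
        else:
            registry["module"] = None
            clipboard.clear()
    except AttributeError:
        return "crash"
    return "ok"
```

In this model `shutdown(clipboard_first=True)` returns `"ok"` and `shutdown(clipboard_first=False)` returns `"crash"`. A null check in `release()` would silence the second case without fixing the ordering.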

Another question is what other packages were published that day, just in case it's not an LO bug after all (I still think it is).
Comment 6 Heather Ellsworth 2020-09-30 17:00:30 UTC
It's hard to pin down what other new packages could have made it onto the users' systems because of the way Ubuntu rolls out releases in a "phased" manner:

https://wiki.ubuntu.com/StableReleaseUpdates#Phasing

This is happening simultaneously for all sorts of packages too, at various stages in the phasing process. So unfortunately, there's no way to really get a sense of other new packages that might be causing the issue.
Comment 7 Jan-Marek Glogowski 2020-10-05 10:15:26 UTC
FWIW: I have installed Ubuntu's LO 6.4.6 in my focal schroot. I'm running Debian Buster with KDE on the host in X11. I couldn't produce any crash with the LO in the chroot, doing Calc selections and copy and paste operations, D'n'D and also some external copy actions.
Comment 8 Olivier Tilloy 2020-10-21 21:15:19 UTC
This bug looks similar to https://bugs.documentfoundation.org/show_bug.cgi?id=131083, which the reporter could initially reproduce quite reliably, but ended up closing after they couldn't reproduce any longer. This was originally reported against 6.4.0.3, and wasn't observed in 7.0.0.0.alpha0+.

In the errors.ubuntu.com tracker, the earliest version exhibiting the crash is 6.4.3, and there are new reports daily with 6.4.6.
Comment 9 Michael Weghorn 2020-10-22 06:37:11 UTC
(In reply to Olivier Tilloy from comment #8)
> This bug looks similar to
> https://bugs.documentfoundation.org/show_bug.cgi?id=131083, which the
> reporter could initially reproduce quite reliably, but ended up closing
> after they couldn't reproduce any longer. This was originally reported
> against 6.4.0.3, and wasn't observed in 7.0.0.0.alpha0+.
> 
> In the errors.ubuntu.com tracker, the earliest version exhibiting the crash
> is 6.4.3, and there are new reports daily with 6.4.6.

Indeed. The backtrace there seems to be comparable, see tdf#131083 comment 22 (and the full valgrind log, attachment 158370).

According to tdf#131083 comment 26, the reporter was able to reproduce more or less reliably in his initial installation, but not in a fresh one afterwards, suggesting that it *might* have been some issue with the installation.

I'm afraid there is probably little we can do without knowing how to reproduce or whether it even happens at all with newer LO versions...
Comment 10 Jan-Marek Glogowski 2020-10-22 07:54:10 UTC
(In reply to Olivier Tilloy from comment #8)
> This bug looks similar to
> https://bugs.documentfoundation.org/show_bug.cgi?id=131083

Nice find. Summary of bug 131083:

Happened reproducibly for the user with LO Ubuntu build 6.4.0-0ubuntu7 and LO Tinderbox build 45ca47ac39c03df4de52d627a764f16068b1eab0 (7.0.0.0.alpha0+ is reported for all development builds before branch off / alpha1 / beta) and just with kf5 in KDE / Plasma 2, not with Xubuntu (Xfce? which VCL in About? I would assume gtk3). Happens with the empty / default document.

Bug 131083 comment 13 states: "I'm testing both Kubuntu and Xubuntu in a VM environment (VirtualBox 6.1.4)."

You can get a comparable (but not equal!) build to the Tinderbox one via https://bibisect.libreoffice.org/linux-64-7.0 commit fc089b4dda133f1fd03922b713708e53af6c16fa.

Reproduce:

1. start Calc
2. mark F10/11 vertically
3. exit LO
4. choose not to save

Questions for each point (which might be related or not):

1. How is Calc started? Desktop link, command line (which one?), via start center? Does it happen with non-default profile via  -env:UserInstallation=file:///tmp/test?
2. How are the cells marked? Mouse only, keyboard? Which cell first? Does it crash with other cells?
3. Exit crash just happens via window decorations? Or also via menu? Or by using Alt+F4 or Ctrl+Q? Or also by closing the Calc document and then the start center?
4. Why does Calc think the document was modified?

Point 4 is AFAIK already exposing the real bug.

General question: any additional output in a terminal? New entries in ~/.xsession-errors?
Is this just happening with the localized KDE (the reporter uses de-DE.UTF-8)?
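
For the profile question in point 1, a small helper for starting Calc with a throwaway profile could look like the sketch below. It assumes that `soffice` is on the PATH and accepts `--calc`; the `-env:UserInstallation=` switch is the one quoted above. The function names are hypothetical, and the manual steps (mark the cells, exit, choose not to save) still have to be done by hand.

```python
import pathlib
import subprocess
import tempfile


def calc_command(profile_dir, binary="soffice"):
    """Build the command line for starting Calc with a separate user profile."""
    profile_uri = pathlib.Path(profile_dir).resolve().as_uri()
    return [binary, "--calc", f"-env:UserInstallation={profile_uri}"]


def run_calc_with_fresh_profile(binary="soffice"):
    """Start Calc with a new, empty profile and return the exit code."""
    with tempfile.TemporaryDirectory(prefix="lo-profile-") as profile_dir:
        # On POSIX, subprocess reports death by signal as a negative return
        # code. Whether the soffice wrapper passes a crash of the real
        # process through like that is an assumption not verified here.
        return subprocess.run(calc_command(profile_dir, binary)).returncode
```

Running this in a loop and logging the return codes next to `~/.xsession-errors` would at least show whether a default profile is enough to trigger the crash.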

Another possibly related crash I found (fixed?) is bug 131533 in LO 6.4.4. The implemented fix / patch is just a workaround, as I couldn't come up with a minimal reproducer without LO. There are some additional comments (see last ones) with info in https://gerrit.libreoffice.org/c/core/+/90990.

According to bug 131083 comment 15, an old LO profile doesn't seem to be the cause of this, as the crash could be reproduced after deleting the profile. Since it didn't crash on the first try, it still might somehow be related. It would still be nice to get a known crashing profile for testing / reference, in case there is some setting in it that leads to the crash.
Bug 131083 comment 26 describes the difficulty of reproducing the bug. "Made the installed DEB packages identical" I guess just means the package list, like "dpkg --get-selections", not the exact versions.

Another bug that just comes to mind, and which was actually fixed for 6.4.0, is bug 104717. That changes the selection / clipboard handling in Calc. Quoting from my commit message: "Calc also removes the system selection when clearing the selection. Other applications keep the primary selection valid, until the application or document closes, so do the same in Calc." So maybe this introduces some case where the primary selection isn't correctly cleared on shutdown. This is not KDE specific, but the QClipboard handling / API is very different from either Gtk+ or X11, so there might be some broken case now where we don't clean the selection on module shutdown.
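
To make that suspicion a bit more concrete, here is a toy state model of the two behaviours from the commit message. It is not LibreOffice code; `CalcSelectionModel` and its flags are invented for this sketch. With the old behaviour, clearing the cell selection also gives up the primary selection. With the new one, the primary selection is kept, so the close / shutdown path is the only place left that can release it.

```python
class CalcSelectionModel:
    """Toy model of who owns the primary selection; not LibreOffice code."""

    def __init__(self, keep_until_close):
        self.keep_until_close = keep_until_close
        self.cells_selected = False
        self.owns_primary = False

    def select_cells(self):
        self.cells_selected = True
        self.owns_primary = True

    def clear_cell_selection(self):
        self.cells_selected = False
        if not self.keep_until_close:
            # Old behaviour: "Calc also removes the system selection
            # when clearing the selection."
            self.owns_primary = False

    def close_document(self, releases_primary=True):
        self.cells_selected = False
        if releases_primary:
            self.owns_primary = False


def owns_primary_after_shutdown(keep_until_close, releases_primary):
    """Select, deselect, close; report whether the selection is still owned."""
    model = CalcSelectionModel(keep_until_close)
    model.select_cells()
    model.clear_cell_selection()
    model.close_document(releases_primary)
    return model.owns_primary
```

Only the combination `keep_until_close=True` with a close path that does not release the selection leaves a stale owner behind in this model, which is the kind of case that would need checking in the Qt5 code.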

Can you see if the Kubuntu crashes happen on KDE shutdown, i.e. are session-management related?

> In the errors.ubuntu.com tracker, the earliest version exhibiting the crash
> is 6.4.3, and there are new reports daily with 6.4.6.

The thing we definitely know is that the LO 1:6.4.6-0ubuntu0.20.04.1 build in combination with KDE makes the crash much more likely. It seems the original bug is older. 

Still we don't have any reproducer, just a huge amount of reported crashes from Kubuntu users.
Comment 11 mgruber 2020-10-22 08:29:51 UTC
I'm the original reporter of bug 131083.

Maybe I can shed some light, though I'm again not able to reproduce it in a fresh Kubuntu Focal with 6.4.6.2 (made about 20 attempts).

> 1. How is Calc started?

I started it via the KDE start menu.

> 2. How are the cells marked? Mouse only, keyboard?

I used mouse only.

> Which cell first?

Click on F10, then drag down to F11 to mark both.

> Does it crash with other cells?

AFAIR yes, it's the marking that created the problem.
I just used F10/F11 as a reproducible test path.

> 3. Exit crash just happens via window decorations? Or also via menu?
> Or by using Alt+F4 or Ctrl+Q? Or also by closing the Calc document
> and then the start center?

AFAIR I only used the X button on the window decoration.

> 4. Why does Calc think the document was modified?

At least in my case that was a misconception when I initially created crash.ods as a potential test case, see comment 24 in my original bug report.
Comment 12 Michael Weghorn 2021-03-08 09:47:10 UTC
This is most probably a duplicate of tdf#140700, which has steps to reproduce, and is fixed in master, backports for 7.0 and 7.1 pending.

*** This bug has been marked as a duplicate of bug 140700 ***