Bug Hunting Session
Bug 100151 - Crash at the end of slideshow or previews in Impress (in remote desktop session, for workaround see comment 25)
Summary: Crash at the end of slideshow or previews in Impress (in remote desktop sessi...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: graphics stack (show other bugs)
Version:
(earliest affected)
5.1.3.2 release
Hardware: All Windows (All)
: high major
Assignee: Aron Budea
URL:
Whiteboard: target:5.4.0 target:5.3.0 target:5.2.6
Keywords: bibisected, haveBacktrace, regression
: 95015 96539 96575 96876 98684 101584 103034 103823 (view as bug list)
Depends on:
Blocks: VclPtr
  Show dependency treegraph
 
Reported: 2016-05-31 05:09 UTC by raal
Modified: 2017-01-25 22:48 UTC (History)
15 users (show)

See Also:
Crash report or crash signature: ["vcl::Window::dispose()"]


Attachments
test file (30.83 KB, application/wps-office.pptx)
2016-05-31 05:09 UTC, raal
Details
backtrace (7.03 KB, text/plain)
2016-06-03 16:55 UTC, raal
Details
backtrace 17-01-11 (6.60 KB, text/plain)
2017-01-11 04:17 UTC, Aron Budea
Details

Note You need to log in before you can comment on or make changes to this bug.
Description raal 2016-05-31 05:09:07 UTC
Created attachment 125403 [details]
test file

Steps:
open file
play presentation (F5)
after the presentation ends, the program crash

Version: 5.3.0.0.alpha0+
Build ID: 33d4f7e5624d77ef1ce51aece1c8a0ef7ea21603
CPU Threads: 1; OS Version: Windows 6.1; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2016-05-29_01:58:16
and 5.1.3.2 (x64)

No problem on Linux. No crash in  4.2.8.2, regression
Comment 1 MM 2016-05-31 18:07:12 UTC
Unconfirmed with v4.4.7.2 under windows 7 x64.
Unconfirmed with v5.1.3.2 under windows 10 x64.
Comment 2 Julien Nabet 2016-05-31 20:26:46 UTC
On pc Debian x86-64 with master sources updated today, I don't reproduce this.
I tested with rendering=gtk3/gtk/kde4/gen
Comment 3 Buovjaga 2016-06-01 10:10:16 UTC
No crash.

Version: 5.2.0.0.beta1 (x64)
Build ID: 1e9933ef611c66bcded94b84052543c78cf1c223
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; 
Locale: fi-FI (fi_FI)

Win 7 Pro 64-bit, Version: 5.1.3.2 (x64)
Build ID: 644e4637d1d8544fd9f56425bd6cec110e49301b
CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; 
Locale: fi-FI (fi_FI)
Comment 4 raal 2016-06-03 16:55:07 UTC
Created attachment 125471 [details]
backtrace
Comment 5 Timur 2016-06-03 18:37:13 UTC
Also not repro in Win. Please try another computer, graphics..
Comment 6 raal 2016-06-13 15:10:29 UTC
No crash on win10.
Comment 7 Caolán McNamara 2016-09-23 15:07:56 UTC
*** Bug 101584 has been marked as a duplicate of this bug. ***
Comment 8 Caolán McNamara 2016-09-23 15:08:57 UTC
I wonder if this is a VclPtr issue
Comment 9 Caolán McNamara 2016-09-23 15:16:05 UTC
*** Bug 98684 has been marked as a duplicate of this bug. ***
Comment 10 Caolán McNamara 2016-09-23 15:21:07 UTC
*** Bug 96539 has been marked as a duplicate of this bug. ***
Comment 11 Caolán McNamara 2016-09-23 15:22:45 UTC
*** Bug 95015 has been marked as a duplicate of this bug. ***
Comment 12 slingshot 2016-09-23 18:33:39 UTC
(In reply to Caolán McNamara from comment #7)
> *** Bug 101584 has been marked as a duplicate of this bug. ***

Mine was a .odp file but maybe thats irrelevant?
Comment 13 Cor Nouws 2016-09-25 19:26:08 UTC
at least duplicate 96539 was confirmed.. > NEW
Comment 14 Julien Nabet 2016-09-25 21:58:29 UTC
(In reply to Caolán McNamara from comment #8)
> I wonder if this is a VclPtr issue

Focusing on VclPtr part of dispose method, could we simplify (optimize?) mnemonic removal with this:
diff --git a/vcl/source/window/window.cxx b/vcl/source/window/window.cxx
index 2e1b9d5..f0d1405 100644
--- a/vcl/source/window/window.cxx
+++ b/vcl/source/window/window.cxx
@@ -396,9 +396,10 @@ void Window::dispose()
 
     // clear mnemonic labels
     std::vector<VclPtr<FixedText> > aMnemonicLabels(list_mnemonic_labels());
-    for (auto aI = aMnemonicLabels.begin(); aI != aMnemonicLabels.end(); ++aI)
+    for (auto aI = aMnemonicLabels.begin(); aI != aMnemonicLabels.end(); )
     {
-        remove_mnemonic_label(*aI);
+        (*aI)->set_mnemonic_widget(nullptr);
+        aI = aMnemonicLabels.erase(aI);
     }
?
Moreover, does aMnemonicLabels contain a copy of mnemonic_labels? If not, I wonder if the patch could also avoid some iterator mess.
Comment 15 Julien Nabet 2016-09-25 22:03:16 UTC
Sorry Caolán, I thought you were on cc of this one.
Any thoughts about https://bugs.documentfoundation.org/show_bug.cgi?id=100151#c14 I've just added (and which concerns a Vclptr part)?
Comment 16 Caolán McNamara 2016-09-27 09:52:21 UTC
caolanm->julien: I don't think that's where things are going wrong. The "SystemChildWindow" is my suspect, especially because there has to be "something different" under windows vs e.g. linux
Comment 17 Julien Nabet 2016-09-27 17:12:19 UTC
(In reply to Caolán McNamara from comment #16)
> caolanm->julien: I don't think that's where things are going wrong. The
> "SystemChildWindow" is my suspect, especially because there has to be
> "something different" under windows vs e.g. linux

Ok then. Do you think patch I proposed in https://bugs.documentfoundation.org/show_bug.cgi?id=100151#c14 could still be an optim (or I should just ignore because micro optim)?
Comment 18 Caolán McNamara 2016-09-30 14:46:47 UTC
caolanm->julien: #c14 looks reasonable to me as a little optimization so worth submitting to gerrit as such, I guess you could operate directly on mpWindowImpl->m_aMnemonicLabels and avoid the copy too.
Comment 19 Michael Meeks 2016-10-05 16:22:18 UTC
Can't reproduce on Windows 7 with a recent master build in dbgutil mode either with or without OpenGL enabled; I'm using a build from 24e0709fb4a9d7 built ~today.

Has you fixed it Julien ? =)
Comment 20 Julien Nabet 2016-10-05 16:39:34 UTC
(In reply to Michael Meeks from comment #19)
> Can't reproduce on Windows 7 with a recent master build in dbgutil mode
> either with or without OpenGL enabled; I'm using a build from 24e0709fb4a9d7
> built ~today.
> 
> Has you fixed it Julien ? =)

No, I tried to optimize a specific part but it fails on gerrit, so I abandonned it.
Comment 21 Aron Budea 2016-10-06 10:46:09 UTC
Raal bibisected this in:
https://bugs.documentfoundation.org/show_bug.cgi?id=96539#c17
Comment 22 Aron Budea 2016-10-06 11:12:47 UTC
*** Bug 103034 has been marked as a duplicate of this bug. ***
Comment 23 Caolán McNamara 2016-10-06 11:33:51 UTC
fixed bug 91426 seems somewhat similar to this, except that's apparently fixed long ago
Comment 24 V Stuart Foote 2016-10-07 14:47:46 UTC
*** Bug 103034 has been marked as a duplicate of this bug. ***
Comment 25 Peter-Lenné-Schule 2016-10-13 08:52:31 UTC
To uncheck all the graphical components was the solution for us.
Thank you very much.

Only a problem remains. We now need to disable hardware acceleration and edge smoothing for all students. To disable OpenGL I have already found an Admx template.

Do you have to deactivate the hardware acceleration and the edge smoothing likewise an admx template or can you tell us the registry keys?
Comment 26 Xisco Faulí 2016-11-10 09:24:40 UTC
*** Bug 103823 has been marked as a duplicate of this bug. ***
Comment 27 Timur 2016-11-10 15:50:43 UTC
(In reply to Caolán McNamara from comment #10)
> *** Bug 96539 has been marked as a duplicate of this bug. ***
Then, I'll transfer it's duplicates here. It was concluded it's "with Use Hardware Acceleration turned on in a VM or Terminal Services".
Comment 28 Timur 2016-11-10 15:51:13 UTC
*** Bug 96575 has been marked as a duplicate of this bug. ***
Comment 29 Timur 2016-11-10 15:52:03 UTC
*** Bug 96876 has been marked as a duplicate of this bug. ***
Comment 30 Aron Budea 2017-01-07 02:29:58 UTC
Some fresh crash reports from 5.3.0.1:
http://crashreport.libreoffice.org/stats/crash_details/c442a95f-ec43-430a-ade0-3eee3731f0a5
http://crashreport.libreoffice.org/stats/crash_details/a57d8edb-f78e-454d-b2ad-5b9b8a5510d4
http://crashreport.libreoffice.org/stats/crash_details/359661ef-e41f-4a6c-b127-53bf15be129b

One is from the end of a slideshow, the other two happened during previews of slide transitions and custom animations. All were in a remote desktop session.
Comment 31 Aron Budea 2017-01-07 02:46:02 UTC
(In reply to Aron Budea from comment #30)
> One is from the end of a slideshow, the other two happened during previews
> of slide transitions and custom animations. All were in a remote desktop
> session.

An addition note, none of such crashes occurred outside a remote desktop session.
Comment 32 Julien Nabet 2017-01-08 15:37:33 UTC
I noticed weird things in sd/source/ui/slideshow/slideshowimpl.cxx
1)
julien@debian:~/lo/libreoffice$ grep -n mpShowWindow sd/source/ui/slideshow/slideshowimpl.cxx|grep -i clear
688:    mpShowWindow.disposeAndClear();
719:        mpShowWindow.clear();

shouldn't it be always "disposeAndClear" ?

2)
julien@debian:~/lo/libreoffice$ grep -n mpShowWindow sd/source/ui/slideshow/slideshowimpl.cxx|grep Create
776:        mpShowWindow = VclPtr<ShowWindow>::Create( this, ((pParent == nullptr) && mpViewShell) ?  mpParentWindow.get() : pParent );
929:            mpShowWindow = VclPtr<ShowWindow>::Create( this, mpParentWindow );

Should it be mpParentWindow or mpParentWindow.get()?
(see http://opengrok.libreoffice.org/xref/core/sd/source/ui/slideshow/slideshowimpl.cxx)


Michael/Noel: thought you might be interested in this one since could be due to some VclPtr conversion.
Comment 33 Michael Meeks 2017-01-09 10:36:37 UTC
Hi Julian,

Good catch this patch:

commit 1c4025babd7037a3292aa530c7d45ab8d6ef6dcb
    vclwidget: change all vcl::window fields to be wrapped in VclPtr

...
@@ -743,8 +742,7 @@ void SAL_CALL SlideshowImpl::disposing()
 
     if( mpShowWindow )
     {
-        delete mpShowWindow;
-        mpShowWindow = 0;
+        mpShowWindow.clear();
     }

Looks like it doesn't match what we should be doing there. I suspect the lifecycle of that is rather coomplicated - but can you try replacing this with 'disposeAndClear()' to see if that helps in the SlideshowImpl::disposing method ?

Thanks !
Comment 34 Julien Nabet 2017-01-09 11:08:00 UTC
(In reply to Michael Meeks from comment #33)
...
> Looks like it doesn't match what we should be doing there. I suspect the
> lifecycle of that is rather coomplicated - but can you try replacing this
> with 'disposeAndClear()' to see if that helps in the
> SlideshowImpl::disposing method ?
> 
> Thanks !

I'll give it a try after my day-time job but, even if this one would fix the crash, any thought about the second point of my previous comment? ( "mpParentWindow.get()" or  "mpParentWindow")
Comment 35 Michael Meeks 2017-01-09 11:17:41 UTC
I suspect it doesn't hugely matter wrt. .get() or not - but probably both parts of the ? operator need to have a POD type rather than being an in-line C++ class; so - it may be necessary. Without fooling around with it - hard to tell; but I see no potential dangers there.
Comment 36 Commit Notification 2017-01-10 06:10:12 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=bce35b8e13b0d82ba54bf3d380f448dad0ee13bb

Related tdf#100151: use disposeAndClear for mpShowWindow (sd)

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 37 Julien Nabet 2017-01-10 06:12:58 UTC
Raal: would it be possible you give it a try with a daily build which includes http://cgit.freedesktop.org/libreoffice/core/commit/?id=bce35b8e13b0d82ba54bf3d380f448dad0ee13bb ?
Comment 38 Aron Budea 2017-01-11 04:17:56 UTC
Created attachment 130311 [details]
backtrace 17-01-11

I tried running a 1-slide presentation with the latest daily build. It still crashes, I'm afraid, attaching backtrace, seems very similar.

Version: 5.4.0.0.alpha0+
Build ID: db4badfc971b9cc60809c3408f579bae04a77c34
CPU Threads: 4; OS Version: Windows 6.1; UI Render: GL; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-01-10_23:25:07
Locale: hu-HU (hu_HU); Calc: CL
Comment 39 Julien Nabet 2017-01-11 06:11:44 UTC
Aron: even if I cherry-picked the commit in 5.3 branch, you're right. So let's remove targets.
Comment 40 Commit Notification 2017-01-11 06:12:15 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-5-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=73316bbc92b264f84c68c96460e262a5dd6e0d04&h=libreoffice-5-3

Related tdf#100151: use disposeAndClear for mpShowWindow (sd)

It will be available in 5.3.0.2.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 41 Aron Budea 2017-01-21 06:01:01 UTC
Some findings. Debug build throws this info (error) message:
"Window( sd::ShowWindow()) with live children destroyed: class SystemChildWindow()"

So what is the difference between the two executions?

We're in: ::sd::Window::dispose().
Then in vcl::Window::dispose() we get to this line:
xCanvasComponent->dispose();
http://opengrok.libreoffice.org/xref/core/vcl/source/window/window.cxx#162


1) In a local instance, this goes into:
directx9canvaslo.dll!dxcanvas::SpriteCanvas::dispose() Line 118	C++


2) In a remote desktop instance this goes into:
vclcanvaslo.dll!vclcanvas::SpriteCanvas::dispose() Line 125	C++

After 2), only later, a couple of levels into the following call we get into dxcanvas::SpriteCanvas::dispose():
pWrapper->WindowDestroyed( this );
http://opengrok.libreoffice.org/xref/core/vcl/source/window/window.cxx#227

At this point mpRenderModule is empty in the remote desktop connection:
http://opengrok.libreoffice.org/xref/core/canvas/source/directx/dx_spritecanvashelper.cxx#99

Thus DXRenderModule::disposing() is not called where it should be (I assume).
In a local session this is called inside of 1)'s dxcanvas::SpriteCanvas::dispose().
Comment 42 Aron Budea 2017-01-23 03:03:49 UTC
The difference in execution comes from DXRenderModule failing to find DirectX capabilities, most likely DirectX is not working in remote desktop session.

The following lines cause DXRenderModule::verifyDevice( ... ) exiting early, and DX canvas failing to be created:
if(FAILED(mpDirect3D9->GetDeviceCaps(nAdapter,D3DDEVTYPE_HAL,&aCaps)))
    return false;
http://opengrok.libreoffice.org/xref/core/canvas/source/directx/dx_9rm.cxx#653

The problem seems to be that in DXRenderModule::create( ... ) the child window has already been created, but apparently later isn't cleaned up properly.

Proposed a patch to move Direct3D9-related code in the beginning of the function, and exit without creating the window at all in that case:
https://gerrit.libreoffice.org/#/c/33413/

With the patch I'm getting no crash at the end of slideshow, or when previewing slide transitions/animations.
Comment 43 Aron Budea 2017-01-23 18:11:18 UTC
Substituting for commit notification bot... This commit should fix the issue.
Cherry-pick to 5.3 and 5.2 is in the works, at the same time I'd love to hear feedback from remote desktop/terminal server users. Tonight's daily builds should have the fix included.


Aron Budea committed a patch related to this issue.
It has been pushed to "master":

https://cgit.freedesktop.org/libreoffice/core/commit/?id=bb50474225f80b8aeea49f14ad66173462026a41

tdf#100151: Dispose of window if DX device creation failed

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
device creation fails.
Comment 44 Commit Notification 2017-01-25 13:14:15 UTC
Aron Budea committed a patch related to this issue.
It has been pushed to "libreoffice-5-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1c8f98a5fd33f02b9af0ce35b1ad0eb091c895d0&h=libreoffice-5-2

tdf#100151: Dispose of window if DX device creation failed

It will be available in 5.2.6.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 45 Commit Notification 2017-01-25 20:32:04 UTC
Aron Budea committed a patch related to this issue.
It has been pushed to "libreoffice-5-3-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b0676e0ec9a770f83c19390d2d69df57bdb60d24&h=libreoffice-5-3-0

tdf#100151: Dispose of window if DX device creation failed

It will be available in 5.3.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 46 Julien Nabet 2017-01-25 20:53:08 UTC
Targets cleanup