Bug 31494 - LibreOffice 3.3 crashes upon exiting on Windows
Summary: LibreOffice 3.3 crashes upon exiting on Windows
Status: CLOSED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 3.3.0 Beta3
Hardware: x86 (IA32) All
Importance: high major
Assignee: Caolán McNamara
URL:
Whiteboard:
Keywords:
Duplicates: 31703 31735 31752 31795 31848
Depends on:
Blocks:
 
Reported: 2010-11-09 03:08 UTC by Don't use this account, use tml@iki.fi
Modified: 2014-08-16 20:44 UTC
CC List: 13 users

See Also:
Crash report or crash signature:


Attachments
possibly part of the solution (30.94 KB, patch)
2010-11-18 02:07 UTC, Caolán McNamara
probably also good to do (1.15 KB, patch)
2010-11-18 02:11 UTC, Caolán McNamara
Picture of bug (9.01 KB, image/png)
2010-11-22 00:49 UTC, Mikeyy - L10n HR

Description Don't use this account, use tml@iki.fi 2010-11-09 03:08:46 UTC
In a fresh build as of yesterday (2010-11-08) of the libreoffice-3-3 branch for Windows, --with-distro=NovellWin32, installed on Windows XP SP3, if I start LO and then immediately exit it, it crashes.

I have previously seen the same also in a build using no patches (using no --with-distro option).

The backtrace in Visual Studio looks like this:
 	ntdll.dll!_RtlpCoalesceFreeBlocks@16()  + 0x31f bytes	
 	ntdll.dll!_RtlFreeHeap@12()  + 0x91f bytes	
>	msvcr90.dll!free(void * pBlock=0x04ef9530)  Line 110	C
 	cppu3.dll!`anonymous namespace'::deleteExceptions()  + 0x24 bytes	C++
 	cppu3.dll!`anonymous namespace'::deleteExceptions()  + 0x206 bytes	C++
 	cppu3.dll!_typelib_typedescription_release()  + 0xde bytes	C++
 	cppu3.dll!TypeDescriptor_Init_Impl::~TypeDescriptor_Init_Impl()  + 0xd1 bytes	C++
 	cppu3.dll!_CRT_INIT(void * hDllHandle=0x04284c04, unsigned long dwReason=0x04284bd8, void * lpreserved=0x04284bd8)  Line 449	C
 	cppu3.dll!__DllMainCRTStartup(void * hDllHandle=0x01570000, unsigned long dwReason=0x00000000, void * lpreserved=0x00000000)  Line 560 + 0x8 bytes	C
 	cppu3.dll!_DllMainCRTStartup(void * hDllHandle=0x01570000, unsigned long dwReason=0x00000000, void * lpreserved=0x00000001)  Line 510 + 0xe bytes	C
 	ntdll.dll!_LdrpCallInitRoutine@16()  + 0x14 bytes	
 	ntdll.dll!_LdrShutdownProcess@0()  - 0xfe bytes	
 	kernel32.dll!__ExitProcess@4()  + 0x42 bytes	
 	kernel32.dll!7c81cb26() 	
 	msvcr90.dll!__crtExitProcess(int status=0x00000000)  Line 731 + 0x9 bytes	C
 	msvcr90.dll!doexit(int code=0x00000000, int quick=0x00000000, int retcaller=0x00000000)  Line 632	C
 	msvcr90.dll!exit(int code=0x00000000)  Line 412 + 0xc bytes	C
 	soffice.bin!__tmainCRTStartup()  Line 549	C
 	kernel32.dll!_BaseProcessStart@4()  + 0x23 bytes	

I.e., the crash happens after the call to exit(). There is just one thread left, the "main" one.

Looks like heap corruption to me...

I tried to build with --enable-dbgutil so that it would use the debugging C/C++ runtime msvcr90d.dll, and eventually got that build to finish (I had to fix a bunch of compilation errors; apparently nobody has built with --enable-dbgutil in a while). Unfortunately, I couldn't get that build to even start on XP, and when run from a PKGFORMAT=installed "installation" on my development machine (Windows 7), it crashes during the splash screen with a totally senseless call stack. Could be related, or not.
Comment 1 Don't use this account, use tml@iki.fi 2010-11-09 03:35:26 UTC
Note that the crash at exit as such isn't what worries me. If that indeed is the only problem and we can't find its root cause, we could just hack in something so that the process is killed more brutally, without bothering with the cleanups, destructors, etc. during which the crash happens.

But presumably, if this is just a symptom of heap corruption that happens much earlier, it can cause totally random other behaviour, too.
Comment 2 Don't use this account, use tml@iki.fi 2010-11-10 04:35:50 UTC
I ran soffice.bin under the trial version of Purify, and it didn't find any heap corruptions. I don't know how well I can trust that, but anyway, if I actually use the build of LibreOffice a bit and do various operations in Writer or Calc, I don't get any crash during that. So probably the root cause is not a heap corruption, but the UNO type data structures are slightly broken so that the destruction of them at exit then crashes. Or the code just is buggy. But why then no crash on Linux? Is the TypeDescriptor_Init_Impl::~TypeDescriptor_Init_Impl() called at the wrong time on Windows (compared to when it gets called on Linux)?

Anyway, we could always just re-introduce the CPPU_LEAK_STATIC_DATA thing on Windows that used to be there at least in 3.2.1 (i.e. ifdef out most of TypeDescriptor_Init_Impl::~TypeDescriptor_Init_Impl()). It would be interesting to find out at what stage in the history of OOo that was introduced, and whether anybody understood the exact mechanism that made it necessary.

http://qa.openoffice.org/issues/show_bug.cgi?id=107490 which concerns removing the CPPU_LEAK_STATIC_DATA doesn't tell why it was introduced in the first place, or what has changed so that it is no longer thought to be necessary.

Of course, presumably the vanilla OOo does not have this crash-on-exit problem on Windows, so some change in LibreOffice must have reintroduced the need for the CPPU_LEAK_STATIC_DATA thing. Unless we can actually fix the root cause of the crash, that is...
Comment 3 Don't use this account, use tml@iki.fi 2010-11-10 08:48:13 UTC
I committed and pushed back the 3.2.1 hack to skip most of TypeDescriptor_Init_Impl::~TypeDescriptor_Init_Impl() but for Windows only this time... Will keep this bug open as that is just a horrible workaround.

(Or, if that is an acceptable thing to do, why not then do it on all platforms? But I guess that it does cause some real trouble in some situations, see OOo issue i#107490.)
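A minimal sketch of what such a "leak static data" workaround amounts to, for readers who have not seen the cppu code. This is not the actual TypeDescriptor_Init_Impl implementation: `TypeCache`, `kLeakStaticData`, and `teardown` are hypothetical names, and the real macro is applied with preprocessor conditionals inside the destructor rather than a runtime flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical switch, modelled on the idea behind CPPU_LEAK_STATIC_DATA:
// on Windows, skip the teardown and let the OS reclaim the memory.
#if defined(_WIN32)
constexpr bool kLeakStaticData = true;
#else
constexpr bool kLeakStaticData = false;
#endif

// Hypothetical stand-in for the cached type descriptions held by cppu.
struct TypeCache {
    std::vector<int*> entries;

    void add(int value) { entries.push_back(new int(value)); }

    // Frees every entry unless 'leak' is set; returns how many were freed.
    std::size_t teardown(bool leak) {
        if (leak) {
            return 0;  // deliberately leave the entries alone
        }
        std::size_t freed = 0;
        for (int* entry : entries) {
            assert(entry != nullptr);
            delete entry;
            ++freed;
        }
        entries.clear();
        return freed;
    }

    // At process exit this is the only place the entries would be freed.
    ~TypeCache() { teardown(kLeakStaticData); }
};
```

The trade-off is the one raised above: skipping the teardown avoids touching possibly inconsistent data during exit(), but it hides the real defect and leaves the memory for the operating system to reclaim.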
Comment 4 Caolán McNamara 2010-11-10 11:50:01 UTC
I have a Windows build in progress, if it ever ends, and if it's reproducible for me, I'll have a poke at it.
Comment 5 Caolán McNamara 2010-11-17 05:54:05 UTC
I don't like the look of the configmgr flush changes thread that gets spawned off when that .dll gets unloaded.
Comment 6 Caolán McNamara 2010-11-17 06:19:00 UTC
bah, so much for that theory
Comment 7 Caolán McNamara 2010-11-17 13:25:44 UTC
Also don't like the look of globals e.g. "lock" mutex in configmgr being destroyed before unotools releases some configmgr derived reference.
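To illustrate the kind of destruction-order hazard described here, the following sketch shows one common C++ idiom for keeping a lock valid for late users: allocate it on first use and never destroy it. This is an assumption-laden illustration, not the content of the patches attached to this bug; `configLock` and `ConfigUser` are hypothetical names.

```cpp
#include <mutex>

// Hypothetical accessor replacing a global mutex object.
std::mutex& configLock() {
    // Heap-allocated and never deleted, so no static destructor ever
    // tears it down; it stays valid for late users during exit().
    static std::mutex* lock = new std::mutex;
    return *lock;
}

// Hypothetical client standing in for code that still holds a
// configmgr-derived reference when static destructors run.
struct ConfigUser {
    int releases = 0;

    void release() {
        std::lock_guard<std::mutex> guard(configLock());
        ++releases;
    }

    // Safe even if this runs late in shutdown, because the lock is never destroyed.
    ~ConfigUser() { release(); }
};
```

With a plain global mutex, a `ConfigUser` destroyed after that global would lock an already-destroyed object; with the accessor above the order no longer matters, at the cost of never freeing one small allocation.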
Comment 8 Caolán McNamara 2010-11-18 02:07:08 UTC
Created attachment 40362 [details]
possibly part of the solution
Comment 9 Caolán McNamara 2010-11-18 02:11:09 UTC
Created attachment 40363 [details]
probably also good to do
Comment 10 Caolán McNamara 2010-11-18 02:12:42 UTC
Had to hack out pretty much all threads in order to find what might be the root problems here. Have to try a fresh build to see if it makes a difference in the real world.
Comment 11 Don't use this account, use tml@iki.fi 2010-11-18 03:47:47 UTC
*** Bug 31703 has been marked as a duplicate of this bug. ***
Comment 12 Volker Merschmann 2010-11-18 05:58:23 UTC
Added me to CC

Please note that all Windows users are nagged by this bug when exiting beta3.
Comment 13 Don't use this account, use tml@iki.fi 2010-11-18 09:59:28 UTC
*** Bug 31735 has been marked as a duplicate of this bug. ***
Comment 14 Thorsten Behrens (CIB) 2010-11-18 15:40:50 UTC
Bumping prio, assigning to Caolan - you seem to be working on a fix.
Comment 15 Don't use this account, use tml@iki.fi 2010-11-19 03:29:05 UTC
*** Bug 31752 has been marked as a duplicate of this bug. ***
Comment 16 manj_k 2010-11-19 11:22:46 UTC
CC
Comment 17 Rainer Bielefeld Retired 2010-11-21 00:03:59 UTC
*** Bug 31795 has been marked as a duplicate of this bug. ***
Comment 18 Cesare Leonardi 2010-11-21 10:04:09 UTC
Edited the subject from "Windows XP" to "Windows", since I can confirm this also happens under Windows 2000 SP4.

Cesare.
Comment 19 Cesare Leonardi 2010-11-21 10:49:55 UTC
I've noticed another thing: if the quickstart icon is present, the crash on exit doesn't happen (tried with Writer, Calc and Impress). But it does happen if I try to close the quickstart icon.

The error is always the same:
-----
soffice.bin - Application error

The instruction at "0x784ad989" reference memory at "0xffffffff". The memory
could not be "read". Click OK to terminate the program.
-----

Once I saw the same error message, but with this title:
DDE Server Windows: soffice.bin

Cesare.
Comment 20 Don't use this account, use tml@iki.fi 2010-11-21 23:50:46 UTC
If the "QuickStarter" is turned on (its icon is present), OpenOffice.org / LibreOffice is still running and never exits, so what you say is completely expected. (In my humble opinion, the whole "QuickStarter" thing is misleading marketing. It would be more honest to just tell people to not close their OOo/LO window if they want it to be available quickly.)

The mention of "DDE" in some cases in the error message window is a red herring, and can be ignored.
Comment 21 Mikeyy - L10n HR 2010-11-22 00:47:48 UTC
+1, bug appears on XP SP3 2600 system after upgrade from OO 3.2.1. to LO 3.3 beta 3
Comment 22 Mikeyy - L10n HR 2010-11-22 00:49:29 UTC
Created attachment 40471 [details]
Picture of bug
Comment 23 Rainer Bielefeld Retired 2010-11-22 22:28:36 UTC
Correct Version
Comment 24 Rainer Bielefeld Retired 2010-11-22 22:29:03 UTC
*** Bug 31848 has been marked as a duplicate of this bug. ***
Comment 25 VETP 2010-11-23 00:10:23 UTC
For me, since this new installation (upgrade from beta 2 to beta 3), I get this error on exit:

The instruction at "0x7c92019" referenced memory at "0xffffffff". The memory could not be "read".
Comment 26 Don't use this account, use tml@iki.fi 2010-11-23 09:05:12 UTC
This is not really blocking the 3.3 release as the workaround is in place in the libreoffice-3-3 branch.
Comment 27 Florian Effenberger 2010-11-25 03:05:48 UTC
Maybe this is linked to bug 31096?
Comment 28 Don't use this account, use tml@iki.fi 2010-11-25 03:14:25 UTC
Florian: Don't think so, as the MacOSX stack traces in that other bug seem to talk about clipboard related things, and also the crash there is happening while main() is still running. This bug happens during exit() processing, and/or after main() has returned.

But I might be wrong of course.
Comment 29 Caolán McNamara 2010-11-25 12:53:14 UTC
Seems good, fixed with http://cgit.freedesktop.org/libreoffice/libs-core/commit/?h=libreoffice-3-3&id=f10c27040a40cbc9e0b97258c1a0edc5f721d37f

With that in place, a debugging Windows build now definitely shuts down here without an error.

We can leave the hackaround in place for 3.3, shouldn't hurt I think. And what I believe is the core problem is fixed there as well.
Comment 30 Thorsten Behrens (CIB) 2010-11-26 13:44:38 UTC
Just to have that noted here for further dupes: seeing the crash now on Mac, too (beta3 build). Happened on shutdown after loading all docs from bug 31734.