| Summary: | segfault in osl_acquireMutex | | |
|---|---|---|---|
| Product: | LibreOffice | Reporter: | Terrence Enger <lo_bugs> |
| Component: | LibreOffice | Assignee: | Not Assigned <libreoffice-bugs> |
| Status: | VERIFIED FIXED | | |
| Severity: | normal | CC: | jmadero.dev, lionel, sberg.fun, serval2412 |
| Priority: | medium | | |
| Version: | 4.2.0.0.alpha0+ Master | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | target:4.4.0 | | |
| Crash report or crash signature: | | Regression By: | |
| Bug Depends on: | 80205 | | |
| Bug Blocks: | | | |
| Attachments: | gdb on the core file; typescript: SIGSEGV under valgrind | | |
|
Description
Terrence Enger
2013-09-06 13:46:54 UTC
Lionel, is this perhaps a dup of bug 55566 "open two odb files with macro at "open document" event -> crash"? Thanks, Terry.

(In reply to comment #1)
> Is this perhaps a dup of bug 55566 "open two odb files with macro at "open
> document" event -> crash" ?

I don't think so. In this bug, osl_acquireMutex is called on an uninitialised value: the value 0x99999999 is a special canary for uninitialised memory in debug builds on GNU libc (GNU/Linux). The problem is probably linked to how initialisation of static values (including the call to the C++ constructor) happens in multi-threaded code spread over several dynamic libraries, a process I do not understand very well.

The code looks like:

    static ItemHolder1* pHolder = new ItemHolder1();
    pHolder->impl_addItem(eItem);

The constructor of ItemHolder1 should call the Mutex constructor, which should initialise the m_aLock Mutex that impl_addItem tries to lock (via ClearableGuard); it seems that in this specific case m_aLock is allocated but not initialised...

That's the local problem. But maybe (not sure) the local problem is triggered by a higher-level problem, which is that connectivity::calc::OCalcConnection::disposing tries to dispose of its connection to Calc, but Calc is already dead? I say that because I see in the backtrace:

    SfxApplication::GetOrCreate
    SfxObjectShell::Close

GetOrCreate goes and *creates* the SfxApplication, which is ... unexpected when called from a destructor / close method: what Close does is get the (freshly created) SfxApplication and erase all its ObjectShells... It seems pointless to create it just for that.

Which raises the question of the order of destruction of LibreOffice components during application shutdown... How do we handle dependencies between "major" components (Calc, Base, ...) in there? The question is probably complicated by circular dependencies between Calc and Base (connectivity and/or dbaccess module): Calc can use Base datasources, but Base can use Calc as a datasource :)

Stephan, do you think (one of) these questions would fall under your area of expertise? If not, any idea who to consult?

(In reply to comment #2)
> That's the local problem. But maybe (not sure) the local problem is
> triggered by a higher-level problem, which is that
> connectivity::calc::OCalcConnection::disposing tries to dispose of its
> connection to Calc, but Calc is already dead? I say that because I see in
> the backtrace:
>
> SfxApplication::GetOrCreate
> SfxObjectShell::Close
>
> GetOrCreate goes and *creates* the SfxApplication, which is ... unexpected
> when called from a destructor / close method: what Close does is get the
> (freshly created) SfxApplicaton and erase all its ObjectShells... It seems
> pointless to create it just for that.

Yes, the root problem here apparently is that SfxApplication is re-created during shutdown, while the UNO service manager is being disposed and in turn disposes all registered services. Those UNO services must generally do as little as possible during disposing, as infrastructure they depend on during normal operation may already have been taken down during shutdown. I have no idea what to do in this particular case, but it is ultimately dbaccess::ODatabaseContext::disposing (frame 36) that is doing "too much."

Created attachment 85628
typescript: SIGSEGV under valgrind
Attaching this just in case it is interesting.
Some high points ...
(*) ODatabaseContext::disposing appears several times
(*) line 1598: invalid read in com::sun::star::uno::BaseReference::is;
ODatabaseContext::disposing is in the stack
(*) line 1650: segfault in com::sun::star::uno::BaseReference::is();
ODatabaseContext::disposing is in the stack
On pc Debian x86-64 with master sources updated today, I tried to reproduce this but got no crash. However, I noticed these traces:

    warn:legacy.osl:13046:1:sw/source/ui/dbui/dbmgr.cxx:1629: Exception in SwDBMgr::GetColumnSupplier
    warn:fwk:13046:1:framework/source/fwi/threadhelp/transactionmanager.cxx:312: TransactionManager...: Owner instance already closed. Call was rejected!

Should this be closed as WFM or NEW?

Nobody else has seen the segfault, and I cannot see it now because bug 80205 "assertion in OUString::operator[] at ustring.hxx:421" happens as I try step (4) of this bug. So, as little as we like to have a bug UNCONFIRMED for a long time, I think that UNCONFIRMED is the truest description of the situation.

I encountered this bug as I was trying to confirm bug 68912 "EDITING: Label-wizard - Next Recored doesn't work". Now, I have found another bug while trying out this one again. There seems to be a pattern here.

On pc Debian x86-64 with master sources updated yesterday (I've got enable-dbg), I still can't reproduce this :-( As in my comment 5, I noticed these:

    warn:legacy.osl:7131:1:sw/source/uibase/dbui/dbmgr.cxx:1631: Exception in SwDBManager::GetColumnSupplier
    warn:fwk:7131:1:framework/source/fwi/threadhelp/transactionmanager.cxx:274: TransactionManager...: Owner instance already closed. Call was rejected!
    warn:tools.debug:7131:1:tools/source/debug/debug.cxx:297: no DbgTestSolarMutex function set

Following Maxim's fix for fdo#80205, could you give it a new try with master sources updated yesterday at the earliest? Also, do you have accessibility enabled? I'm asking because some bugs are triggered by that part. If you still reproduce this, a bt would be useful.

With master commit dc795cb, fetched 2014-06-20 1642 UTC, there is no crash. Thank you, Julien, for your attention.

Norbert Thiebaud committed a patch related to this issue. It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=01a882039ec4d0edf4da7d3e10ffea569a3e4aee

fdo#69036 do not try to create a sfxApplication when we are tearing-down

The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.

With master commit 924a28a, fetched 2014-06-28 0333 UTC, the crash is gone. Thank you, Norbert. |
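
For readers following the thread, the general idea behind "do not try to create a sfxApplication when we are tearing-down" can be illustrated with a small standalone sketch. This is not the actual patch and not LibreOffice code: the class App, its methods, and the two closeShell functions are hypothetical names invented for illustration. It only contrasts a close path that goes through a get-or-create accessor (and so revives the singleton after shutdown) with one that asks for the instance only if it still exists.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an application singleton (NOT SfxApplication).
class App {
public:
    // Returns the instance, creating it on first use.
    static App& getOrCreate() {
        if (!instance_) {
            instance_.reset(new App);
            ++creations_;
        }
        return *instance_;
    }
    // Returns the instance if it currently exists, otherwise nullptr.
    // Never creates anything, so it is safe to call during teardown.
    static App* getIfExists() { return instance_.get(); }
    // Destroys the instance, as would happen at application shutdown.
    static void shutdown() { instance_.reset(); }
    // Number of times the singleton has been constructed so far.
    static int creations() { return creations_; }

    void registerShell(const std::string& name) {
        assert(!name.empty());
        shells_.push_back(name);
    }
    void unregisterShell(const std::string& name) {
        shells_.erase(std::remove(shells_.begin(), shells_.end(), name),
                      shells_.end());
    }
    std::size_t shellCount() const { return shells_.size(); }

private:
    App() = default;
    static std::unique_ptr<App> instance_;
    static int creations_;
    std::vector<std::string> shells_;
};

std::unique_ptr<App> App::instance_;
int App::creations_ = 0;

// Close path resembling the behaviour discussed in the thread:
// it re-creates the application object just to remove a shell from it.
void closeShellCreating(const std::string& name) {
    App::getOrCreate().unregisterShell(name);
}

// Close path in the spirit of the fix: if the application object is
// already gone, there is nothing to unregister from, so do nothing.
void closeShellGuarded(const std::string& name) {
    if (App* app = App::getIfExists())
        app->unregisterShell(name);
}
```

Calling closeShellGuarded after App::shutdown() leaves the singleton destroyed, whereas closeShellCreating constructs a fresh instance during teardown, which is the kind of late re-creation the comments above identify as the root problem.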