this deadlock just happened in CppunitTest_libreofficekit_tiledrendering:

#0  0x00007fa3811fb09d in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x00007fa3811f3e23 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2  0x00007fa3822f5a16 in osl_acquireMutex(oslMutexImpl*) (pMutex=0x2d32f40) at /work/lo/master/sal/osl/unx/mutex.cxx:97
#3  0x00007fa37764b065 in osl::Mutex::acquire() (this=0x7fa36948f780 <rtl::Static<osl::Mutex, GrammarCheckingIterator::MyMutex>::get()::instance>) at /work/lo/master/include/osl/mutex.hxx:56
#4  0x00007fa3776629c2 in osl::ClearableGuard<osl::Mutex>::ClearableGuard(osl::Mutex&) (this=0x7fffcfc0df18, t=...) at /work/lo/master/include/osl/mutex.hxx:163
#5  0x00007fa3776610c5 in comphelper::OInterfaceContainerHelper2::disposeAndClear(com::sun::star::lang::EventObject const&) (this=0x7fa357bb3110, rEvt=...) at /work/lo/master/comphelper/source/container/interfacecontainer2.cxx:260
#6  0x00007fa369197174 in GrammarCheckingIterator::dispose() (this=0x7fa357bb2f50) at /work/lo/master/linguistic/source/gciterator.cxx:907
#7  0x00007fa351bd7191 in (anonymous namespace)::doDispose(com::sun::star::uno::Reference<com::sun::star::linguistic2::XProofreadingIterator> const&) (inst=uno::Reference to (GrammarCheckingIterator *) 0x7fa357bb2f78) at /work/lo/master/sw/source/core/bastyp/proofreadingiterator.cxx:34
#8  0x00007fa351bd7327 in sw::proofreadingiterator::dispose() () at /work/lo/master/sw/source/core/bastyp/proofreadingiterator.cxx:66
#9  0x00007fa351bd1d3e in FinitCore() () at /work/lo/master/sw/source/core/bastyp/init.cxx:672
#10 0x00007fa35284e394 in SwDLL::~SwDLL() (this=0x2b9b7e0, __in_chrg=<optimized out>) at /work/lo/master/sw/source/uibase/app/swdll.cxx:163
#11 0x00007fa35284f634 in std::default_delete<SwDLL>::operator()(SwDLL*) const (this=0x7fa353adea98 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance+8>, __ptr=0x2b9b7e0) at /usr/include/c++/7/bits/unique_ptr.h:78
#12 0x00007fa35284f8e9 in std::unique_ptr<SwDLL, std::default_delete<SwDLL> >::reset(SwDLL*) (this=0x7fa353adea98 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance+8>, __p=0x2b9b7e0) at /usr/include/c++/7/bits/unique_ptr.h:376
#13 0x00007fa35284f3a1 in comphelper::unique_disposing_ptr<SwDLL>::reset(SwDLL*) (this=0x7fa353adea90 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance>, p=0x0) at /work/lo/master/include/comphelper/unique_disposing_ptr.hxx:42
#14 0x00007fa35284ec7e in comphelper::unique_disposing_solar_mutex_reset_ptr<SwDLL>::reset(SwDLL*) (this=0x7fa353adea90 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance>, p=0x0) at /work/lo/master/include/comphelper/unique_disposing_ptr.hxx:171
#15 0x00007fa35284e5fa in comphelper::unique_disposing_solar_mutex_reset_ptr<SwDLL>::~unique_disposing_solar_mutex_reset_ptr() (this=0x7fa353adea90 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance>, __in_chrg=<optimized out>) at /work/lo/master/include/comphelper/unique_disposing_ptr.hxx:177
#16 0x00007fa35284e460 in (anonymous namespace)::SwDLLInstance::~SwDLLInstance() (this=0x7fa353adea90 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance>, __in_chrg=<optimized out>) at /work/lo/master/sw/source/uibase/app/swdll.cxx:57
#17 0x00007fa381647c68 in __run_exit_handlers () at /lib64/libc.so.6
#18 0x00007fa381647cba in () at /lib64/libc.so.6
#19 0x00007fa38162d511 in __libc_start_main () at /lib64/libc.so.6
#20 0x000000000040a1ba in _start ()

actually it's not really a deadlock - the mutex being locked is a global variable that has already been destroyed - usually this only prints a warning like "pthread_mutex_lock failed: Invalid argument" but apparently sometimes the freed memory looks like a valid mutex that is already locked.

this is a regression from:

commit a99707d2c4f65a6a5fe160ce2b614aca273f0d2d
Author:     Caolán McNamara <caolanm@redhat.com>
AuthorDate: Mon May 8 15:26:22 2017 +0100

    Resolves: rhbz#144437 make gnome-documents not crash the whole time

    accept that once initted that LibreOffice cannot be deinitted and reinited
    (without lots of work), but allow the main loop to quit and restart so LOKs
    thread can run and exit successfully, new LOK connections will restart the
    main loop.

    The buckets of global state continues to be valid the whole time this way

... which turned framework::Desktop::terminate() into a no-op when called from LOKit.

this is clearly the wrong solution to the problem - now there is no way for a LibreOfficeKit client to actually destroy all the thousands of global variables, so unless the client uses _exit(), they will sometimes crash or deadlock during shutdown.

(i would not be opposed to requiring LOKit clients to use _exit(), but that should be documented somewhere.)

really the problem is that in the LOKDocView the "static void lok_doc_view_destroy (GtkWidget* widget)" calls "priv->m_pOffice->pClass->destroy (priv->m_pOffice)" - but this must be called only once per process, when the process shuts down, not when just some view is destroyed.

because of this i'll disable the unit test on master now.
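To illustrate the "once per process, not once per view" rule described above, here is a minimal C++ sketch. It is not LibreOfficeKit code: FakeOffice, OfficeLifetime and their methods are hypothetical names invented for this example, and the real LOKDocView is GObject-based C code with a different shape.

```cpp
// Hypothetical sketch only: none of these names are LibreOfficeKit API.
#include <atomic>
#include <cassert>

// Stand-in for the process-wide office object; counts destroy() calls.
struct FakeOffice {
    int destroyCalls = 0;
    void destroy() { ++destroyCalls; }
};

// Decides when the office may be torn down.
class OfficeLifetime {
public:
    explicit OfficeLifetime(FakeOffice& office) : m_office(office) {}

    // A single view going away deliberately leaves the office alone.
    void onViewDestroyed() {
        // Views are expected to be gone before the office is shut down.
        assert(!m_destroyed && "view destroyed after office shutdown");
        ++m_viewsDestroyed;
    }

    // Process shutdown destroys the office, and does so at most once.
    void onProcessShutdown() {
        if (!m_destroyed.exchange(true))
            m_office.destroy();
    }

    int viewsDestroyed() const { return m_viewsDestroyed; }

private:
    FakeOffice& m_office;
    std::atomic<bool> m_destroyed{false};
    int m_viewsDestroyed = 0;
};
```

With this shape, destroying any number of views leaves destroyCalls at 0, and repeated shutdown calls leave it at 1.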
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4f05fdffbe4483ae0a466a6460b63560c3fb45ca

tdf#113311 disable CppunitTest_libreofficekit_tiledrendering for now

It will be available in 6.0.0.

The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
feel free to find a working solution
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT
- Update the version field
- Reply via email (please reply directly on the bug tracker)
- Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword

Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug
the situation is the same as it was when the bug was filed
I wonder if this is still a problem. It is a bit scary that this unit test has been disabled for so long. I wonder whether the affected stakeholders were even aware of it; I certainly wasn't. I will investigate a bit.
Michael, any memory of how often those deadlocks occurred?
i have no memory of that but the description indicates it wasn't actually a deadlock but a use-after-free locking of a global pthread_mutex - how often that happens to cause a deadlock is anybody's guess; perhaps try implementing the suggestion to call _exit() and avoid global dtors that way :)
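A small sketch of what the _exit() suggestion relies on: exit() runs static destructors and atexit handlers, while _exit() terminates the process without running them. The example is POSIX-only, and every name in it (Global, g_fd, destructorRanInChild) is hypothetical, made up for illustration. It forks a child so that the process doing the experiment stays alive to observe the result.

```cpp
// Sketch (POSIX only): does a static destructor run under exit() vs _exit()?
// All names below are hypothetical and exist only for this illustration.
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

namespace {

int g_fd = -1; // write end of the pipe; only ever set in the child

// Stand-in for a global with a non-trivial destructor (e.g. a global mutex).
struct Global {
    ~Global() {
        if (g_fd >= 0) {
            const char c = 'D';
            ssize_t ignored = write(g_fd, &c, 1); // report "destructor ran"
            (void)ignored;
        }
    }
};

Global g_instance;

} // namespace

// Forks a child that terminates via _exit() or exit(); returns true if the
// child's copy of g_instance had its destructor run during termination.
bool destructorRanInChild(bool useUnderscoreExit) {
    int fds[2];
    if (pipe(fds) != 0)
        throw std::runtime_error("pipe failed");
    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        throw std::runtime_error("fork failed");
    }
    if (pid == 0) {            // child
        close(fds[0]);
        g_fd = fds[1];
        if (useUnderscoreExit)
            _exit(0);          // skips static destructors and atexit handlers
        std::exit(0);          // runs static destructors and atexit handlers
    }
    close(fds[1]);             // parent keeps only the read end
    char c = 0;
    const ssize_t n = read(fds[0], &c, 1); // 0 at EOF if nothing was written
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    assert(WIFEXITED(status));
    return n == 1 && c == 'D';
}
```

Calling destructorRanInChild(false) exercises the exit() path and destructorRanInChild(true) the _exit() path. The trade-off of _exit() is that nothing registered for shutdown runs at all, including flushing of stdio buffers, so a client taking that route would have to flush and close whatever it cares about itself.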