Bug 113311 - lifecycle of LibreOfficeKit as used by LOKDocView is broken by design
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 6.0.0.0.alpha0+ (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:6.0.0
Keywords:
Depends on:
Blocks: LOKDocView
Reported: 2017-10-20 20:13 UTC by Michael Stahl (allotropia)
Modified: 2022-09-15 07:16 UTC (History)
Description Michael Stahl (allotropia) 2017-10-20 20:13:33 UTC
this deadlock just happened in CppunitTest_libreofficekit_tiledrendering:

#0  0x00007fa3811fb09d in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x00007fa3811f3e23 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2  0x00007fa3822f5a16 in osl_acquireMutex(oslMutexImpl*) (pMutex=0x2d32f40) at /work/lo/master/sal/osl/unx/mutex.cxx:97
#3  0x00007fa37764b065 in osl::Mutex::acquire() (this=0x7fa36948f780 <rtl::Static<osl::Mutex, GrammarCheckingIterator::MyMutex>::get()::instance>) at /work/lo/master/include/osl/mutex.hxx:56
#4  0x00007fa3776629c2 in osl::ClearableGuard<osl::Mutex>::ClearableGuard(osl::Mutex&) (this=0x7fffcfc0df18, t=...) at /work/lo/master/include/osl/mutex.hxx:163
#5  0x00007fa3776610c5 in comphelper::OInterfaceContainerHelper2::disposeAndClear(com::sun::star::lang::EventObject const&) (this=0x7fa357bb3110, rEvt=...) at /work/lo/master/comphelper/source/container/interfacecontainer2.cxx:260
#6  0x00007fa369197174 in GrammarCheckingIterator::dispose() (this=0x7fa357bb2f50) at /work/lo/master/linguistic/source/gciterator.cxx:907
#7  0x00007fa351bd7191 in (anonymous namespace)::doDispose(com::sun::star::uno::Reference<com::sun::star::linguistic2::XProofreadingIterator> const&) (inst=uno::Reference to (GrammarCheckingIterator *) 0x7fa357bb2f78) at /work/lo/master/sw/source/core/bastyp/proofreadingiterator.cxx:34
#8  0x00007fa351bd7327 in sw::proofreadingiterator::dispose() () at /work/lo/master/sw/source/core/bastyp/proofreadingiterator.cxx:66
#9  0x00007fa351bd1d3e in FinitCore() () at /work/lo/master/sw/source/core/bastyp/init.cxx:672
#10 0x00007fa35284e394 in SwDLL::~SwDLL() (this=0x2b9b7e0, __in_chrg=<optimized out>) at /work/lo/master/sw/source/uibase/app/swdll.cxx:163
#11 0x00007fa35284f634 in std::default_delete<SwDLL>::operator()(SwDLL*) const (this=0x7fa353adea98 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance+8>, __ptr=0x2b9b7e0) at /usr/include/c++/7/bits/unique_ptr.h:78
#12 0x00007fa35284f8e9 in std::unique_ptr<SwDLL, std::default_delete<SwDLL> >::reset(SwDLL*) (this=0x7fa353adea98 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance+8>, __p=0x2b9b7e0) at /usr/include/c++/7/bits/unique_ptr.h:376
#13 0x00007fa35284f3a1 in comphelper::unique_disposing_ptr<SwDLL>::reset(SwDLL*) (this=0x7fa353adea90 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance>, p=0x0) at /work/lo/master/include/comphelper/unique_disposing_ptr.hxx:42
#14 0x00007fa35284ec7e in comphelper::unique_disposing_solar_mutex_reset_ptr<SwDLL>::reset(SwDLL*) (this=0x7fa353adea90 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance>, p=0x0) at /work/lo/master/include/comphelper/unique_disposing_ptr.hxx:171
#15 0x00007fa35284e5fa in comphelper::unique_disposing_solar_mutex_reset_ptr<SwDLL>::~unique_disposing_solar_mutex_reset_ptr() (this=0x7fa353adea90 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance>, __in_chrg=<optimized out>) at /work/lo/master/include/comphelper/unique_disposing_ptr.hxx:177
#16 0x00007fa35284e460 in (anonymous namespace)::SwDLLInstance::~SwDLLInstance() (this=0x7fa353adea90 <rtl::Static<(anonymous namespace)::SwDLLInstance, (anonymous namespace)::theSwDLLInstance>::get()::instance>, __in_chrg=<optimized out>) at /work/lo/master/sw/source/uibase/app/swdll.cxx:57
#17 0x00007fa381647c68 in __run_exit_handlers () at /lib64/libc.so.6
#18 0x00007fa381647cba in  () at /lib64/libc.so.6
#19 0x00007fa38162d511 in __libc_start_main () at /lib64/libc.so.6
#20 0x000000000040a1ba in _start ()

actually it's not really a deadlock - the mutex being locked
is a global variable that has already been destroyed - usually
this only prints a warning like
"pthread_mutex_lock failed: Invalid argument"
but apparently sometimes the freed memory looks like a valid
mutex that is already locked.

this is a regression from:

commit a99707d2c4f65a6a5fe160ce2b614aca273f0d2d
Author:     Caolán McNamara <caolanm@redhat.com>
AuthorDate: Mon May 8 15:26:22 2017 +0100

    Resolves: rhbz#144437 make gnome-documents not crash the whole time
    
    accept that once initted that LibreOffice cannot be deinitted and reinited
    (without lots of work), but allow the main loop to quit and restart so LOKs
    thread can run and exit successfully, new LOK connections will restart the main
    loop.
    
    The buckets of global state continues to be valid the whole time this way


... which turned framework::Desktop::terminate() into a no-op when
called from LOKit.

this is clearly the wrong solution to the problem - now there is no
way for a LibreOfficeKit client to actually destroy all the thousands
of global variables, so unless the client uses _exit(), they
will sometimes crash or deadlock during shutdown.

(i would not be opposed to requiring LOKit clients to use _exit(),
but that should be documented somewhere.)
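
A minimal sketch of why _exit() avoids the problem, assuming a POSIX system. It is not LibreOffice code: NoisyGlobal, g_noisy and globalDtorRunsInChild are hypothetical names standing in for "some global with a destructor". A forked child ends with either exit() or _exit(), and the parent checks through a pipe whether the global's destructor ran.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <stdexcept>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for one of the many globals with a destructor.
// The destructor reports that it ran by writing one byte to fd.
struct NoisyGlobal {
    int fd = -1;
    ~NoisyGlobal() {
        if (fd >= 0) {
            const char mark = 'D';
            ssize_t written = write(fd, &mark, 1);
            (void)written;
        }
    }
};

NoisyGlobal g_noisy;

// Forks a child that terminates with exit() or _exit() and returns
// true if the child ran the destructor of g_noisy while terminating.
bool globalDtorRunsInChild(bool useUnderscoreExit) {
    int fds[2];
    if (pipe(fds) != 0)
        throw std::runtime_error("pipe failed");
    std::fflush(nullptr); // avoid duplicating buffered stdio output in the child
    const pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        throw std::runtime_error("fork failed");
    }
    if (pid == 0) {
        close(fds[0]);
        g_noisy.fd = fds[1]; // only the child's copy of the global reports
        if (useUnderscoreExit)
            _exit(0);        // skips atexit handlers and static destructors
        std::exit(0);        // runs atexit handlers and static destructors
    }
    close(fds[1]);
    int status = 0;
    waitpid(pid, &status, 0);
    assert(WIFEXITED(status));
    char buf = 0;
    const ssize_t n = read(fds[0], &buf, 1); // 0 bytes (EOF) if nothing was written
    close(fds[0]);
    return n == 1 && buf == 'D';
}
```

With exit() the destructor runs in the child, with _exit() it does not. A client that goes this route has to flush and save its own state before calling _exit(), since nothing is cleaned up afterwards.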

really the problem is that in the LOKDocView the
"static void lok_doc_view_destroy (GtkWidget* widget)"
calls "priv->m_pOffice->pClass->destroy (priv->m_pOffice)"
- but this must be called only once per process, when
the process shuts down, not when just some view is destroyed.
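
One way to picture the intended lifecycle, as a sketch only: OfficeRegistry and its methods are hypothetical names, not part of LibreOfficeKit or LOKDocView, and the destroy callback stands in for the "pClass->destroy" call quoted above. Destroying a view only drops that view; the office is torn down by a separate shutdown step that takes effect at most once.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical process-wide bookkeeping for one office instance.
class OfficeRegistry {
public:
    // destroyOffice stands in for the real office teardown call.
    explicit OfficeRegistry(std::function<void()> destroyOffice)
        : m_destroy(std::move(destroyOffice)) {}

    // Called when a view starts using the office.
    void viewCreated() {
        std::lock_guard<std::mutex> guard(m_mutex);
        ++m_views;
    }

    // Called from a view's destroy handler: drops the view, never the office.
    void viewDestroyed() {
        std::lock_guard<std::mutex> guard(m_mutex);
        assert(m_views > 0);
        --m_views;
    }

    // Called when the process shuts down. Tears the office down only if no
    // view is left and it has not been torn down before; returns true if
    // this call performed the teardown.
    bool shutdown() {
        std::lock_guard<std::mutex> guard(m_mutex);
        if (m_destroyed || m_views != 0)
            return false;
        m_destroyed = true;
        m_destroy();
        return true;
    }

private:
    std::mutex m_mutex;
    std::function<void()> m_destroy;
    int m_views = 0;
    bool m_destroyed = false;
};
```

In this sketch a second shutdown() is a no-op, so the teardown cannot run twice in one process, and closing an individual view never triggers it.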

because of this i'll disable the unit test on master now.
Comment 1 Commit Notification 2017-10-20 20:18:33 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4f05fdffbe4483ae0a466a6460b63560c3fb45ca

tdf#113311 disable CppunitTest_libreofficekit_tiledrendering for now

It will be available in 6.0.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 2 Caolán McNamara 2017-10-23 10:09:00 UTC
feel free to find a working solution
Comment 3 QA Administrators 2019-03-30 06:12:57 UTC Comment hidden (obsolete)
Comment 4 Michael Stahl (allotropia) 2019-04-01 10:22:24 UTC
the situation is the same as it was when the bug was filed
Comment 5 How can I remove my account? 2020-09-04 08:13:03 UTC
I wonder if this is still a problem. It is a bit scary that that unit test has been disabled for so long. Wonder if affected stakeholders even have been aware, I certainly wasn't. Will investigate a bit.
Comment 6 How can I remove my account? 2020-09-04 08:24:10 UTC
Michael, any memory of how often those deadlocks occurred?
Comment 7 Michael Stahl (allotropia) 2020-09-14 10:03:49 UTC
i have no memory of that but the description indicates it wasn't actually a deadlock but a use-after-free locking of a global pthread_mutex - how often that happens to cause a deadlock is anybody's guess; perhaps try implementing the suggestion to call _exit() and avoid global dtors that way :)
Comment 8 QA Administrators 2022-09-15 03:46:10 UTC
Dear Michael Stahl (CIB),

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug