Termination of a binaryurp bridge (e.g. because the remote process crashes) can cause a DisposedException in a different bridge. The expectation is that a bridge is not affected by the termination of other bridges.

We had 3 processes communicating with each other using binaryurp bridges:

Process A <-> process B
Process B <-> process C

Process A and process C don't talk to each other.

Process A requests process B to execute a method. As part of this method, process B needs to call process C:

A: call doSomethingInProcessB(), waiting for result
B: execute doSomethingInProcessB(), will now call doSomethingInProcessC()
C: idling

Process A is now terminated (due to one of its threads crashing, a kill, ...).

Process B notices that the bridge to process A has terminated and calls ThreadPool::dispose(nDisposeId). ThreadPool::dispose(..) walks through all JobQueues, calling JobQueue::dispose(nDisposeId). Since the doSomethingInProcessB() call is still being processed, the associated JobQueue contains the nDisposeId as the topmost entry in the callstack. JobQueue::dispose(..) finds the disposeId and sets it to 0. It signals m_cndWait so that the bridge can terminate (jobqueue.cxx:143).

Concurrently, the worker thread currently working on doSomethingInProcessB() wants to call doSomethingInProcessC(). The IPC is sent out and JobQueue::enter(..) is called to wait for the result. JobQueue::enter(..) puts a different disposeId onto the callstack (since the call uses a different bridge) and should block on m_cndWait.wait() to wait for the result (jobqueue.cxx:73).

But m_cndWait has already been signalled by JobQueue::dispose(), so JobQueue::enter(..) doesn't block, even though m_lstJobs is still empty (jobqueue.cxx:98). It resets m_cndWait and returns a nullptr, which is converted into a DisposedException by Bridge::makeCall() (bridge.cxx:610), even though the bridge to process C is completely intact at this point in time.
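The interleaving described above can be replayed on a single thread with a small model. This is only a sketch and not the LibreOffice code: ModelJobQueue, pushFrame(), addReply() and the integer bridge ids are hypothetical names, std::optional<int> stands in for the void* result, and a bool guarded by a std::condition_variable stands in for the manual-reset m_cndWait.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>

// Hypothetical, simplified model of the queue from the report (not the real class).
class ModelJobQueue {
public:
    // Simulates an outer enter() whose request is still being executed.
    void pushFrame(std::int64_t disposeId) {
        std::lock_guard<std::mutex> guard(mutex_);
        callstack_.push_front(disposeId);
    }

    // As described: zero the matching ids and signal the condition.
    void dispose(std::int64_t disposeId) {
        std::lock_guard<std::mutex> guard(mutex_);
        for (auto& id : callstack_)
            if (id == disposeId) id = 0;
        if (!callstack_.empty() && callstack_.front() == 0) {
            signalled_ = true;  // manual-reset: stays set until someone resets it
            cond_.notify_all();
        }
    }

    // Delivers a reply for the innermost pending call.
    void addReply(int reply) {
        std::lock_guard<std::mutex> guard(mutex_);
        replies_.push_back(reply);
        signalled_ = true;
        cond_.notify_all();
    }

    // Waits once for a reply. An empty result stands for the nullptr
    // that is later turned into a DisposedException.
    std::optional<int> enter(std::int64_t disposeId) {
        std::unique_lock<std::mutex> lock(mutex_);
        callstack_.push_front(disposeId);
        cond_.wait(lock, [this] { return signalled_; });
        std::optional<int> reply;
        if (callstack_.front() != 0 && !replies_.empty()) {
            reply = replies_.front();
            replies_.pop_front();
        }
        if (replies_.empty() && callstack_.front() != 0)
            signalled_ = false;  // reset, as the report describes
        callstack_.pop_front();
        return reply;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool signalled_ = false;
    std::deque<std::int64_t> callstack_;
    std::deque<int> replies_;
};

// Replays the reported sequence; returns true if the nested call came back empty.
bool staleSignalYieldsEmptyResult() {
    constexpr std::int64_t bridgeA = 1, bridgeC = 2;  // illustrative ids
    ModelJobQueue queue;
    queue.pushFrame(bridgeA);  // B is executing doSomethingInProcessB() for A
    queue.dispose(bridgeA);    // bridge to A dies, condition gets signalled
    std::optional<int> reply = queue.enter(bridgeC);  // nested call to C
    return !reply.has_value();
}
```

In this model, enter(bridgeC) returns immediately with an empty result although nothing ever disposed bridgeC, which matches the behaviour reported above.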
I'd suggest the following fixes:

* JobQueue::enter() should check for job == nullptr after resetting m_cndWait in jobqueue.cxx:98. If so, it should continue waiting instead of returning nullptr. This will avoid the DisposedException, and the call to doSomethingInProcessC() will work correctly.
* JobQueue::enter() should check for m_lstCallstack == 0 and m_lstJob.empty() after processing a request (jobqueue.cxx:109). This will ensure that the bridge will correctly terminate once doSomethingInProcessB() has finished.
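A rough sketch of the first suggestion, again on a simplified model and not on the real jobqueue.cxx: PatchedJobQueue, pushFrame(), addReply() and the integer bridge ids are hypothetical, and a bool guarded by a std::condition_variable stands in for m_cndWait. The only behavioural change is that enter() keeps waiting when it wakes up without a job while its own callstack entry is still non-zero. The second suggestion (terminating the outer frame afterwards) is not modelled here.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <thread>

// Hypothetical, simplified model with the first suggested fix applied.
class PatchedJobQueue {
public:
    // Simulates an outer enter() whose request is still being executed.
    void pushFrame(std::int64_t disposeId) {
        std::lock_guard<std::mutex> guard(mutex_);
        callstack_.push_front(disposeId);
    }

    // Zero the matching ids and signal the condition.
    void dispose(std::int64_t disposeId) {
        std::lock_guard<std::mutex> guard(mutex_);
        for (auto& id : callstack_)
            if (id == disposeId) id = 0;
        if (!callstack_.empty() && callstack_.front() == 0) {
            signalled_ = true;
            cond_.notify_all();
        }
    }

    // Delivers a reply for the innermost pending call.
    void addReply(int reply) {
        std::lock_guard<std::mutex> guard(mutex_);
        replies_.push_back(reply);
        signalled_ = true;
        cond_.notify_all();
    }

    std::optional<int> enter(std::int64_t disposeId) {
        std::unique_lock<std::mutex> lock(mutex_);
        callstack_.push_front(disposeId);
        std::optional<int> reply;
        while (true) {
            cond_.wait(lock, [this] { return signalled_; });
            if (callstack_.front() == 0)
                break;  // this call's own bridge really was disposed
            if (replies_.empty()) {
                // Woken by a signal meant for an outer frame:
                // reset and continue waiting instead of returning empty.
                signalled_ = false;
                continue;
            }
            reply = replies_.front();
            replies_.pop_front();
            if (replies_.empty())
                signalled_ = false;
            break;
        }
        callstack_.pop_front();
        return reply;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool signalled_ = false;
    std::deque<std::int64_t> callstack_;
    std::deque<int> replies_;
};

// Same sequence as in the report, but process C answers a little later.
int replyAfterStaleSignal() {
    constexpr std::int64_t bridgeA = 1, bridgeC = 2;  // illustrative ids
    PatchedJobQueue queue;
    queue.pushFrame(bridgeA);  // B is executing doSomethingInProcessB() for A
    queue.dispose(bridgeA);    // bridge to A dies, condition gets signalled
    std::thread processC([&queue] {
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        queue.addReply(42);    // reply from doSomethingInProcessC()
    });
    std::optional<int> reply = queue.enter(bridgeC);
    processC.join();
    return reply.value_or(-1);
}
```

Note that resetting the flag discards the wake-up that was meant for the disposed outer frame. In this model nothing re-signals it, which is presumably why the second suggestion is needed as well.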
Hello! Do you still want to implement your proposal? If so, please get in touch on the IRC channel #libreoffice-dev. You can also try to create a patch yourself using the information on this page: https://wiki.documentfoundation.org/Development/GetInvolved
Dear Marc-Oliver Straub, This bug has been in NEEDINFO status with no change for at least 6 months. Please provide the requested information as soon as possible and mark the bug as UNCONFIRMED. Due to regular bug tracker maintenance, if the bug is still in NEEDINFO status with no change in 30 days the QA team will close the bug as INSUFFICIENTDATA due to lack of needed information. For more information about our NEEDINFO policy please read the wiki located here: https://wiki.documentfoundation.org/QA/Bugzilla/Fields/Status/NEEDINFO If you have already provided the requested information, please mark the bug as UNCONFIRMED so that the QA team knows that the bug is ready to be confirmed. Thank you for helping us make LibreOffice even better for everyone! Warm Regards, QA Team MassPing-NeedInfo-Ping
Yes, I plan to provide a fix for this issue soon.
Dear Marc-Oliver Straub,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT:

* Update the version field
* Reply via email (please reply directly on the bug tracker)
* Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:

1. Download and install the oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword

Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug