Description:
SalUserEventList::isFrameAlive hang after crash

Steps to Reproduce:
1. Open attachment 142401 [details] from bug 117896
2. Open Writer
3. Indent
4. Undo
5. Redo -> Crash (steps 2-5 working starting from 6.3)
6. Crash notifier appears. Press OK -> Process still alive, consuming 100% CPU

Can't reproduce it in the bibisect repo, only in the installed version.

Actual Results:
Process stays alive

Expected Results:
Process quits

Reproducible: Always

User Profile Reset: Yes

Additional Info:
Version: 7.1.0.0.alpha0+ (x64)
Build ID: 191288d6a7fb52b31038a21c4e71ee57ffa3bacd
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: nl-NL (nl_NL); UI: nl-NL
Calc: CL
Created attachment 161741 [details] Very sleepy profile stack
I can actually reproduce it in the bibisect build. Accept the crash notifier's OK with Enter. Sometimes you need to do it for two or three rounds before the problem occurs.
(In reply to Telesto from comment #0)
> Description:
> SalUserEventList::isFrameAlive hang after crash
>
> Steps to Reproduce:
> 1. Open attachment 142401 [details] from bug 117896

The mentioned file is an xlsx file. Please indicate which file needs to be used.
*** Bug 131677 has been marked as a duplicate of this bug. ***
(In reply to Xisco Faulí from comment #3)
> (In reply to Telesto from comment #0)
> > Description:
> > SalUserEventList::isFrameAlive hang after crash
> >
> > Steps to Reproduce:
> > 1. Open attachment 142401 [details] from bug 117896
>
> Mentioned file is a xlsx file. Please, indicate which file needs to be used

Hmm, let's keep it more basic.

1. Open a Writer document
2. Open another Writer document
3. Indent
4. Undo
5. Redo -> Crash (expected)
6. Process still alive -> SalUserEventList::isFrameAlive consuming 19% CPU; it needs to be killed from a task manager
Created attachment 161816 [details] Bibisect attempt (wrong)

Bad is certain, good not so much; I took a wrong turn somewhere. It must be before 29 March, based on my previous report (duplicate).
For me it just opens LibreOffice again after the crash. Do you mean you see SalUserEventList::isFrameAlive under the soffice.bin process somehow? Tested with and without Skia. Version: 7.1.0.0.alpha0+ (x64) Build ID: 7f6d7a0eb624d67421cd5af6462ee2a662fdff3a CPU threads: 4; OS: Windows 10.0 Build 18362; UI render: default; VCL: win Locale: fi-FI (fi_FI); UI: en-US Calc: threaded
(In reply to Buovjaga from comment #7)
> For me it just opens LibreOffice again after the crash. Do you mean you see
> SalUserEventList::isFrameAlive under the soffice.bin process somehow?
>
> Tested with and without Skia.
>
> Version: 7.1.0.0.alpha0+ (x64)
> Build ID: 7f6d7a0eb624d67421cd5af6462ee2a662fdff3a
> CPU threads: 4; OS: Windows 10.0 Build 18362; UI render: default; VCL: win
> Locale: fi-FI (fi_FI); UI: en-US
> Calc: threaded

Did you try one or two rounds? It does not always happen. SalUserEventList::isFrameAlive is the only process alive, looping, according to Very Sleepy CS (quite a nice tool for getting some insight into what LibreOffice is doing, where it is spending time, or whether a hang bug is the same as another one or not).

SalUserEventList::isFrameAlive
SalBitmap::~SalBitmap
Bitmap::~Bitmap
Image::ImplInit
Image::~Image
com_sun_star_form_OTimeModel_get_implementation
unit_lok_process_events_to_idle
execute_onexit_table

Somehow a process is executed on exit.
(In reply to Telesto from comment #8)
> Did you try one or two rounds? It does not always happen..

Yes, several.

> SalUserEventList::isFrameAlive is the only process alive.. looping.. Very
> Sleepy CS (quite a nice tool.. to get some insight what the LibreOffice
> doing.. where it is spending time.. or if a hang bug is same or not.. )

Does this imply step 0 is using Very Sleepy? You can't see this in Windows Task Manager (i.e. a hanging soffice.bin)?

Is the behaviour different for you and me in that LibreOffice does not restart for you after the recovery window?
(In reply to Buovjaga from comment #9)
> (In reply to Telesto from comment #8)
> > Did you try one or two rounds? It does not always happen..
>
> Yes, several.
>
> > SalUserEventList::isFrameAlive is the only process alive.. looping.. Very
> > Sleepy CS (quite a nice tool.. to get some insight what the LibreOffice
> > doing.. where it is spending time.. or if a hang bug is same or not.. )
>
> Does this imply step 0 is using Very Sleepy? You can't see this in Windows
> Task Manager (ie. a hanging soffice.bin)?
>
> Is the behaviour different for you and me in that LibreOffice does not
> restart for you after the recovery window?

soffice.bin is hanging (but still busy at around 25% CPU). There is only one thread left, in SalUserEventList::isFrameAlive. All other threads and internal processes are dead already. There is no restart and no LibreOffice window, only a running process which needs to be killed manually. If you attempt to launch LibreOffice again, you get a boot/launch loop: splash screen, splash screen, splash screen, until the older process is killed.

A bibisect is awfully hard. Because of the timing aspect, I assume.
See also bug 134674
@Xisco
Are you able to reproduce this? (I'm not the only one: bug 134674)

1. Open a Writer document
2. Open another Writer document
3. Indent
4. Undo
5. Redo -> Crash (expected)
6. Process still alive -> SalUserEventList::isFrameAlive consuming 19% CPU; it needs to be killed from a task manager

Sometimes it takes 2-3 rounds of crashing before the issue starts.
Created attachment 162843 [details] BT with symbols VTune Profiler based on x39 build
Created attachment 162844 [details] Flush data-stream BT
(In reply to Telesto from comment #12)
> @Xisco
> Are you able to reproduce this (I'm not the only one: bug 134674)
>
> 1. Open Writer document
> 2. Open Another writer document
> 3. Indent
> 4. Undo
> 5. Redo -> Crash - (expected)
> 6. Process still alive -> SalUserEventList::isFrameAlive consuming 19% CPU..
> needs be killed from a task manager
>
> Sometimes it takes 2-3 rounds of crashing before the issue starts

Not reproducible with the bisect repository nor with a local build.

Version: 7.1.0.0.alpha0+
Build ID: 358674b87b8d9cd78079fb105aa81b50f4b5029b
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
(In reply to Xisco Faulí from comment #15) I forgot to add, Windows only
*** Bug 132219 has been marked as a duplicate of this bug. ***
Created attachment 163711 [details] Bibisect log

Following range:

5dca5309207b6b3cd5bed68da47223e08a3ac3f8 is the first bad commit
commit 5dca5309207b6b3cd5bed68da47223e08a3ac3f8
Author: Norbert Thiebaud <nthiebaud@gmail.com>
Date: Tue Mar 17 16:44:48 2020 -0700

    source 6a05f8810684024303047ac9105be4ff5ae8c536

    source 6a05f8810684024303047ac9105be4ff5ae8c536
    source 4e755e622f2d782d657626b6234fb3acd3d08e15
    source 7af11e2e051eedd790e0ed8c8ac0e1e667c1001b
    source d1a4f95def7d65165a992613784564c02b1c76bb
    source 9323307d675b71c501534ee98872a2f00b465bc2
    source d08d5c1857482cb3789ed2896921abeb83f4d217
    source 8d9d6d43a9d895eb781a7fb7f47b7e4342883829
    source 0c225c7c2b47d7ec57ab7f3f2a900aaac78031d0
    source b389b5958787b142a42d95744f46ccc9b94cf0a9
    source 7520e2b2126c05aadb738256313d2f250b9ded62
    source 24973523ba59087185d434396fd614e73d72107f
Created attachment 163712 [details] Bibisect log

1. Open the attached file
2. CTRL+A
3. CTRL+C
4. CTRL+N
5. CTRL+V
6. CTRL+Z -> Crash

Usually two runs are needed.
@Roman
Did you encounter a hang after a crash too, IIRC?
(In reply to Telesto from comment #20)
> @Roman
> You did encounter a hang after crash too? IIRC

I saw a hang after a crash in my own crash bug report (a Skia problem with the Fade slide transition effect). I can test your example later.
1. Download https://wiki.documentfoundation.org/images/f/f3/GS50-GettingStartedLO.odt
2. Edit -> Track changes -> Compare documents -> Open the attached file (totally different)
3. LibreOffice stops functioning. No termination; debugger still running.
Created attachment 163876 [details] second file for comparison crash
@Jan-Marek
The deadlock situation you're working on (Unlock scheduler in deinit for ProcessEventsToIdle) brings this bug to mind (with its duplicates), and more distantly bug 135073 and bug 131681.

For me, a lingering soffice.bin after a crash only occurs on the second and following crashes.

1. Open a bug doc able to crash LibO
2. Let it crash -> file recovery dialog and such appears (everything OK)
3. Make it crash again with the same bug doc
4. Now we see a lingering soffice.bin at 25% CPU.
(In reply to Telesto from comment #24)
> @Jan-Marek..
> The deadlock situation you're working on: Unlock scheduler in deinit for
> ProcessEventsToIdle bring this bug (with duplicates) (and more distant bug
> 135073 bug 131681)

The Scheduler code Mike and I are working on has been there for years:

commit fd0fff67798fea87217e65bb1561aa0d0e741c51
Author: Jan-Marek Glogowski <glogow@fbihome.de>
Date: Fri Jul 28 17:13:20 2017 +0200

    Assert active Tasks on scheduler de-init

This ProcessEventsToIdle() is just enabled for a debug build, as this is just some debug facility to verify the active static Tasks list. Normally, at the point of Scheduler::ImplDeInitScheduler, no more Tasks will be processed, but the list of pending Tasks can be quite long (which may be a problem in itself).

> For me lingering soffice.bin after crash only occurs on second and following
> crashes..
>
> 1. open a bug doc able to crash LibO
> 2. Let it crash -> file recovery dialog and such appears (everything OK)
> 3. Make it crash again with same bug doc
> 4. Now we see a lingering soffice.bin at 25% CPU.

This whole bug report doesn't make sense. If LO really crashes, the process will be gone, so there is no way to "hang" in SalUserEventList::isFrameAlive. Maybe the LO process has not crashed but is somehow deadlocked to begin with? No idea how the watchdog is handling this.
(In reply to Jan-Marek Glogowski from comment #25)
> This whole bug report doesn't make sense. If LO really crashes, the process
> will be gone, so there is no way to "hang" in
> SalUserEventList::isFrameAlive. Maybe the LO process is not crashed but
> somehow deadlocked to begin with? No idea, how the watchdog is handling this.

First off, I misunderstood what you were working on. Clearly a different topic. Me thinking I know something about the topic, while not knowing anything at all :P Mea culpa.

--- About the topic here

The issue is more that I lack adequate terminology/vocabulary.

Take an STR which would normally crash/terminate LibreOffice, either directly without 'grace' or with an intermediary dialog showing up: "LibreOffice crashed storing recovery information now..". Pressing OK should finally kill/terminate the soffice.bin process.

However, in my case the 'termination' doesn't kick in. It works for the first round using a known crasher, but the second 'crash' gets stuck. So you get the "LibreOffice crashed storing recovery information" dialog. You press OK. LibreOffice is gone from the screen, except that the process is still lingering in the task manager (soffice.bin). Launching LibreOffice again while soffice.bin is still active in the background causes a boot loop: splash screen, splash screen, etc., until the older soffice.bin process lingering in the task manager is terminated manually.

Looking inside the lingering process with Very Sleepy shows SalUserEventList::isFrameAlive looping/running (25% CPU usage). Not that the tool is reliable for proper/clean stacks. But it looks like it is searching for a 'frame' which has been killed, which I somehow connect with https://bugs.documentfoundation.org/show_bug.cgi?id=138022 (as it starts somewhere with the Skia implementation). But that's a n00b assessment not to be taken seriously :P

Or it is a deadlock issue or something like that. But as stated at the beginning: I think I know something but probably know squat.
I just checked desktop/win32/source/loader.cxx, the LO watchdog process. It only does a loop of MsgWaitForMultipleObjects(1, &aProcessInfo.hProcess, ...) and then calls GetExitCodeProcess. I guess the loop won't terminate while the process is running, and then we check the exit code to handle some LO-specific exit codes for uncaught exceptions or a normal restart.

Interestingly, we don't check the return value from GetExitCodeProcess, so we won't detect a case of STILL_ACTIVE. Maybe our watchdog is buggy?
Can you check the pid to verify whether it's really the old process or a new one, and it's actually a startup problem?

Not that I have tried to reproduce this yet. Just some curiosity.

----------

(In reply to Telesto from comment #26)
...
> LibreOffice crashed storing recovery information now..

If a process has really crashed, you have no way of "storing recovery information". This is just saved at some interval, while everything is still OK. Maybe LO does some additional stuff for the "uncaught exception" case, but in any case the process state will be compromised, so there is no way to store anything useful (except for the process backtrace, if possible, but that's just a bonus for developers).

P.S. I guess STR = Steps to Reproduce.
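The watchdog observation in the comment above can be illustrated with a small, portable sketch. This is not the code from loader.cxx; it only models the decision a watchdog could make from the two outputs of GetExitCodeProcess (the BOOL return value and the exit code written to its out parameter). The names `ChildState`, `classifyChild`, `shouldHandleExitCode`, `selfCheck` and `kStillActive` are hypothetical; the value 259 is the documented value of STILL_ACTIVE, redefined here so the sketch compiles without the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Documented value of STILL_ACTIVE in the Windows headers; redefined here
// (as an assumption-free constant) so this sketch builds on any platform.
constexpr std::uint32_t kStillActive = 259;

// Hypothetical classification of what a watchdog could conclude after its wait returns.
enum class ChildState { Exited, StillRunning, QueryFailed };

// querySucceeded models the BOOL returned by GetExitCodeProcess;
// exitCode models the DWORD it writes through its out parameter.
inline ChildState classifyChild(bool querySucceeded, std::uint32_t exitCode)
{
    if (!querySucceeded)
        return ChildState::QueryFailed;   // the exit code must not be trusted
    if (exitCode == kStillActive)
        return ChildState::StillRunning;  // child has not terminated yet
    return ChildState::Exited;
}

// Hypothetical helper: only interpret the exit code (restart, crash handling)
// once the child has really exited.
inline bool shouldHandleExitCode(bool querySucceeded, std::uint32_t exitCode)
{
    return classifyChild(querySucceeded, exitCode) == ChildState::Exited;
}

// Small self-check that exercises the three cases.
inline void selfCheck()
{
    assert(classifyChild(true, kStillActive) == ChildState::StillRunning);
    assert(classifyChild(true, 0) == ChildState::Exited);
    assert(classifyChild(false, 0) == ChildState::QueryFailed);
}
```

One caveat of this approach: a child that genuinely exits with code 259 cannot be told apart from a running one by the exit code alone, so a real watchdog would probably also rely on the result of waiting on the process handle.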
Created attachment 167918 [details] Screenshot Document Recovery

(In reply to Jan-Marek Glogowski from comment #27)
> I just checked desktop/win32/source/loader.cxx, the LO watchdog process. It
> only does a loop of MsgWaitForMultipleObjects(1, &aProcessInfo.hProcess,
> ...) and then calls GetExitCodeProcess. I guess the loop won't terminate,
> while the process is running and then we check the exit code to handle some
> LO specific exit codes for uncaught exceptions or a normal restart.
>
> Interestingly we don't check the return value from GetExitCodeProcess, so
> won't detect a case of STILL_ACTIVE. Maybe our watchdog is buggy?
> Can you check the pid to verify, if it's really the old process or a new one
> and it's actually a startup problem?
>
> Not that I yet tried to reproduce this. Just some curiosity.

A) See the screenshot for what I mean by the Document Recovery window with a "graceful" crash, instead of abruptly terminating.

Off-topic note: I still don't see the whole point of that dialog (but that's really off topic).
* It manages to crash again while saving the recovery info in a number of cases, mostly if the crash is caused by a rendering/layout flaw. It is kind of hard to keep showcases of this, as people tend to fix the layout flaws (so you never reach the crash of the recovery dialog; it is kind of masked). I am also speculating that this causes lovely recovery issues.
* It turns up in 40% of the cases or so (again depending on the crasher). In the other 60% it is simply skipped: the lovely abrupt termination.

--
B) It's the old process not exiting (I didn't check the pid specifically, but its memory usage of 600 MB is too different from a newly started process).

C) The screenshot is based on bug 138718 (which also illustrates the 'deadlock' or whatever is happening).

Or am I the only one encountering this on my Windows 8.1 system?

FWIW: as said, the first crash works properly; the failure comes when doing the same thing again (so only on the second and following runs).
After reboot everything is back to normal, until the second crash. It isn't my user profile.. nor limited to specific LibO version.. 7.0 branch and up, I think (but maybe even 6.4?)
Oh, I can confirm this :).

* Take a known document and steps to crash LO. I used the one in bug 126226 and the steps to make it crash (this one has now been fixed in master, but any known crash can be used. This bug bothered me for months before I found out how to reproduce it.)
* Document recovery kicks in and offers to restore the document.
* Choose Discard here; this is the most important step.
* Load the document again and make it crash again.
* This second time the recovery won't kick in, but LO keeps the processor 100% loaded. The only way to stop it is to kill it via Process Manager.

I bibisected this to the same range as comment #18:
https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=24973523ba59087185d434396fd614e73d72107f..6a05f8810684024303047ac9105be4ff5ae8c536

Age | Commit message | Author | Files | Lines
2020-03-04 | Fix typo | Andrea Gelmini | 1 | -2/+2
2020-03-04 | scroll to the row when putting the cursor in it | Caolán McNamara | 1 | -0/+3
2020-03-04 | add iter_previous_sibling | Caolán McNamara | 3 | -0/+16
2020-03-04 | uitest: speed up close_doc() | Miklos Vajna | 1 | -15/+13
2020-03-04 | fix ToC links give unhelpful url popups | Mert Tumer | 2 | -3/+13
2020-03-04 | tdf#125520 enhance internal OLE cloning | Armin Le Grand | 1 | -1/+14
2020-03-04 | Fix typo | Andrea Gelmini | 1 | -1/+1
2020-03-04 | tdf#129796 junk in reference edit box | Noel Grandin | 1 | -1/+1
2020-03-04 | make some symbols private | Noel Grandin | 5 | -25/+19
2020-03-04 | We have had C++11 for some time now | Tor Lillqvist | 4 | -20/+20
Created attachment 171755 [details] Process manager after discarding recovery data and crashing LO the second time
Dear Telesto,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT:
* Update the version field
* Reply via email (please reply directly on the bug tracker)
* Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

If you want to do more to help, you can test to see if your issue is a REGRESSION. To do so:
1. Download and install the oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword

Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug