Bug 105062 - Firebird: Trying to close Firebird-DB without saving leads to hang of LO
Status: RESOLVED DUPLICATE of bug 107039
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base
Version: 5.4.0.0.alpha0+ (earliest affected)
Hardware: x86-64 (AMD64) Mac OS X (All)
Importance: highest critical
Assignee: Tor Lillqvist
URL:
Whiteboard:
Keywords: haveBacktrace
Depends on:
Blocks: Database-Firebird-Default
Reported: 2017-01-03 07:52 UTC by Robert Großkopf
Modified: 2018-01-01 10:01 UTC (History)
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
stack trace (680.80 KB, text/plain)
2017-01-05 15:45 UTC, Alex Thurgood
MacOS bt of autorecovery mutex deadlock (25.25 KB, text/plain)
2017-10-16 16:54 UTC, Jan-Marek Glogowski

Description Robert Großkopf 2017-01-03 07:52:00 UTC
Open a Firebird database, for example https://bugs.documentfoundation.org/attachment.cgi?id=129916
Open a table of this database.
Write something into the table, change data ...
Close the table.
Now try to close the *.odb file.
A popup appears asking whether the data should be saved.
Try to save the data.

On my systems this leads to a hang every time. I can do nothing except kill LO and restart it.

Tested with
Version: 5.4.0.0.alpha0+
Build ID: 2a4cd80abcf9e515d1ce3b3a944b573bdc42bff2
CPU Threads: 4; OS Version: Linux 4.1; UI Render: default; VCL: kde4; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2016-12-22_00:18:04
Locale: de-DE (de_DE.UTF-8); Calc: group
Comment 1 Buovjaga 2017-01-05 09:14:52 UTC
Tried both Save and Don't save, but no hang.

Maybe try a newer build to be sure.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.4.0.0.alpha0+
Build ID: 1a58cdf8af1aba52ce0a376666dd7d742234d7cf
CPU Threads: 8; OS Version: Linux 4.8; UI Render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on January 4th 2017
Comment 2 Robert Großkopf 2017-01-05 10:11:49 UTC
Have tested it with
Version: 5.4.0.0.alpha0+
Build ID: a3cf075880db31f77cd0550e0ee25eca931c6a40
CPU Threads: 4; OS Version: Linux 4.1; UI Render: default; VCL: kde4; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2017-01-05_01:21:50
Locale: de-DE (de_DE.UTF-8); Calc: group
on OpenSUSE 42.1 64bit rpm Linux.

Same buggy behavior:
Save changes to document “Firebird_3” before closing?
appears.
I click "Save" and LO hangs.
I click "Don't Save" and LO closes the Firebird database file.
I click "Cancel" and get back to the Firebird database file.

I have to kill the process libreofficebase_dev.
Comment 3 Buovjaga 2017-01-05 10:22:26 UTC
Maybe you could grab a debug build and try to get a backtrace of the hang http://dev-builds.libreoffice.org/daily/master/Linux-rpm_deb-x86_64@70-TDF-dbg/current/
https://wiki.documentfoundation.org/QA/BugReport/Debug_Information#GNU.2FLinux:_How_to_get_a_backtrace

Note: "If its a hang, you will need to force a crash"
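
For reference, one common way to get a backtrace from a process that hangs instead of crashing is to attach gdb to it and dump the stacks of all threads. The sketch below only builds that command line. It is not copied from the wiki page linked above, the helper name is hypothetical, and the PID of the hung process has to be looked up separately.

```python
from typing import List


def gdb_backtrace_command(pid: int) -> List[str]:
    """Build a gdb command line that attaches to a running process,
    prints the backtrace of every thread, and then exits.

    Hypothetical helper for illustration only; the caller must supply
    the PID of the hung process.
    """
    return [
        "gdb",
        "-p", str(pid),                # attach to the running process
        "-batch",                      # run the commands below, then exit
        "-ex", "thread apply all bt",  # backtrace of every thread
    ]


if __name__ == "__main__":
    # 12345 is a placeholder PID, not a value from this report.
    print(" ".join(gdb_backtrace_command(12345)))
```

The resulting list could be passed to subprocess.run(..., capture_output=True, text=True) so that the output can be saved and attached to the bug.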
Comment 4 Robert Großkopf 2017-01-05 11:36:15 UTC
(In reply to Buovjaga from comment #3)
> Maybe you could grab a debug build and try to get a backtrace of the hang
> http://dev-builds.libreoffice.org/daily/master/Linux-rpm_deb-x86_64@70-TDF-
> dbg/current/

This build is too big for my internet connection (6 Mbit/s). I will get a faster one next month ...
Comment 5 Alex Thurgood 2017-01-05 15:23:21 UTC
Hmm, thought this had been reported by Julien already, but guess not...
Comment 6 Alex Thurgood 2017-01-05 15:39:47 UTC
Confirming on my own master build 540alpha
Comment 7 Alex Thurgood 2017-01-05 15:45:50 UTC
Created attachment 130178 [details]
stack trace

From the trace, it looks like a mutex wait/release lock problem again.
Comment 8 Julien Nabet 2017-10-11 08:09:18 UTC
Following Jan-Marek's scheduling patches from the last few weeks, any update here with a recent build?
Comment 9 Robert Großkopf 2017-10-14 08:30:49 UTC
(In reply to Julien Nabet from comment #8)
> Following Jan-Marek's scheduling patches from the last few weeks, any
> update here with a recent build?

I am sorry. Tested with
Version: 6.0.0.0.alpha0+
Build ID: 3b21902aa85df7631c9efb20dd408df005295b22
CPU threads: 4; OS: Linux 4.4; UI render: default; VCL: kde4; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2017-10-13_22:43:31
Locale: de-DE (de_DE.UTF-8); Calc: group
but nothing changed. LO still hangs when I try to close the database with unsaved data.
Comment 10 Julien Nabet 2017-10-14 08:42:54 UTC
With master sources updated today on Debian (gtk3 or kde4 rendering) I fail to reproduce this :-(
Comment 11 Alex Thurgood 2017-10-16 10:12:51 UTC
Not fixed, still reproducible for me on master:

Version: 6.0.0.0.alpha0+
Build ID: 643e9001bff137b6e5a8784d9e1f25a51e0d1644
CPU threads: 4; OS: Mac OS X 10.13; UI render: default; 
Locale: fr-FR (fr_FR.UTF-8); Calc: group
Comment 12 Alex Thurgood 2017-10-16 10:22:54 UTC
Am seeing this when I escape out of the lldb run after LO hangs:

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff6c7beeae libsystem_kernel.dylib`__psynch_mutexwait + 10
libsystem_kernel.dylib`__psynch_mutexwait:
->  0x7fff6c7beeae <+10>: jae    0x7fff6c7beeb8            ; <+20>
    0x7fff6c7beeb0 <+12>: movq   %rax, %rdi
    0x7fff6c7beeb3 <+15>: jmp    0x7fff6c7b676c            ; cerror_nocancel
    0x7fff6c7beeb8 <+20>: retq
Comment 13 Alex Thurgood 2017-10-16 10:28:14 UTC
I don't seem to be able to get a backtrace of any sort unless I force quit LO from the Dock on OSX.
Comment 14 Julien Nabet 2017-10-16 12:02:34 UTC
Jan-Marek: I don't know whether this could be related to the scheduling part, but perhaps you have some idea here?
FYI, you must enable experimental features to use Firebird.
Comment 15 Robert Großkopf 2017-10-16 14:04:03 UTC
It seems this isn't a Firebird-specific problem. I get the same hang when setting up a database connected to a Calc file or a Writer file. If I change something and don't save it, LO hangs when I try to close the database.

Tested with Version: 6.0.0.0.alpha0+
Comment 16 Jan-Marek Glogowski 2017-10-16 16:54:28 UTC
Created attachment 137017 [details]
MacOS bt of autorecovery mutex deadlock

Interesting that I couldn't reproduce this on Linux...

So on Mac we have another deadlock in the DB code:

Two threads wait for mutexes:

* thread #1: tid = 0x18e674, 0x00007fffb17b3c22 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  thread #5: tid = 0x18e6e7, 0x00007fffb17b3c22 libsystem_kernel.dylib`__psynch_mutexwait + 10, name = 'DocumentEventNotifier'

Thread 5 wants the SolarMutex:
   frame #5: 0x000000010bd41a69 libvcllo.dylib`SalYieldMutex::doAcquire(this=0x00007fad65c4be00, nLockCount=1) + 441 at salinst.cxx:302

(lldb) f 5
....
(lldb) p *this
(SalYieldMutex) $5 = {
  comphelper::GenericSolarMutex = {
    m_nThreadId = 1631860

$ echo "obase=16; 1631860"|bc
18E674
Which happens to be thread 1.
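
The same conversion can be done without bc. A minimal sketch: lldb prints thread IDs in hex (tid = 0x18e674) while m_nThreadId is shown in decimal, so comparing them means converting one to the other. The function name is hypothetical.

```python
def thread_id_to_hex(decimal_tid: int) -> str:
    """Format a decimal thread ID as upper-case hex, like `obase=16` in bc."""
    return format(decimal_tid, "X")


if __name__ == "__main__":
    owner = 1631860  # m_nThreadId printed by lldb for the SolarMutex
    print(thread_id_to_hex(owner))            # 18E674
    print(owner == int("0x18e674", 16))       # True: matches tid of thread #1
```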

I can't really verify the mutexes held by thread 5, but my guess is that it holds the one thread 1 is waiting for.
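
The suspected pattern is a classic lock-order deadlock. The sketch below is not LibreOffice code: it assumes, as guessed above, that the main thread holds the SolarMutex and waits for a second mutex while the DocumentEventNotifier thread holds that second mutex and waits for the SolarMutex. Both locks are plain Python locks standing in for the real ones, "other_mutex" is a hypothetical name, and a timeout is used so the example terminates instead of hanging.

```python
import threading


def demonstrate_lock_order_deadlock(timeout: float = 0.2) -> dict:
    """Show two threads taking two locks in opposite order.

    Returns a dict mapping thread name to whether it managed to take
    its second lock before the timeout expired.
    """
    solar_mutex = threading.Lock()   # stand-in for the SolarMutex
    other_mutex = threading.Lock()   # hypothetical second mutex
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            # Wait until both threads hold their first lock.
            both_hold_first.wait()
            # With real mutexes and no timeout, this call never returns.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            # Keep holding the first lock until both attempts are over.
            both_tried_second.wait()
            if acquired:
                second.release()

    threads = [
        threading.Thread(target=worker,
                         args=("main", solar_mutex, other_mutex)),
        threading.Thread(target=worker,
                         args=("DocumentEventNotifier", other_mutex, solar_mutex)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


if __name__ == "__main__":
    print(demonstrate_lock_order_deadlock())
```

Neither thread gets its second lock, which is what a pair of threads stuck in __psynch_mutexwait looks like from the outside. The usual fix is to acquire the locks in the same order everywhere, or to release one before waiting for the other.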
Comment 17 Tamas Bunth 2017-12-09 12:48:47 UTC
I couldn't reproduce this on Linux either.

Version: 6.1.0.0.alpha0+
Build ID: 0c4b1eae3437358f62bd9e98da0c29d41132204d
CPU threads: 4; OS: Linux 4.13; UI render: default; VCL: gtk3;

Changing affected Hardware to macOS.
Comment 18 Robert Großkopf 2017-12-09 16:56:26 UTC
This isn't a Firebird-specific bug.
I could reproduce it with all *.odb files.

Got the hang with
Version: 5.4.3.2
Build ID: 92a7159f7e4af62137622921e809f8546db437e5
CPU threads: 4; OS: Linux 4.4; UI render: default; VCL: kde4; 
Locale: de-DE (de_DE.UTF-8); Calc: group

... but it is reported in bug 107039, too.

I will set this one to duplicate of bug 107039. It is described better there for all *.odb files.

*** This bug has been marked as a duplicate of bug 107039 ***
Comment 19 Tor Lillqvist 2017-12-22 08:55:09 UTC
So this bug is not Mac-specific after all? But a duplicate of a cross-platform bug? How reliable is that analysis?
Comment 20 Tor Lillqvist 2017-12-22 08:58:18 UTC
On the other hand, this bug was originally reported on Linux. Oy vey. Quite the confusion by now. But what else is new...
Comment 21 Tor Lillqvist 2018-01-01 10:01:01 UTC
Playing around a bit to see what this is about, I did get a hang. But this was the very first time I had ever used Base, and I found the UI quite confusing: for instance, in the "close the table" step there is very little indication that the File:Close menu entry closes just a table in a database, not a document (as it normally does). I was not able to remember exactly what I had done and reproduce it, so that was fairly useless. But I will try more. Or perhaps try the instructions in bug #107039.