Bug 105062 - Firebird: Trying to close Firebird-DB without saving leads to hang of LO
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base
Version (earliest affected): 5.4.0.0.alpha0+
Hardware: x86-64 (AMD64) All
Importance: highest critical
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords: haveBacktrace
Depends on:
Blocks: Database-Firebird
Reported: 2017-01-03 07:52 UTC by robert
Modified: 2017-10-16 16:54 UTC

See Also:
Crash report or crash signature:


Attachments
stack trace (680.80 KB, text/plain)
2017-01-05 15:45 UTC, Alex Thurgood
MacOS bt of autorecovery mutex deadlock (25.25 KB, text/plain)
2017-10-16 16:54 UTC, Jan-Marek Glogowski

Description robert 2017-01-03 07:52:00 UTC
Open a firebird-database, for example https://bugs.documentfoundation.org/attachment.cgi?id=129916
Open a table of this database.
Write something into the table, change data ...
Close the table.
Now try to close the *.odb file.
Then a popup appears asking whether the data should be saved.
Try to save the data. 

On my systems this leads to a hang every time. I can do nothing except kill LO and restart it.

Tested with
Version: 5.4.0.0.alpha0+
Build ID: 2a4cd80abcf9e515d1ce3b3a944b573bdc42bff2
CPU Threads: 4; OS Version: Linux 4.1; UI Render: default; VCL: kde4; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2016-12-22_00:18:04
Locale: de-DE (de_DE.UTF-8); Calc: group
Comment 1 Buovjaga 2017-01-05 09:14:52 UTC
Tried both Save and Don't save, but no hang.

Maybe try a newer build to be sure.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.4.0.0.alpha0+
Build ID: 1a58cdf8af1aba52ce0a376666dd7d742234d7cf
CPU Threads: 8; OS Version: Linux 4.8; UI Render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on January 4th 2016
Comment 2 robert 2017-01-05 10:11:49 UTC
Have tested it with
Version: 5.4.0.0.alpha0+
Build ID: a3cf075880db31f77cd0550e0ee25eca931c6a40
CPU Threads: 4; OS Version: Linux 4.1; UI Render: default; VCL: kde4; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2017-01-05_01:21:50
Locale: de-DE (de_DE.UTF-8); Calc: group
on OpenSUSE 42.1 64bit rpm Linux.

Same buggy behavior:
Save changes to document “Firebird_3” before closing?
appears.
I click "Save" and LO hangs.
I click "Don't Save" and LO closes the Firebird database file.
I click "Cancel" and get back to the Firebird database file.

I have to kill the process libreofficebase_dev.
Comment 3 Buovjaga 2017-01-05 10:22:26 UTC
Maybe you could grab a debug build and try to get a backtrace of the hang http://dev-builds.libreoffice.org/daily/master/Linux-rpm_deb-x86_64@70-TDF-dbg/current/
https://wiki.documentfoundation.org/QA/BugReport/Debug_Information#GNU.2FLinux:_How_to_get_a_backtrace

Note: "If its a hang, you will need to force a crash"
Comment 4 robert 2017-01-05 11:36:15 UTC
(In reply to Buovjaga from comment #3)
> Maybe you could grab a debug build and try to get a backtrace of the hang
> http://dev-builds.libreoffice.org/daily/master/Linux-rpm_deb-x86_64@70-TDF-
> dbg/current/

This build is too big for my internet connection (6 Mbit/s). I will get a faster one next month ...
Comment 5 Alex Thurgood 2017-01-05 15:23:21 UTC
Hmm, thought this had been reported by Julien already, but guess not...
Comment 6 Alex Thurgood 2017-01-05 15:39:47 UTC
Confirming on my own master build 540alpha
Comment 7 Alex Thurgood 2017-01-05 15:45:50 UTC
Created attachment 130178
stack trace

From the trace, it looks like a mutex wait/release lock problem again.
Comment 8 Julien Nabet 2017-10-11 08:09:18 UTC
Following Jan-Marek's patches about scheduling in recent weeks, any update here with a recent build?
Comment 9 robert 2017-10-14 08:30:49 UTC
(In reply to Julien Nabet from comment #8)
> Following Jan-Marek's patches about scheduling in recent weeks, any
> update here with a recent build?

I am sorry. Tested with
Version: 6.0.0.0.alpha0+
Build ID: 3b21902aa85df7631c9efb20dd408df005295b22
CPU threads: 4; OS: Linux 4.4; UI render: default; VCL: kde4; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2017-10-13_22:43:31
Locale: de-DE (de_DE.UTF-8); Calc: group
but nothing changed. Hang after trying to close the database before saving the data.
Comment 10 Julien Nabet 2017-10-14 08:42:54 UTC
With master sources updated today on Debian (gtk3 or kde4 rendering) I fail to reproduce this :-(
Comment 11 Alex Thurgood 2017-10-16 10:12:51 UTC
Not fixed, still reproducible for me on master:

Version: 6.0.0.0.alpha0+
Build ID: 643e9001bff137b6e5a8784d9e1f25a51e0d1644
CPU threads: 4; OS: Mac OS X 10.13; UI render: default; 
Locale: fr-FR (fr_FR.UTF-8); Calc: group
Comment 12 Alex Thurgood 2017-10-16 10:22:54 UTC
Am seeing this when I escape out of the lldb run after LO hangs:

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff6c7beeae libsystem_kernel.dylib`__psynch_mutexwait + 10
libsystem_kernel.dylib`__psynch_mutexwait:
->  0x7fff6c7beeae <+10>: jae    0x7fff6c7beeb8            ; <+20>
    0x7fff6c7beeb0 <+12>: movq   %rax, %rdi
    0x7fff6c7beeb3 <+15>: jmp    0x7fff6c7b676c            ; cerror_nocancel
    0x7fff6c7beeb8 <+20>: retq
Comment 13 Alex Thurgood 2017-10-16 10:28:14 UTC
I don't seem to be able to get a backtrace of any sort unless I force quit LO from the Dock on OSX.
Comment 14 Julien Nabet 2017-10-16 12:02:34 UTC
Jan-Marek: I don't know if it could be related to scheduling part but perhaps you may have some idea here?
FYI, you must enable experimental features to use Firebird.
Comment 15 robert 2017-10-16 14:04:03 UTC
It seems this isn't specific to Firebird. I get the same hang when setting up a database connected to a Calc file or a Writer file. If I change something and don't save it, LO hangs when I try to close the database.

Tested with Version: 6.0.0.0.alpha0+
Comment 16 Jan-Marek Glogowski 2017-10-16 16:54:28 UTC
Created attachment 137017
MacOS bt of autorecovery mutex deadlock

Interesting that I couldn't reproduce this on Linux...

So on Mac we have another deadlock in the DB code:

Two threads wait for mutexes:

* thread #1: tid = 0x18e674, 0x00007fffb17b3c22 libsystem_kernel.dylib`__psynch_mutexwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  thread #5: tid = 0x18e6e7, 0x00007fffb17b3c22 libsystem_kernel.dylib`__psynch_mutexwait + 10, name = 'DocumentEventNotifier'

Thread 5 wants the SolarMutex:
   frame #5: 0x000000010bd41a69 libvcllo.dylib`SalYieldMutex::doAcquire(this=0x00007fad65c4be00, nLockCount=1) + 441 at salinst.cxx:302

(lldb) f 5
....
(lldb) p *this
(SalYieldMutex) $5 = {
  comphelper::GenericSolarMutex = {
    m_nThreadId = 1631860

$ echo "obase=16; 1631860"|bc
18E674
Which happens to be thread 1.
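
The same decimal-to-hex conversion can be sketched in C++ for anyone without `bc` at hand. This is only an illustration; `toHexThreadId` is a hypothetical helper name, not something from LibreOffice or lldb.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: formats a decimal thread id (as printed in
// m_nThreadId) the way lldb prints "tid = 0x..." in its thread list.
std::string toHexThreadId(std::uint64_t decimalId)
{
    std::ostringstream out;
    out << "0x" << std::hex << decimalId;  // std::hex emits lowercase digits
    return out.str();
}
```

Calling `toHexThreadId(1631860)` yields `0x18e674`, which matches the tid lldb lists for thread #1.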

I can't really verify the mutexes held by thread 5, but my guess is that it's the one thread 1 is waiting for.
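
If that guess is right, the pattern would be a classic lock-order inversion. The following is a minimal sketch of that pattern in plain C++, not LibreOffice code: `solarLike` and `documentLike` are hypothetical stand-ins (the real SolarMutex is a recursive mutex with its own API, and the second mutex has not been identified here). Timed locks are used so that the sketch reports the cycle instead of hanging.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <mutex>
#include <thread>

// Assumption: two threads each own one mutex and then want the other one.
// Returns true when both threads time out waiting for the other's mutex.
bool lockOrderInversionBlocks(std::chrono::milliseconds timeout)
{
    std::timed_mutex solarLike;     // stand-in for the SolarMutex
    std::timed_mutex documentLike;  // stand-in for the unidentified mutex
    std::atomic<int> holding{0};
    std::atomic<int> finished{0};
    std::atomic<int> timedOut{0};

    auto worker = [&](std::timed_mutex& first, std::timed_mutex& second) {
        std::lock_guard<std::timed_mutex> holdFirst(first);
        ++holding;
        while (holding.load() < 2)   // wait until each thread owns its first mutex
            std::this_thread::yield();
        if (second.try_lock_for(timeout))
            second.unlock();
        else
            ++timedOut;              // a plain lock() would block forever here
        ++finished;
        while (finished.load() < 2)  // keep the first mutex until both attempts end
            std::this_thread::yield();
    };

    std::thread mainLike(worker, std::ref(solarLike), std::ref(documentLike));
    std::thread notifierLike(worker, std::ref(documentLike), std::ref(solarLike));
    mainLike.join();
    notifierLike.join();
    return timedOut.load() == 2;
}

// Same opposite argument order, but std::lock acquires both mutexes with a
// deadlock-avoidance algorithm, so both threads always run to completion.
// Returns the number of completed critical sections.
int consistentLockingCompletes(int iterations)
{
    std::mutex solarLike;
    std::mutex documentLike;
    int sharedCounter = 0;  // only touched while both mutexes are held

    auto worker = [&](std::mutex& first, std::mutex& second) {
        for (int i = 0; i < iterations; ++i) {
            std::lock(first, second);
            std::lock_guard<std::mutex> guardFirst(first, std::adopt_lock);
            std::lock_guard<std::mutex> guardSecond(second, std::adopt_lock);
            ++sharedCounter;
        }
    };

    std::thread a(worker, std::ref(solarLike), std::ref(documentLike));
    std::thread b(worker, std::ref(documentLike), std::ref(solarLike));
    a.join();
    b.join();
    return sharedCounter;
}
```

With a 50 ms timeout the first function reports that both threads were blocked, which is the state the two `__psynch_mutexwait` frames would correspond to. The second function shows one generic way out (taking both locks atomically); whether anything like it fits the actual DocumentEventNotifier code path is not established by the backtrace.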