Bug 107039 - Crash on Close after acknowledging save changes - mutex issue (steps in comment 21)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base
Version (earliest affected): 5.3.1.2 release
Hardware: x86-64 (AMD64) All
Importance: high major
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords: bibisectRequest, regression
Duplicates: 114032
Depends on:
Blocks:
 
Reported: 2017-04-08 20:53 UTC by Matt
Modified: 2017-11-24 22:04 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Database file that causes Base to hang. (40.89 KB, application/vnd.sun.xml.base)
2017-04-08 20:54 UTC, Matt
Apple stack / backtrace (840.13 KB, text/plain)
2017-04-10 07:46 UTC, Alex Thurgood
Relevant procmon output (164.61 KB, text/plain)
2017-06-20 02:58 UTC, Matt
LLDB backtrace after forced quit from Dock (8.89 KB, text/plain)
2017-10-05 13:52 UTC, Alex Thurgood

Description Matt 2017-04-08 20:53:08 UTC
Description:
I opened the database and made changes. When I closed it, Base asked whether to save the changes.
I said yes, and it hangs. I have to "end task" to get it off my screen. Now, when I open that database, I have to recover it. If I make a simple change again, close, and say "yes" to save changes, the same thing happens.

Steps to Reproduce:
1. Open the attached GarageDatabase2.odb. Recover if necessary.
2. Edit OtherMaintenanceForm. Delete the "Notes" text box.
3. Save and close the form edit window.
4. Press the red X in the upper right of Base to close it.
5. Answer appropriately to save changes (not sure why it asks me this...the changes should be saved already since it is a database, but whatever).
6. Base hangs.
7. Changes are lost.

Actual Results:  
Hang forever.

Expected Results:
Close quickly.


Reproducible: Always

User Profile Reset: No

Additional Info:


User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36
Comment 1 Matt 2017-04-08 20:54:02 UTC
Created attachment 132409
Database file that causes Base to hang.
Comment 2 Matt 2017-04-09 10:47:37 UTC
If, instead of the original steps to reproduce, I change steps 4 and 5 (marked with **), everything works as expected:

1. Open the attached GarageDatabase2.odb. Recover if necessary.
2. Edit OtherMaintenanceForm. Delete the "Notes" text box.
3. Save and close the form edit window.
**4. Save the database with the "Save" button in the toolbar.
**5. Press the white/red X to exit Base. It doesn't have to ask the question about saving since I just saved.
6. Base exits.
7. Changes are saved.

From what I can tell, if you don't save before clicking the X in the upper corner of Base, so that it has to ask you to save the changes, it hangs.
Comment 3 Alex Thurgood 2017-04-10 07:21:58 UTC
No repro with

Version: 5.3.0.3
Build ID: 7074905676c47b82bbcfbea1aeefc84afe1c50e1
CPU Threads: 2; OS Version: Mac OS X 10.12.3; UI Render: default; Layout Engine: new; 
Locale: fr-FR (fr_FR.UTF-8); Calc: group
Comment 4 Alex Thurgood 2017-04-10 07:42:28 UTC
Confirming with 

Version: 5.3.1.2
Build ID: e80a0e0fd1875e1696614d24c32df0f95f03deb2
CPU Threads: 2; OS Version: Mac OS X 10.12.3; UI Render: default; Layout Engine: new; 
Locale: fr-FR (fr_FR.UTF-8); Calc: group

Enclosing trace
Comment 5 Alex Thurgood 2017-04-10 07:42:52 UTC
regression
Comment 6 Alex Thurgood 2017-04-10 07:44:31 UTC
mutex release/continue problem, judging by Apple stack trace
Comment 7 Alex Thurgood 2017-04-10 07:46:38 UTC
Created attachment 132439
Apple stack / backtrace
Comment 8 Alex Thurgood 2017-04-10 07:49:41 UTC
 Thread 0x149              56 samples (1-56)         priority 81 (base 81)
  <IO tier 0>
 *56  call_continuation + 23 (kernel + 658167) [0xffffff80002a0af7] 1-56
   *56  ??? (kernel + 3675647) [0xffffff80005815ff] 1-56
     *56  ??? (kernel + 5813123) [0xffffff800078b383] 1-56
       *56  lck_mtx_sleep + 132 (kernel + 1049444) [0xffffff8000300364] 1-56
         *56  thread_block_reason + 222 (kernel + 1091230) [0xffffff800030a69e] 1-56
           *56  ??? (kernel + 1095803) [0xffffff800030b87b] 1-56
             *56  machine_switch_context + 206 (kernel + 2102494) [0xffffff80004014de] 1-56
Comment 9 Alex Thurgood 2017-04-10 07:56:35 UTC
Ignore comment 8; that was the wrong process. Pasted below is the mutex lock/wait in the soffice process:


56  _pthread_mutex_lock_slow + 285 (libsystem_pthread.dylib + 5769) [0x7fff8bd94689] 1-56
  56  __psynch_mutexwait + 10 (libsystem_kernel.dylib + 105654) [0x7fff8bcadcb6] 1-56
   *56  psynch_mtxcontinue + 0 (pthread + 31396) [0xffffff7f80efaaa4] 1-56
Comment 10 Alex Thurgood 2017-04-10 07:59:46 UTC
With the activity that occurs beforehand:
56  _os_activity_initiate + 61 (libsystem_trace.dylib + 23613) [0x7fff8bdb1c3d] 1-56
56  -[NSApplication terminate:] + 773 (AppKit + 2391190) [0x7fff742b3c96] 1-56
56  -[NSApplication _shouldTerminate] + 843 (AppKit + 2393591) [0x7fff742b45f7] 1-56
56  -[NSDocumentController(NSInternal) __closeAllDocumentsWithDelegate:shouldTerminateSelector:] + 307 (AppKit + 2394534) [0x7fff742b49a6] 1-56
56  -[NSDocumentController(NSInternal) _closeAllDocumentsWithDelegate:shouldTerminateSelector:] + 1318 (AppKit + 2395891) [0x7fff742b4ef3] 1-56
56  __91-[NSDocumentController(NSInternal) _closeAllDocumentsWithDelegate:shouldTerminateSelector:]_block_invoke + 567 (AppKit + 2396884) [0x7fff742b52d4] 1-56
56  -[NSApplication _docController:shouldTerminate:] + 71 (AppKit + 2397216) [0x7fff742b5420] 1-56
Comment 11 Xisco Faulí 2017-06-19 18:43:55 UTC
I can't reproduce it in

Version: 5.3.3.2
Build ID: 1:5.3.3~rc2-0ubuntu0.16.10.1~lo0
CPU Threads: 4; OS Version: Linux 4.8; UI Render: default; VCL: gtk3; Layout Engine: new; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group

nor in

Version: 6.0.0.0.alpha0+
Build ID: 08f6f9dded1b142b858c455da03319abac691655
CPU Threads: 4; OS Version: Linux 4.8; UI Render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group

Could you please check again whether it's already fixed? Otherwise, is it a Mac-only issue?
Comment 12 Matt 2017-06-19 20:16:49 UTC
I am running 5.3.1.2 and I can reproduce it.

1. Open the attached GarageDatabase2.odb. Recover if necessary.
2. Edit OtherMaintenanceForm. Delete something.
3. Save (click the disk icon) and close the form edit window (File | Close).
4. Press the red X in the upper right of Base to close it.
5. Answer appropriately to save changes (not sure why it asks me this...the changes should be saved already since it is a database, but whatever).
6. Base hangs.
7. Changes are lost.

A couple of times it worked, but it still often reproduces the bug.

I'm running on Windows 7 Pro, NOT MAC.
Comment 13 Xisco Faulí 2017-06-19 20:59:28 UTC
Could you please try to reproduce it with a master build from http://dev-builds.libreoffice.org/daily/master/ ?
You can install it alongside the standard version.
Comment 14 Matt 2017-06-19 21:11:46 UTC
Tested on 5.3.3.2 (x64), reproduced the same bug.
Comment 15 Matt 2017-06-20 02:35:24 UTC
Tested version
/daily/master/Win-x86_64@42/2017-06-19_02.13.03
libo-master64~2017-06-19_02.13.03_LibreOfficeDev_6.0.0.0.alpha0_Win_x64.msi 
By the way, this took an exceedingly long time to download.

aka Version: 6.0.0.0.alpha0+ (x64)

Build ID: 493407c470e7051a801e2a0ad8253f7b87c4434f
CPU threads: 8; OS: Windows 6.1; UI render: GL; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2017-06-19_02:13:03
Locale: en-US (en_US); Calc: CL

The bug happens in this version too.
Comment 16 Matt 2017-06-20 02:58:09 UTC
Created attachment 134149
Relevant procmon output
Comment 17 Matt 2017-06-20 03:00:29 UTC
See soffice.procmon.log. 

I cannot decipher what is happening between 10:53:19 (when I hit the red X to close Base) and 10:53:28.

From that point on, it seems that every 10 seconds soffice.bin writes, or tries to write, to the file database.odb.lck. It just keeps doing that forever.
Comment 18 Xisco Faulí 2017-06-20 07:18:28 UTC
Hi Matt,
Thanks for testing it again. Maybe I did something wrong when I tested it.
Could someone reproduce it on Linux?
Comment 19 Matt 2017-06-20 12:41:34 UTC
Tried to test on Linux
Version: 5.1.6.2
Build ID: 1:5.1.6~rc2-0ubuntu1~xenial2
CPU Threads: 2; OS Version: Linux 4.4; UI Render: default; 
Locale: en-US (en_US.UTF-8); Calc: group

When I open the database file (attachment #1), LibreOffice errors out:

General Error.
General input/output error.

So I can't run the test we want, at least not yet.
Comment 20 wdehoog 2017-09-26 11:18:16 UTC
I am having this same problem on Windows 10 with version 5.4.1.2 (x64). The database connection type is PostgreSQL.
Comment 21 Ferry Toth 2017-09-27 13:29:48 UTC
I can easily reproduce this with LibO 5.4.1.2 on Ubuntu:

- Create a new empty database
- Create a new empty table using the wizard

Don't save the modified database now

- Press the close button in the title bar or click <File><Quit>
- Save dialog box appears
- Click OK

LibO hangs forever with 0% CPU usage. Needs to be KILL, TERM won't work.

Start LibO again, recovery starts. Changes to the database structure have been saved (not recovered as the save button shows no changes needed to be saved).

Reproducible: every time

Time needed to reproduce: 2 min
Comment 22 Vladimir Potapov 2017-10-03 08:23:01 UTC
(In reply to Ferry Toth from comment #21)
> - Press the close button in the title bar or click <File><Quit>
> - Save dialog box appears
> - Click OK
> 
> LibO hangs forever with 0% CPU usage. Needs to be KILL, TERM won't work.
I confirm the issue for ROSA Linux with two arch (i586 and X64)
Comment 23 Vladimir Potapov 2017-10-03 08:29:32 UTC
(In reply to Vladimir Potapov from comment #22)
> I confirm the issue for ROSA Linux with two arch (i586 and X64)
for versions 5.4.1.2 and 5.4.2.1
Comment 24 Xisco Faulí 2017-10-03 08:33:32 UTC
Confirmed in

Version: 6.0.0.0.alpha0+
Build ID: 34e8fd7e99489e9f50a512b07c6f3923b358b4d3
CPU threads: 4; OS: Linux 4.10; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group

bisecting...
Comment 25 Xisco Faulí 2017-10-03 08:50:18 UTC
The issue is not reproducible in the bisect repositories...
Comment 26 Xisco Faulí 2017-10-03 08:51:26 UTC
Adding Jan-Marek just in case he might have any idea about it...
Comment 27 Alex Thurgood 2017-10-05 13:49:04 UTC
Still happening on :

Version: 6.0.0.0.alpha0+
Build ID: 0cb424fec7389801578085b618c5ad68a98f4637
CPU threads: 4; OS: Mac OS X 10.13; UI render: default; 
Locale: fr-FR (fr_FR.UTF-8); Calc: group
Comment 28 Alex Thurgood 2017-10-05 13:52:51 UTC
Created attachment 136780
LLDB backtrace after forced quit from Dock

Enclosing LLDB backtrace after forced quit from OSX Dock. 

After the app hangs, I right-clicked the LO icon in the OSX Dock and chose Force Quit from the context menu. The LO icon then disappears from the Dock, but the main ODB window is left hanging on the screen.

The lldb trace was obtained with the main ODB window still displayed after the app icon had disappeared from the OSX Dock.
Comment 29 Alex Thurgood 2017-10-05 13:56:43 UTC
Notice how:

frame #3: 0x00000001000cabae libuno_sal.dylib.3`::osl_acquireMutex(pMutex=<unavailable>) at mutex.cxx:97
frame #4: 0x000000017d8aa5bc libfwklo.dylib`osl::Mutex::acquire(this=<unavailable>) at mutex.hxx:56


an attempt is made to acquire a mutex lock on a mutex that doesn't exist...seems like there is a synchronization issue here when the user invokes the Save routine after ordering the main window to close.
Comment 30 Jan-Marek Glogowski 2017-10-05 17:06:35 UTC
(In reply to Alex Thurgood from comment #29)
> Notice how :
> 
> frame #3: 0x00000001000cabae
> libuno_sal.dylib.3`::osl_acquireMutex(pMutex=<unavailable>) at mutex.cxx:97
>     frame #4: 0x000000017d8aa5bc
> libfwklo.dylib`osl::Mutex::acquire(this=<unavailable>) at mutex.hxx:56
> 
> 
> an attempt is made to acquire a mutex lock on a mutex that doesn't
> exist...seems like there is a synchronization issue here when the user
> invokes the Save routine after ordering the main window to close.

That just means the value is optimized out and the function is inlined. This is no problem. Locking in the DB code OTOH is problematic.
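
For illustration only, here is a minimal sketch of the general kind of locking problem that produces a hang at 0% CPU: one thread holds a mutex while waiting for work that needs the same mutex. It is not taken from the LibreOffice sources and does not claim to be the actual cause of this bug. The function name saveWhileClosing and the "close"/"save" roles are hypothetical, and it uses std::timed_mutex with a timeout (instead of a plain blocking acquire such as osl::Mutex::acquire) so that the situation can be observed rather than hanging.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>

// Hypothetical illustration, not LibreOffice code.
// Returns true if the "save" thread managed to take the lock,
// false if it gave up after `timeout`.
bool saveWhileClosing(std::timed_mutex& documentLock,
                      std::chrono::milliseconds timeout)
{
    // The "close" path takes the lock and keeps it while waiting for the save.
    std::lock_guard<std::timed_mutex> closing(documentLock);

    // The "save" path runs on another thread and needs the same lock.
    std::future<bool> saved = std::async(std::launch::async,
        [&documentLock, timeout] {
            // Timed acquire: gives up instead of sleeping forever.
            std::unique_lock<std::timed_mutex> saving(documentLock, timeout);
            return saving.owns_lock();
        });

    // With a plain blocking acquire in the lambda, this get() would never
    // return: both threads would sleep, which looks like a hang at 0% CPU.
    return saved.get();
}
```

Calling saveWhileClosing(lock, std::chrono::milliseconds(50)) on a fresh std::timed_mutex returns false, because the caller still holds the lock for the whole time the worker is trying to acquire it.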
Comment 31 Matt 2017-11-24 22:04:28 UTC
*** Bug 114032 has been marked as a duplicate of this bug. ***