Bug 61725 - REPORTBUILDER cannot execute/wizard-create any report: failed assertion SolarMutex not locked
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base
Version (earliest affected): 4.1.0.0.alpha0+ Master
Hardware: All All
Importance: high blocker
Assignee: Not Assigned
URL:
Whiteboard: target:4.2.0
Keywords: regression
Depends on: 65168
Blocks: 60953 48056 58371 58805 61564 61726 64279
Reported: 2013-03-03 05:28 UTC by Lionel Elie Mamane
Modified: 2013-07-03 19:41 UTC
CC: 6 users

Attachments
reproduction example (8.12 KB, application/vnd.oasis.opendocument.database)
2013-03-03 09:18 UTC, Lionel Elie Mamane
console + a bt with symbols on master sources (clang+gcc) (44.98 KB, text/plain)
2013-06-05 19:52 UTC, Julien Nabet

Description Lionel Elie Mamane 2013-03-03 05:28:28 UTC
Since commit

commit b34b8d3372364b3c5043da0357ec69505e8d8602
Author: Tor Lillqvist <tml@iki.fi>
Date:   Thu Feb 28 19:39:00 2013 +0200

    I think this is such a serious problem that an assert() is in order
    
    Change-Id: If4273ba0b0a95d314e346e26ce092b108214d898

A non-locked SolarMutex is an "abort and dump core" condition instead of only a warning.
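
The difference between the two behaviours can be sketched as follows. This is only an illustration under stated assumptions, not the LibreOffice code touched by that commit: `OwnedMutex`, `checkWarnOnly` and `checkOrAbort` are hypothetical names, and the real SolarMutex API is not used here.

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a global recursive mutex that remembers its owner.
// This is NOT the SolarMutex implementation; all names are illustrative.
class OwnedMutex {
public:
    void lock() {
        m_mutex.lock();
        ++m_depth;                                   // only touched while locked
        m_owner.store(std::this_thread::get_id());
    }
    void unlock() {
        if (--m_depth == 0)
            m_owner.store(std::thread::id());        // no owner after last unlock
        m_mutex.unlock();
    }
    bool heldByCurrentThread() const {
        return m_owner.load() == std::this_thread::get_id();
    }
private:
    std::recursive_mutex m_mutex;
    std::atomic<std::thread::id> m_owner{std::thread::id()};
    int m_depth = 0;
};

// Warning-style check: report the problem and let the caller carry on.
bool checkWarnOnly(const OwnedMutex& m) {
    if (!m.heldByCurrentThread()) {
        std::cerr << "warn: SolarMutex not locked\n";
        return false;
    }
    return true;
}

// Assert-style check: abort the process when assert() is active
// (that is, when NDEBUG is not defined).
void checkOrAbort(const OwnedMutex& m) {
    (void)m; // avoid an unused-parameter warning when NDEBUG is defined
    assert(m.heldByCurrentThread() && "SolarMutex not locked");
}
```

With the warning-style check, a caller that forgot to lock only produces a line of debug output; with the assert-style check, the same caller terminates the process, which is what the reproduction steps below run into.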

This completely breaks ReportBuilder:
 - report execution
 - report creation wizard
since ReportBuilder has been triggering this warning for several LibreOffice versions already. (It also blocks work on other ReportBuilder bugs unless one locally reverts that commit.)

Reproduction instructions:
1) Report Execution
   - Open attachment minimalReportBuilder.odb
   - In the left pane, click "Reports"
   - In the lower right pane, double-click "Contacts"
     (or right-click for menu and choose "open")
2) Report creation wizard
   - Open attachment minimalReportBuilder.odb
   - In the left pane, click "Reports"
   - In the upper right pane, double-click "Use Wizard to create report"
   - In "Available Fields", double-click on ContactID;
     it goes to "Fields in report"
   - Click "next"


Observed behaviour: failed assert and abort because SolarMutex not locked.
Comment 1 Don't use this account, use tml@iki.fi 2013-03-03 07:19:19 UTC
Does that mean that ReportBuilder used to work only by luck then? Or that the condition which I turned into an assert isn't so serious after all? Sigh, I hate this OSL_ENSURE style of "assertions" that aren't actually that serious, or maybe they are, depending on the case. The code is full of such crack.

The *fact* is that in the Android port, some serious and randomish crashes I had in an experimental app went away when I did some changes to the app code and the "SolarMutex not locked" warning stopped being displayed. Of course it is very easy to miss such a warning (among the tons of debug output), so the best way to make sure it doesn't fire is to turn it into an actual assert()... Then one *must* debug the actual location where the assertion fires and fix whatever is wrong.

Anyway, sure, I am prepared to revert that commit (or make the assert() used on non-traditional OSes only), but I would like to have the input from some expert here, sberg perhaps?
Comment 2 Don't use this account, use tml@iki.fi 2013-03-03 07:23:40 UTC
Anyway, please note that this SolarMutex testing is done only in a dbgutil build, which by definition includes code intended to make logic errors in the code more obvious and to help debugging. So surely ReportBuilder should then be *debugged* to find out why this assertion (or the equally serious, but unfortunately easy to ignore, "SolarMutex not locked" warning) fires.
Comment 3 Lionel Elie Mamane 2013-03-03 09:18:16 UTC
Created attachment 75819
reproduction example
Comment 4 Lionel Elie Mamane 2013-03-03 09:22:20 UTC
I'd be delighted for ReportBuilder to be debugged and for someone to find out why this warning/assert fires, but I have no clue myself. I'm willing to collaborate with someone on this (one Base "expert" and one SolarMutex expert together).
Comment 5 Don't use this account, use tml@iki.fi 2013-03-03 10:26:41 UTC
I could not get the ReportBuilder thing to run so far that I would have come across the assert... I get some java.lang.IncompatibleClassChange exception error once I managed to install the report builder extension in a freshly built master LO. I am not really a Java person...
Comment 6 Lionel Elie Mamane 2013-03-03 10:37:45 UTC
(In reply to comment #5)
> I could not get the ReportBuilder thing to run so far that I would have come
> across the assert... I get some java.lang.IncompatibleClassChange exception
> error

I (and Julien Nabet, see comment 8 of bug 61564) get the assert *before* the Java error (we have to comment out the assert to get the Java error). What a mess...
Comment 7 Julien Nabet 2013-03-03 12:15:06 UTC
I don't know if there could be a link but when launching Base (with a brand new LO profile and with master sources updated today), I have this:
warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:369: ODsnTypeCollection::implDetermineType : missing the second colon !
warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:369: ODsnTypeCollection::implDetermineType : missing the second colon !
warn:vcl.control:22802:1:vcl/source/control/button.cxx:2357: No new-style group set on radiobutton, using old-style digging around
warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:369: ODsnTypeCollection::implDetermineType : missing the second colon !
warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:344: ODsnTypeCollection::implDetermineType : missing the colon !
warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:344: ODsnTypeCollection::implDetermineType : missing the colon !

(I don't even try to open a Base file or something, just launching Base)
Comment 8 Petr Mladek 2013-03-06 08:50:56 UTC
I have seen many bug reports in the past about crashes or freezes caused by report builder. It is quite possible that it works only by chance and this might help to track down the root of the problem. Well, I do not know how to debug this.
Comment 9 Lionel Elie Mamane 2013-03-06 11:25:50 UTC
(In reply to comment #7)
> I don't know if there could be a link but when launching Base (with a brand
> new LO profile and with master sources updated today), I have this:

> (I don't even try to open a Base file or something, just launching Base)

How do you launch Base without opening/creating a Base file?
Comment 10 Lionel Elie Mamane 2013-03-06 11:31:17 UTC
(In reply to comment #5)
> I could not get the ReportBuilder thing to run so far that I would have come
> across the assert... I get some java.lang.IncompatibleClassChange exception
> error

That is now solved.

> once I managed to install the report builder extension in a freshly
> built master LO.

Unless you configured, with one of:
 --disable-extension-integration
 --disable-ext-report-builder
 --without-java
there should be nothing to install?
Comment 11 Lionel Elie Mamane 2013-03-06 11:35:01 UTC
I can no longer reproduce with the "report execution" steps, but can still reproduce with the "report creation wizard" steps.
Comment 12 Julien Nabet 2013-03-06 12:37:23 UTC
(In reply to comment #9)
> How do you launch Base without opening/creating a Base file?
From console in LO sources directory:
cd install/program
. ./ooenv
./soffice.bin --base

If I do this, I get the logs shown in my comment 7 even though I haven't yet opened or created a Base file.
Of course, I can retest this tonight after my day job. I ran "./g pull -r && make clean && make dev-install" before going to the office this morning.
Comment 13 Lionel Elie Mamane 2013-03-06 13:15:59 UTC
(In reply to comment #12)
> (In reply to comment #9)
>> How do you launch Base without opening/creating a Base file?
> From console in LO sources directory:
> cd install/program
> . ./ooenv
> ./soffice.bin --base

Ah, the Database Wizard; I see.
Comment 14 Commit Notification 2013-03-16 07:50:50 UTC
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=04651da19cbd755c2d9a7d399781433c05f9cb97

fdo#61725 workaround



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 15 Lionel Elie Mamane 2013-03-16 08:48:38 UTC
(In reply to comment #14)
> Lionel Elie Mamane committed a patch related to this issue.
> It has been pushed to "master":
> 
> http://cgit.freedesktop.org/libreoffice/core/commit/?id=04651da19cbd755c2d9a7d399781433c05f9cb97
> 
> fdo#61725 workaround

Sorry, this was supposed to be a local patch only. Not supposed to be pushed. Reverted it now.
Comment 16 Commit Notification 2013-03-16 08:52:57 UTC
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=61a194fbc1bd2624d46ccd2d71e5f1422ef522f1

Revert "fdo#61725 workaround"



Comment 17 Michael Meeks 2013-06-05 09:33:11 UTC
I don't have a dbgutil build to hand; if someone does, I'd be happy to go through the trace and try to unwind who should be taking the solar mutex & where :-)
Comment 18 Julien Nabet 2013-06-05 09:40:06 UTC
Michael: I can give it a new try after my day time job.
Do you need me to run specific commands, or should I just retrieve a bt with symbols?
Comment 19 Lionel Elie Mamane 2013-06-05 16:20:12 UTC
In libreoffice-4-1 branch:

 - Can't reproduce with "report execution": does not abort anymore
 - Can't test with "report creation wizard" because of bug 65168
Comment 20 Julien Nabet 2013-06-05 19:52:26 UTC
Created attachment 80369
console + a bt with symbols on master sources (clang+gcc)

On pc Debian x86-64,
1) clang (git retrieved some days ago) + master sources updated today
2) gcc (Debian 4.7.3-4) + master sources updated today
I retrieved a lot of traces + 1 bt

(BTW: clang is very fast, x2 compared to gcc ?)

If you need more info about my config (eg: autogen.input) or would like other tests, just tell me.
Comment 21 Lionel Elie Mamane 2013-06-06 04:47:37 UTC
Since neither Julien nor I can reproduce this anymore, closing as fixed.

@julien: the crash on "cancel" is a different problem... Could you please file it as its own bug? Thanks.
Comment 22 Commit Notification 2013-07-03 19:41:33 UTC
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c63b74d22d360893bb9e1200f59099ffb7943705

fdo#61725 add SolarMutex until it works
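
For readers unfamiliar with the pattern the commit title refers to, taking a global mutex at an entry point is usually done with a scoped guard. The following is a minimal sketch of that general technique, assuming a plain `std::recursive_mutex`; it is not the content of the commit above, and `g_appMutex`, `g_lockDepth`, `AppMutexGuard`, `updateSharedState` and `wizardEntryPoint` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical global lock standing in for an application-wide mutex.
std::recursive_mutex g_appMutex;
// Nesting depth, modified only while g_appMutex is held. Reading it as a
// "is it locked?" check is a simplification that is only valid in this
// single-threaded demonstration.
int g_lockDepth = 0;

// Scoped guard: locks in the constructor, unlocks in the destructor.
class AppMutexGuard {
public:
    AppMutexGuard() : m_lock(g_appMutex) { ++g_lockDepth; }
    ~AppMutexGuard() { --g_lockDepth; } // runs before m_lock releases the mutex
    AppMutexGuard(const AppMutexGuard&) = delete;
    AppMutexGuard& operator=(const AppMutexGuard&) = delete;
private:
    std::lock_guard<std::recursive_mutex> m_lock;
};

// Hypothetical function that must only be called with the lock held.
int updateSharedState(int value) {
    assert(g_lockDepth > 0 && "app mutex not locked");
    return value + 1;
}

// Hypothetical entry point: takes the guard first, so the callee's
// assertion holds for the whole call.
int wizardEntryPoint(int value) {
    AppMutexGuard guard;
    return updateSharedState(value);
}
```

Because the mutex is recursive, an entry point that adds a guard stays safe even when its caller already holds the lock, which is why guards can be added at several call sites until the assertion stops firing.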


