Since commit

commit b34b8d3372364b3c5043da0357ec69505e8d8602
Author: Tor Lillqvist <tml@iki.fi>
Date:   Thu Feb 28 19:39:00 2013 +0200

    I think this is such a serious problem that an assert() is in order

    Change-Id: If4273ba0b0a95d314e346e26ce092b108214d898

a non-locked SolarMutex is an "abort and dump core" condition instead of only a warning.

This completely breaks ReportBuilder:
 - report execution
 - report creation wizard
since ReportBuilder has been triggering this warning for several LibreOffice versions already.

(It also blocks work on other ReportBuilder bugs unless one locally reverts that commit.)

Reproduction instructions:

1) Report execution
 - Open attachment minimalReportBuilder.odb
 - In the left pane, click "Reports"
 - In the lower right pane, double-click "Contacts" (or right-click for the menu and choose "Open")

2) Report creation wizard
 - Open attachment minimalReportBuilder.odb
 - In the left pane, click "Reports"
 - In the upper right pane, double-click "Use Wizard to create report"
 - In "Available Fields", double-click on ContactID; it goes to "Fields in report"
 - Click "Next"

Observed behaviour: failed assert and abort because SolarMutex not locked.
Does that mean that ReportBuilder used to work only by luck then? Or that the condition which I turned into an assert isn't so serious after all? Sigh, I hate this OSL_ENSURE style of "assertions" that aren't actually that serious, or maybe they are, depending on the case. The code is full of such crack.

The *fact* is that in the Android port, some serious and randomish crashes I had in an experimental app went away when I made some changes to the app code and the "SolarMutex not locked" warning stopped being displayed. Of course it is very easy to miss such a warning (among the tons of debug output), so the best way to make sure it doesn't fire is to turn it into an actual assert()... Then one *must* debug the actual location where the assertion fires and fix whatever is wrong.

Anyway, sure, I am prepared to revert that commit (or make the assert() used on non-traditional OSes only), but I would like to have input from some expert here, sberg perhaps?
Anyway, please note that this SolarMutex testing is done only in a dbgutil build, which by definition includes code intended to make logic errors in the code more obvious and to help debugging. So surely ReportBuilder should then be *debugged* to find out why this assertion (or the equally serious, but unfortunately easy to ignore, "SolarMutex not locked" warning) fires.
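For illustration only, here is a minimal standalone sketch of the difference between a warn-only check and a hard assert() on "is the mutex held by this thread?". This is not the LibreOffice code: `OwnedMutex`, `checkWarnOnly` and `checkStrict` are hypothetical names invented for the example, and the warning text is made up.

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <mutex>
#include <sstream>
#include <thread>

// Hypothetical stand-in for a global recursive mutex that remembers its owner.
class OwnedMutex {
public:
    void lock() {
        mutex_.lock();
        owner_.store(std::this_thread::get_id());
        ++depth_;  // only touched while mutex_ is held
    }
    void unlock() {
        if (--depth_ == 0)
            owner_.store(std::thread::id());  // no owner any more
        mutex_.unlock();
    }
    bool heldByCurrentThread() const {
        return owner_.load() == std::this_thread::get_id();
    }
private:
    std::recursive_mutex mutex_;
    std::atomic<std::thread::id> owner_{std::thread::id()};
    int depth_ = 0;
};

// Warn-only style: logs and carries on, so the problem is easy to overlook.
bool checkWarnOnly(const OwnedMutex& m, std::ostream& log = std::cerr) {
    if (m.heldByCurrentThread())
        return true;
    log << "warn: mutex not locked\n";
    return false;
}

// Strict style: aborts in builds where assert() is active (NDEBUG not defined).
void checkStrict(const OwnedMutex& m) {
    assert(m.heldByCurrentThread() && "mutex not locked");
}
```

With the warn-only variant a caller that forgot to lock keeps running and only leaves a line in the debug output; with the strict variant the same caller stops at the exact location of the mistake, which is the trade-off discussed above.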
Created attachment 75819: reproduction example
I'd be delighted for ReportBuilder to be debugged and for someone to find out why this warning/assert fires, but I have no clue myself. I'm willing to collaborate with someone on this (one Base "expert" and one SolarMutex expert together).
I could not get the ReportBuilder thing to run so far that I would have come across the assert... I get some java.lang.IncompatibleClassChange exception error once I managed to install the report builder extension in a freshly built master LO. I am not really a Java person...
(In reply to comment #5)
> I could not get the ReportBuilder thing to run so far that I would have come
> across the assert... I get some java.lang.IncompatibleClassChange exception
> error

I (and Julien Nabet, see comment 8 of bug 61564) get the assert *before* the Java error (we have to comment out the assert to get the Java error). What a mess...
I don't know if there could be a link, but when launching Base (with a brand new LO profile and with master sources updated today), I have this:

warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:369: ODsnTypeCollection::implDetermineType : missing the second colon !
warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:369: ODsnTypeCollection::implDetermineType : missing the second colon !
warn:vcl.control:22802:1:vcl/source/control/button.cxx:2357: No new-style group set on radiobutton, using old-style digging around
warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:369: ODsnTypeCollection::implDetermineType : missing the second colon !
warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:344: ODsnTypeCollection::implDetermineType : missing the colon !
warn:legacy.osl:22802:1:dbaccess/source/core/misc/dsntypes.cxx:344: ODsnTypeCollection::implDetermineType : missing the colon !

(I don't even try to open a Base file or something, just launching Base)
I have seen many bug reports in the past about crashes or freezes caused by report builder. It is quite possible that it works only by chance and this might help to track down the root of the problem. Well, I do not know how to debug this.
(In reply to comment #7)
> I don't know if there could be a link but when launching Base (with a brand
> new LO profile and with master sources updated today), I have this:

> (I don't even try to open a Base file or something, just launching Base)

How do you launch Base without opening/creating a Base file?
(In reply to comment #5)
> I could not get the ReportBuilder thing to run so far that I would have come
> across the assert... I get some java.lang.IncompatibleClassChange exception
> error

That is now solved.

> once I managed to install the report builder extension in a freshly
> built master LO.

Unless you configured with one of:
 --disable-extension-integration
 --disable-ext-report-builder
 --without-java
there should be nothing to install?
I can no longer reproduce with the "report execution" steps, but can still reproduce with the "report creation wizard" steps.
(In reply to comment #9)
> How do you launch Base without opening/creating a Base file?

From the console, in the LO sources directory:
cd install/program
. ./ooenv
./soffice.bin --base

If I do this, I get the logs quoted in my comment 7, even though I haven't opened or created a Base file yet.

Of course, I can retest this tonight after my daytime job. I ran "./g pull -r && make clean && make dev-install" before going to the office this morning.
(In reply to comment #12)
> (In reply to comment #9)
>> How do you launch Base without opening/creating a Base file?

> From console in LO sources directory:
> cd install/program
> . ./ooenv
> ./soffice.bin --base

Ah, the Database Wizard; I see.
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=04651da19cbd755c2d9a7d399781433c05f9cb97

fdo#61725 workaround

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
(In reply to comment #14)
> Lionel Elie Mamane committed a patch related to this issue.
> It has been pushed to "master":
>
> http://cgit.freedesktop.org/libreoffice/core/commit/?id=04651da19cbd755c2d9a7d399781433c05f9cb97
>
> fdo#61725 workaround

Sorry, this was supposed to be a local patch only. Not supposed to be pushed. Reverted it now.
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=61a194fbc1bd2624d46ccd2d71e5f1422ef522f1

Revert "fdo#61725 workaround"

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
I don't have a dbgutil build to hand; if someone does, I'd be happy to go through the trace and try to unwind who should be taking the solar mutex & where :-)
Michael: I can give it another try after my daytime job. Do you need me to run specific commands, or just to retrieve a bt with symbols?
In libreoffice-4-1 branch:
 - Can't reproduce with "report execution": does not abort anymore
 - Can't test with "report creation wizard" because of bug 65168
Created attachment 80369: console + a bt with symbols on master sources (clang+gcc)

On pc Debian x86-64,
1) clang (git retrieved some days ago) + master sources updated today
2) gcc (Debian 4.7.3-4) + master sources updated today

I retrieved a lot of traces + 1 bt.
(BTW: clang is very fast, x2 compared to gcc?)

If you need more info about my config (e.g. autogen.input) or would like other tests, just tell me.
Since neither Julien nor I can reproduce this anymore, closing as fixed.

@julien: the crash on "Cancel" is a different problem... Could you please fork it into its own bug? Thanks.
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c63b74d22d360893bb9e1200f59099ffb7943705

fdo#61725 add SolarMutex until it works

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
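For readers unfamiliar with the pattern, a rough standalone sketch of what "adding the mutex" at an entry point usually amounts to: an RAII guard is taken before calling into code that checks the lock. This is not the content of the commit above; every name here (`sketch::Guard`, `updateWidget`, `onFieldSelected`) is hypothetical, and a plain `std::recursive_mutex` stands in for the SolarMutex.

```cpp
#include <cassert>
#include <mutex>

namespace sketch {

std::recursive_mutex g_mutex;   // stand-in for the global mutex (assumption)
thread_local int t_depth = 0;   // how many guards this thread currently holds

// RAII guard: locks on construction, unlocks on destruction.
struct Guard {
    Guard() { g_mutex.lock(); ++t_depth; }
    ~Guard() { --t_depth; g_mutex.unlock(); }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
};

bool isLockedByThisThread() { return t_depth > 0; }

// Stand-in for UI code that insists on the lock being held by its caller.
int updateWidget(int value) {
    assert(isLockedByThisThread() && "mutex not locked");
    return value + 1;
}

// Entry point (e.g. a listener callback): takes the guard before calling in.
int onFieldSelected(int value) {
    Guard guard;
    return updateWidget(value);
}

} // namespace sketch
```

Because the mutex is recursive, taking the guard at an entry point is harmless when the caller already holds the lock, which is why sprinkling guards "until it works" can silence the assert; it does not by itself show which caller should have owned the lock in the first place.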