Bug 143534 - Crash in Calc NLP Solver when saving a document in Writer.
Summary: Crash in Calc NLP Solver when saving a document in Writer.
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 7.0.6.2 release
Hardware: x86-64 (AMD64) macOS (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.3.0 target:7.2.1
Keywords: haveBacktrace, needsDevEval
Depends on:
Blocks: Solver
Reported: 2021-07-25 09:06 UTC by Todor Balabanov
Modified: 2021-08-13 13:55 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:
Regression By:


Attachments
First crash report dialog box. (45.29 KB, image/png)
2021-07-25 11:23 UTC, Todor Balabanov
Second crash report dialog box. (50.12 KB, image/png)
2021-07-25 11:24 UTC, Todor Balabanov
Third crash report dialog box. (38.13 KB, image/png)
2021-07-25 11:24 UTC, Todor Balabanov
Calc file with the model for the optimization. (46.24 KB, application/vnd.oasis.opendocument.spreadsheet)
2021-07-25 12:50 UTC, Todor Balabanov
Dummy Writer file created, edited, saved, and closed during the optimization. (8.31 KB, application/vnd.oasis.opendocument.text)
2021-07-25 12:51 UTC, Todor Balabanov
bt with debug symbols (6.57 KB, text/plain)
2021-07-25 19:29 UTC, Julien Nabet
bt with debug symbols (5.84 KB, text/plain)
2021-07-31 15:56 UTC, Julien Nabet

Description Todor Balabanov 2021-07-25 09:06:46 UTC
Description:
When the Calc NLP Solver is started for a non-linear optimization that takes more than an hour (DE-PSO agent), it works as expected. In parallel, a Writer document is opened, edited, saved, and closed. When the optimization finishes and the "Keep Results" button is pressed, Calc crashes. Reopening LibreOffice shows the document recovery dialog. The optimization results survive the crash, but because all of LibreOffice goes down, other open documents are affected as well. I have looked at Calc's nlpsolver module and I see some document lock instructions. I suspect that this locking may be causing the crash.

Steps to Reproduce:
1. Open Calc and start the NLP Solver with the DE-PSO agent for a long-running optimization.
2. Open a Writer document, write something, save the document, and close it.
3. Wait for the optimization to finish and press the "Keep Results" button.

Actual Results:
General crash of LibreOffice. 

Expected Results:
The optimization results are kept and the Calc document remains editable and savable.


Reproducible: Always


User Profile Reset: No



Additional Info:
Version: 7.0.6.2
Build ID: 144abb84a525d8e30c9dbbefa69cbbf2d8d4ae3b
CPU threads: 4; OS: Mac OS X 10.13.6; UI render: default; VCL: osx
Locale: en-US (en_BG.UTF-8); UI: en-US
Calc: threaded
Comment 1 Julien Nabet 2021-07-25 09:25:07 UTC
Would it be possible for you to retrieve a stack trace (see https://wiki.documentfoundation.org/QA/BugReport/Debug_Information#macOS:_How_to_get_debug_information)?

Also, do you have an ods file (and ideally a minimal step-by-step procedure) to reproduce this?
Comment 2 Todor Balabanov 2021-07-25 11:23:29 UTC
Created attachment 173831 [details]
First crash report dialog box.
Comment 3 Todor Balabanov 2021-07-25 11:24:03 UTC
Created attachment 173832 [details]
Second crash report dialog box.
Comment 4 Todor Balabanov 2021-07-25 11:24:45 UTC
Created attachment 173833 [details]
Third crash report dialog box.
Comment 5 Todor Balabanov 2021-07-25 11:27:19 UTC
(In reply to Julien Nabet from comment #1)
> Would it be possible you retrieve a stacktrace (see
> https://wiki.documentfoundation.org/QA/BugReport/Debug_Information#macOS:
> _How_to_get_debug_information)?
> 
> Also, would you have a ods file (and ideally minimal step by step process)
> to reproduce this?

It seems this is not a regular macOS application crash. I was not given a crash report dialog box. I took screenshots of the dialog boxes shown during the crash.
Comment 6 Julien Nabet 2021-07-25 11:49:45 UTC
argh, there's no stacktrace in the screenshots :-(
Would it be possible for you to attach the ods file?
Of course, don't forget to sanitize it (see https://wiki.documentfoundation.org/QA/Bugzilla/Sanitizing_Files_Before_Submission)
Comment 7 Todor Balabanov 2021-07-25 12:48:01 UTC
(In reply to Julien Nabet from comment #6)
> argh, there's no stacktrace in the screenshots :-(
> Would it be possible you attach the ods file?
> Of course, don't forget to sanitize it (see
> https://wiki.documentfoundation.org/QA/Bugzilla/
> Sanitizing_Files_Before_Submission)

Only these dialogs were shown during the crash. I took a screenshot of each one.

Sure, I will upload the working documents. All the information inside them is public.
Comment 8 Todor Balabanov 2021-07-25 12:50:04 UTC
Created attachment 173834 [details]
Calc file with the model for the optimization.

The NLP Solver does not keep meta information about the optimization model, so the target cell and the variable cells are marked with background colors.
Comment 9 Todor Balabanov 2021-07-25 12:51:56 UTC
Created attachment 173835 [details]
Dummy Writer file created, edited, saved, and closed during the optimization.

It is a dummy file. Any other file can be created, edited, saved, and closed during the optimization process in order to reproduce the crash.
Comment 10 Julien Nabet 2021-07-25 17:09:40 UTC
Now that you've provided the file, could you give a minimal step-by-step procedure to launch the NLP Solver, indicating each parameter/value/option you chose to trigger a long run, so people can give it a try?
Comment 11 Todor Balabanov 2021-07-25 18:06:11 UTC
OK. I will describe the minimal steps. I have tried the same on Windows, and it created a crash report:

https://crashreport.libreoffice.org/stats/crash_details/9f53626a-95d0-4ee1-9a25-31480f63b6ea
Comment 12 Todor Balabanov 2021-07-25 19:13:04 UTC
I have made a short video of this crash: https://youtu.be/HxBAKCE1nwg
Comment 13 Julien Nabet 2021-07-25 19:29:40 UTC
Created attachment 173840 [details]
bt with debug symbols

On a Debian x86-64 PC with master sources updated today (+enable-dbgutil), I got a crash when clicking the "Solve" button to launch the process; there was no need to use Writer or a dummy file.
Comment 14 Julien Nabet 2021-07-25 19:30:30 UTC
I don't know if it's the same problem, but let's set this one to NEW since there's something wrong here.

Stephan: thought you might be interested in this one.
Comment 15 Todor Balabanov 2021-07-25 19:42:27 UTC
My LibreOffice installation on Windows is an older version than the one I am using on the Mac. The crash is similar on both platforms. This is my Windows LibreOffice info:

Version: 6.3.3.2 (x64)
Build ID: a64200df03143b798afd1ec74a12ab50359878ed
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: CL
Comment 16 Stephan Bergmann 2021-07-26 12:25:37 UTC
(In reply to Julien Nabet from comment #14)
> Stephan: thought you might be interested in this one.

shrug; as long as the reproduction steps are not written down here I'm more than unlikely to take a look
Comment 17 Julien Nabet 2021-07-26 12:46:59 UTC
Step-by-step procedure to reproduce the problems:
- retrieve https://bugs.documentfoundation.org/attachment.cgi?id=173834 and open the file
- click on cell M2
- select Tools/Solver
- select radio button "Minimum"
- click just to the right of "By changing cells"
- type this:
$G$2:$I$9;$J$2:$K$2
- click the "Options" button and check that "DEPS Evolutionary Algorithm" is selected (it should be the default)
- click "Ok" to close the "Options" dialog
- click the "Solve" button
=> just a few seconds later, I got a crash
but if you don't get one, you can carry on and try to reproduce Todor's initial crash:
- let the solver keep running and open a brand-new odt file
- type anything
- save with "Save As" and choose any filename
- close the odt file
=> LO goes back to the Calc window with the solving process
- click the "stop" button
- click the "Ok" button
=> crash
Comment 18 Todor Balabanov 2021-07-26 13:28:04 UTC
Thanks, Julien.

It is possible that there is a new bug in the latest version of the source code. What I see is that working in Writer in parallel with an optimization running in Calc leads to crashes in older versions of LO. In my case, if I do not work in Writer, the optimization runs smoothly and without crashes.
Comment 19 Stephan Bergmann 2021-07-26 14:30:59 UTC
(In reply to Julien Nabet from comment #17)
> Step by step process to reproduce the pbs:
> - retrieve https://bugs.documentfoundation.org/attachment.cgi?id=173834 and
> open thefile
> - click on cell M2
> - select Tools/Solver
> - select radio button "Minimum"
> - click just at right of "By changing cells"
> - type this:
> $G$2:$I$9;$J$2:$K$2
> - click "Options" button and check you use "DEPS Evolutionary Algorithm" (it
> should be the case by default)
> - Click "Ok" to close "Options" dialog
> - Click on "Solve" button
> => just some seconds after, I got a crash

That is due to

> java.lang.NullPointerException
>         at net.adaptivebox.deps.behavior.PSGTBehavior.generateBehavior(PSGTBehavior.java:100)
>         at net.adaptivebox.deps.DEPSAgent.generatePoint(DEPSAgent.java:126)
>         at com.sun.star.comp.Calc.NLPSolver.DEPSSolverImpl.solve(DEPSSolverImpl.java:169)

which ultimately gets translated into a C++ com::sun::star::uno::RuntimeException (via a BridgeRuntimeError thrown from Bridge::handle_java_exc, bridges/source/jni_uno/jni_uno2java.cxx, and then via the C++ UNO bridge), thrown from within

>     xSolver->solve();

in ScOptSolverDlg::CallSolver (sc/source/ui/miscdlgs/optsolver.cxx), which remains uncaught.
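
To illustrate the failure mode described above, here is a minimal, self-contained sketch of an exception escaping a solve() call versus being handled at the call site. It is an assumption-laden illustration only: FakeSolver, callSolverGuarded, and example are hypothetical names, std::runtime_error merely stands in for com::sun::star::uno::RuntimeException, and none of this is the actual ScOptSolverDlg::CallSolver code or the fix that was later applied.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the solver object; NOT the real UNO XSolver.
struct FakeSolver {
    bool failInside = false;  // simulate the Java-side NullPointerException
    bool success = false;

    void solve() {
        if (failInside) {
            // Plays the role of the RuntimeException that reaches C++.
            throw std::runtime_error("java.lang.NullPointerException");
        }
        success = true;
    }
};

// Guarded call: an exception from solve() is turned into an error string
// instead of propagating out of the caller. Without the try/catch, an
// exception that nobody handles ends in std::terminate().
bool callSolverGuarded(FakeSolver& solver, std::string& error) {
    try {
        solver.solve();
        return solver.success;
    } catch (const std::exception& e) {
        error = e.what();
        return false;
    }
}

// Usage example: the failing solver is reported, not fatal.
inline void example() {
    FakeSolver bad;
    bad.failInside = true;
    std::string error;
    assert(!callSolverGuarded(bad, error));
    assert(error == "java.lang.NullPointerException");
}
```

Whether catching at the call site or fixing the Java-side null pointer is the better remedy is a separate design question; the sketch only shows why an unhandled exception at this boundary takes the whole process down.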
Comment 20 Todor Balabanov 2021-07-26 14:46:22 UTC
Stephan, I think that this is a new bug, which we (mostly me) have introduced in the last two weeks. Please take a look at this video: https://youtu.be/HxBAKCE1nwg

The result of the steps shown in this video is recorded in the crash report: https://crashreport.libreoffice.org/stats/crash_details/9f53626a-95d0-4ee1-9a25-31480f63b6ea

I will fix the bugs with the Java null pointer exceptions. Please do not waste time on them. The crash shown in the video is something more general, and it has been in the code base for longer.
Comment 21 Stephan Bergmann 2021-07-26 14:53:37 UTC
(In reply to Todor Balabanov from comment #20)
> Stephan, I think that this is a new bug, which we (generally me) have
> introduced last two weeks. Please, take a look at this video:
> https://youtu.be/HxBAKCE1nwg
> 
> The result of the steps shown in this video is written in the crash report:
> https://crashreport.libreoffice.org/stats/crash_details/9f53626a-95d0-4ee1-
> 9a25-31480f63b6ea
> 
> I will fix the bugs with the Java null pointer exceptions. Please, do not
> waste time with them. The crash shown in the video is something more general
> and it exists in the code base for a longer time.

Just to adjust expectations:  I quickly looked into this issue after Julien provided both (a) a backtrace that suggested the issue /might/ be related to code I know something about (the C++ UNO bridge), and (b) readable instructions how to reproduce the issue.  I'm more than unlikely to look further into this issue if there are no readable instructions how to reproduce the "real" issue (assuming that there is no good reason to deliver a video instead of written instructions) and there is no strong evidence that the "real" issue might be related to code I know something about.
Comment 22 Julien Nabet 2021-07-26 14:55:25 UTC
(In reply to Stephan Bergmann from comment #19)
> ...
> That is due to
> 
> > java.lang.NullPointerException
> >         at net.adaptivebox.deps.behavior.PSGTBehavior.generateBehavior(PSGTBehavior.java:100)
> >         at net.adaptivebox.deps.DEPSAgent.generatePoint(DEPSAgent.java:126)
> >         at com.sun.star.comp.Calc.NLPSolver.DEPSSolverImpl.solve(DEPSSolverImpl.java:169)
> 
> which ultimately gets translated into a C++
> com::sun::star::uno::RuntimeException (via a BridgeRuntimeError thrown from
> Bridge::handle_java_exc, bridges/source/jni_uno/jni_uno2java.cxx, and then
> via the C++ UNO bridge), thrown from within
> 
> >     xSolver->solve();
> 
> in ScOptSolverDlg::CallSolver (sc/source/ui/miscdlgs/optsolver.cxx), which
> remains uncaught.

Thank you! I wonder how you retrieved this info. I put a System.out.println before and after line 100 of PSGTBehavior.java and I confirm this.


(In reply to Todor Balabanov from comment #20)
> Stephan, I think that this is a new bug, which we (generally me) have
> introduced last two weeks. Please, take a look at this video:
> https://youtu.be/HxBAKCE1nwg
> 
> The result of the steps shown in this video is written in the crash report:
> https://crashreport.libreoffice.org/stats/crash_details/9f53626a-95d0-4ee1-
> 9a25-31480f63b6ea
> 
> I will fix the bugs with the Java null pointer exceptions. Please, do not
> waste time with them. The crash shown in the video is something more general
> and it exists in the code base for a longer time.

I may be wrong, but I think the crash I got must be fixed first so we'll be able to reproduce yours with full debug info.
Comment 23 Todor Balabanov 2021-07-26 15:01:58 UTC
Julien, that sounds reasonable.
Comment 24 Todor Balabanov 2021-07-26 16:36:21 UTC
The same crash in Linux: https://crashreport.libreoffice.org/stats/crash_details/2d5fa6a6-2868-4f8b-8cd8-272a95b749e2
Comment 25 Todor Balabanov 2021-07-26 16:51:29 UTC
It seems the crash in Linux happens because of some locking:

void ScDocShell::LockPaint_Impl(bool bDoc)
{
    if ( !m_pPaintLockData )
        UnlockPaint_Impl(true);                 // now
        UnlockDocument_Impl(0);
    }
}
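
The excerpt above appears to be incomplete, so the following is only a generic sketch of nested lock counting with a lazily created lock object, under the assumption that an unmatched or out-of-order unlock is the kind of failure being suspected here. DocShellSketch, PaintLockData, lockPaint, and unlockPaint are hypothetical names; this is not the ScDocShell implementation and not a diagnosis of the crash.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for paint-lock bookkeeping; not ScPaintLockData.
struct PaintLockData {
    int level = 0;  // number of outstanding lock requests
};

class DocShellSketch {
    std::unique_ptr<PaintLockData> lockData_;  // created on first lock

public:
    // Each lock request increments the nesting level.
    void lockPaint() {
        if (!lockData_)
            lockData_ = std::make_unique<PaintLockData>();
        ++lockData_->level;
    }

    // Returns false on an unmatched unlock instead of dereferencing a
    // null pointer; frees the bookkeeping when the last lock is released.
    bool unlockPaint() {
        if (!lockData_)
            return false;
        assert(lockData_->level > 0);
        if (--lockData_->level == 0)
            lockData_.reset();
        return true;
    }

    int level() const { return lockData_ ? lockData_->level : 0; }
};
```

In this sketch, calling unlockPaint() on a fresh object returns false, two lockPaint() calls followed by two unlockPaint() calls bring level() back to 0, and a third unlockPaint() again returns false rather than touching freed state.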
Comment 26 Julien Nabet 2021-07-31 15:56:18 UTC
Created attachment 174001 [details]
bt with debug symbols

On a Debian x86-64 PC with master sources updated today + https://gerrit.libreoffice.org/c/core/+/119744 to avoid my crash, I could reproduce the crash.
I didn't wait for the solving to finish; I just stopped it and then pressed the Ok button (after having done the Writer part, of course).
Comment 27 Todor Balabanov 2021-07-31 16:08:55 UTC
The null pointer bug was newly introduced and I hope we have solved it.

The LockPaint_Impl bug has been in the code base for much longer. I reproduced it in stable releases of LO on macOS, Ubuntu, and Windows. It exists in many different builds of LO.
Comment 28 Noel Grandin 2021-08-08 16:35:52 UTC
fix for crash here
   https://gerrit.libreoffice.org/c/core/+/120175
Comment 29 Todor Balabanov 2021-08-08 16:52:28 UTC
Great! Only a single line of code. I will test it on my machine.
Comment 30 Commit Notification 2021-08-08 19:10:35 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ec867a13baaa43791d8bacf4e8c1b96aadb6aa8a

tdf#143534 Crash in Calc NLP Solver

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 31 Commit Notification 2021-08-13 13:55:21 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-2":

https://git.libreoffice.org/core/commit/da1c79205777357d2b22626b4985dfcd7e014236

tdf#143534 Crash in Calc NLP Solver

It will be available in 7.2.1.