Bug 127517 - Libreoffice Calc Trendline R^2 value calculation goes wrong when trendline is forced through 0
Summary: Libreoffice Calc Trendline R^2 value calculation goes wrong when trendline is...
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Chart (show other bugs)
Version:
(earliest affected)
4.4.7.2 release
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Chart-Trendlines
  Show dependency treegraph
 
Reported: 2019-09-12 10:26 UTC by mohr.christoph
Modified: 2020-08-31 07:23 UTC (History)
6 users (show)

See Also:
Crash report or crash signature:
Regression By:


Attachments
Example of wrongl R^2 calculation (17.94 KB, application/vnd.oasis.opendocument.spreadsheet)
2019-10-04 08:31 UTC, mohr.christoph
Details
Screenshot of error in 6.4 (126.03 KB, image/png)
2020-01-31 11:19 UTC, mohr.christoph
Details

Note You need to log in before you can comment on or make changes to this bug.
Description mohr.christoph 2019-09-12 10:26:10 UTC
Description:
When plotting a scatter plot and forcing a trendline through zero, the R^2 value is wrong.
I created a screenshot as an example:
In Libreoffice Calc you get those lines and values: https://i.imgur.com/9ZMpKrV.png
The R^2 value of the forced trendline is 0.95 and a lot better than the R^2 value of the unforced line (0.779) 
I plotted the same thing in Microsoft excel:https://i.imgur.com/tdYMbQL.png
As you can see the equations are the same, the R^2 value for the unforced line too, but excel calculates R^2 with the forced trendline correctly 

Steps to Reproduce:
1. Create Scatter Plot
2. Fit a trendline and force it through zero
3. Compare to eg Microsoft excel results
4. See that Libreoffice calc has the wrong value

Actual Results:
Results: R^2 is wrong

Expected Results:
Expected: Correct R^2 value


Reproducible: Always


User Profile Reset: Yes



Additional Info:
Comment 1 Xisco Faulí 2019-10-03 10:58:14 UTC
Thank you for reporting the bug. Please attach a sample document, as this makes it easier for us to verify the bug. 
(Please note that the attachment will be public, remove any sensitive information before attaching it. 
See https://wiki.documentfoundation.org/QA/FAQ#How_can_I_eliminate_confidential_data_from_a_sample_document.3F for help on how to do so.)

I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the requested document is provided.
Comment 2 mohr.christoph 2019-10-04 08:31:34 UTC
Created attachment 154749 [details]
Example of wrongl R^2 calculation

Sample file as requested. The correct R^2 value is also included in the file.
Comment 3 mohr.christoph 2020-01-31 11:19:26 UTC
Created attachment 157556 [details]
Screenshot of error in 6.4

Bug is still present. It really is a big problem as trend lines and R^2 values are widely used!
Comment 4 Buovjaga 2020-05-02 16:25:34 UTC
I confirm R^2 value is not 0,9373 in the lower graph
Comment 5 Leyan 2020-08-30 15:50:26 UTC
I had a look in the code, this is deliberate. A specific computation was introduced to reproduce the behavior of the LINEST function, by replacing the difference to the average by the difference to the constant to intercept:

* The original bug is here: https://bugs.documentfoundation.org/show_bug.cgi?id=73374
* The commit made to solve it is here: https://github.com/LibreOffice/core/commit/9e67015be6e9e53e19466117d692de17c18463da

The LINEST function also gives R2=0.985148.

In Excel (Office365), the LINEST function answers 0.985148 and the chart R2 is 0.9373.
Comment 6 Buovjaga 2020-08-30 15:55:53 UTC
(In reply to Leyan from comment #5)
> I had a look in the code, this is deliberate. A specific computation was
> introduced to reproduce the behavior of the LINEST function, by replacing
> the difference to the average by the difference to the constant to intercept:
> 
> * The original bug is here:
> https://bugs.documentfoundation.org/show_bug.cgi?id=73374
> * The commit made to solve it is here:
> https://github.com/LibreOffice/core/commit/
> 9e67015be6e9e53e19466117d692de17c18463da
> 
> The LINEST function also gives R2=0.985148.
> 
> In Excel (Office365), the LINEST function answers 0.985148 and the chart R2
> is 0.9373.

Laurent: can you comment on this?
Comment 7 Laurent BP 2020-08-30 16:28:52 UTC
(In reply to Buovjaga from comment #6)
> Laurent: can you comment on this?
Excel chart is wrong ;-)
Scientific article Attachment 91622 [details] of bug 73374 explain why Excel calculation is wrong: it can give negative R²
In case of forced intercept, a different calculation must be used, as LINEST function does (in LibO and Excel).
Comment 8 mohr.christoph 2020-08-31 07:23:30 UTC
Thank you for taking a look and linking to the paper. Turns out my excel using colleague have wrong numbers. This is still unexpected behavior but als libreoffice is mathematically correct there is nothing that can be done. Thanks again, I will read up on the mathematics and correct my calculations. :)