Created attachment 94778 [details] example-picture of wrong calculated R^2 with trend lines of an x/y-plot. Problem description: Steps to reproduce: 1. take numbers (x, y) and plot them. 2. plot the trend line ans let libreoffice show the correlation coefficient (correct) 3. let the trend line go through zero or another value and get strange R^2's. Current behavior: R^2 can be bigger than 1, if the correlation gets worse. Expected behavior: R^2 is defined as a value between 0 and 1. If the correlation gets worse, the vlaue should become smaller. Operating System: Fedora Version: 4.2.1.1 release
Please provide an ODS that demonstrates the problem. It's always best to provide your data and files so that we can see exactly what you see instead of "take numbers and plot them" -- marking as NEEDINFO. Once you attach the document mark as UNCONFIRMED. Thanks!
Hello e325001, See also Bug 73374 - EDITING: R^2 calculation of trendline with forced intercept is not correct Thank you, Jacques
Created attachment 94829 [details] the .ods file showing the problem Thanks for taking a look into this. Here is an .ods example. And just for your information (because of the other bug you mentioned), I use the following version: Build-ID: 4.2.1.1-1.fc20 Thanks again.
I think you assigned this to yourself on accident - unless you are planning on committing a patch to fix it :)
Created attachment 94831 [details] Test case with R^2 > 1 This test case shows the results that should be calculated given by LINEREG() function. Modify F1 value to modify forced intercept of LINEREG() function. R^2 is calculated in E11.
Please read my previous comment attached to file as follow: This test case shows the results that should be calculated given by *LINEST*() function. Modify F1 value to modify forced intercept of *LINEST*() function. R^2 is calculated in *E12*.
Hello Laurent BP, in your example file the R^2 without forcing an intercept can be calculated correctly (0,405), but if I force the trend line through zero I get R^2= 0,849, and this is a "better" value (both with linest and within the chart), which cannot be the case, correct? And, as I still think, every value smaller 0 or bigger 1 must be false. I have to admit I am no expert, but I googled a bit and found similar problems with exel (just google "r^2 forced zero"). Do you need any more help or confirmation regarding this problem? As written above, I am no expert and do not even understand the Ydec and Y^ in your file. Thanks so far again.
@e325001: thanks for your proposition. (In reply to comment #7) > Hello Laurent BP, > in your example file the R^2 without forcing an intercept can be calculated > correctly (0,405), but if I force the trend line through zero I get R^2= > 0,849, and this is a "better" value (both with linest and within the chart), > which cannot be the case, correct? R^2 value cannot be compared with or without forced intercept. Check attachment 91622 [details] for more details. In first try, I used the formula proposed. > And, as I still think, every value > smaller 0 or bigger 1 must be false. This formula gives wrong results as you mentioned (>1 or <0). So I proposed a new commit https://gerrit.libreoffice.org/#/c/8402/ which use the same formula as LINEST calculate. > I have to admit I am no expert, but I > googled a bit and found similar problems with exel (just google "r^2 forced > zero"). Excel use same formula for R^2 with or without forced intercept and then give false results with forced intercept. Please check Microsoft help about this: http://support.microsoft.com/kb/829249 > Do you need any more help or confirmation regarding this problem? As > written above, I am no expert and do not even understand the Ydec and Y^ in > your file. Ydec is modified value of Y (Ydec=Y - forced_intercept), because LINEST function can only forced intercept through 0. Y^ is estimated value of Y with equation.
Laurent Balland-Poirier committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=00cb825ab3f16a94f1e8311ba0c24f72588e788e fdo#75538 R^2 calculation for trendline similar to LINEST function The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Laurent Balland-Poirier committed a patch related to this issue. It has been pushed to "libreoffice-4-2": http://cgit.freedesktop.org/libreoffice/core/commit/?id=cade56eef6d23aa7dd6e374858cf30ff3fcd163b&h=libreoffice-4-2 fdo#75538 R^2 calculation of trendline similar to LINEST function It will be available in LibreOffice 4.2.3. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Hello, I tested the patch (daily snapshot) and now R^2 is always between 0 and 1, thats good. But the values are strange anyway (?): How is it possible that the R^2 gets better (nearer to 1) if the trend line is forced through a defined value than without forcing it through a value? I thought if it is not forced it takes automatically the "best" possibiliy. My provided .ods file is still valid. If you insert the trendline R^2 is 0,999750 and if you force it through zero R^2 is 0,999896, which is a better value and this is what I cannot understand. Also if you force it through 300 R^2 is 0,014, but if you force it through 600 it is 0,33, which is again a better value, but actually the trendline fits way worse. Maybe I make a mistake, I am no expert, but I think there is still something wrong. Thanks so far :) (In reply to comment #10) > Laurent Balland-Poirier committed a patch related to this issue. > It has been pushed to "libreoffice-4-2": > > http://cgit.freedesktop.org/libreoffice/core/commit/ > ?id=cade56eef6d23aa7dd6e374858cf30ff3fcd163b&h=libreoffice-4-2 > > fdo#75538 R^2 calculation of trendline similar to LINEST function > > > It will be available in LibreOffice 4.2.3. > > The patch should be included in the daily builds available at > http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More > information about daily builds can be found at: > http://wiki.documentfoundation.org/Testing_Daily_Builds > Affected users are encouraged to test the fix and report feedback.
(In reply to comment #11) > Hello, > I tested the patch (daily snapshot) and now R^2 is always between 0 and 1, > thats good. But the values are strange anyway (?): How is it possible that > the R^2 gets better (nearer to 1) if the trend line is forced through a > defined value than without forcing it through a value? I thought if it is > not forced it takes automatically the "best" possibiliy. Please check above my comment 8 and particularly attachment 91622 [details] which explain why you cannot compare R^2 with and without forced intercept. > Also if you force it through 300 R^2 is 0,014, but if you force it through > 600 it is 0,33, which is again a better value, but actually the trendline > fits way worse. I agree with you that forced intercept higher than 324 will result in better R^2. I don't know why, but I don't know better way to calculate R^2. If you have an equation, please let us know. :) > Maybe I make a mistake, I am no expert, but I think there is still something > wrong. > Thanks so far :) As your example is not a common way to use forced intercept, I am closing the bug. If you have another (better) way to calculate R^2, feel free to reopen this bug.