Description: In LibreOffice 6.1.2.1, when I create a new ANOVA by Menu/Data/Statistics/ANOVA, if the "Results to" field is on a different sheet than the original data, then the Sum of Squares" value for the "Within Groups" comparison is incorrect. Specifically, the formula for this cell doesn't specify the sheet for the cells referred to. For my data, the formula for this cell was this: =SUM(DEVSQ($A$2:$A$30),DEVSQ($B$2:$B$30),DEVSQ($C$2:$C$30),DEVSQ($D$2:$D$30)) when it should have been this: =SUM(DEVSQ($Sheet3.$A$2:$A$30),DEVSQ($Sheet3.$B$2:$B$30),DEVSQ($Sheet3.$C$2:$C$30),DEVSQ($Sheet3.$D$2:$D$30)) To the user, this usually appears as a "#DIV/0" error, because $A$2:$A$30 is partially empty in the current ANOVA results sheet. Instead the formula needs to refer to $A$2:$A$30 in the data sheet. Steps to Reproduce: 1. Select data 2. Menu/Data/Statistics/ANOVA 3. Set "Results to" field of dialog to any cell on a different sheet than the original data. Actual Results: The "Sum of Squares" (SS) value for the "Within Groups" comparison is "#DIV/0". The formula refers to the sheet the ANOVA is found on. Expected Results: The formula refers to the sheet the originally selected data is found on. Reproducible: Always User Profile Reset: No Additional Info:

Please rephrase the title, to explain in breve the issue.

We need a simple (very simple) test document. Marking as NEEDINFO, once attachment is added set to UNCONFIRMED. Thank you.

Created attachment 148180 [details] Example of ANOVA calculation problem This is an example sheet illustrating the problem. Data is in Sheet1.b1:d30 ANOVA (single factor) was performed with Sheet1.G2 as the output start cell: the generated formula for total variation in Sheet1.H13 is incorrect, producing Err:508 - it should simply be SUM(Sheet1.H11:H12) ANOVA (single factor) was performed again with Sheet2.B1 as the output start cell. Cells C11 and C12 on Sheet2 (highlighted in yellow) contain incorrect formulae, as specified above and in the original comment. This echoes through to dependent cells. ANOVA was also performed on a two factor basis with Sheet2.B19 as the destination. This appears to work correctly.

This should be a very easy fix for anyone who has worked on Scaddins: most of the formulae produced by ANOVA are correct, so there is already a good template for the two that need amending. An alternative for the Total cell - Sheet2.C12 in the example file is to replace =DEVSQ($Sheet1.$B$1:$B$30,$Sheet1.$C$1:$C$30,$Sheet1.$D$1:$D$30) with =DEVSQ(($Sheet1.$B$1:$B$30),($Sheet1.$C$1:$C$30),($Sheet1.$D$1:$D$30)) only the extra brackets need inserting - the range and sheet references are correct although =SUM($Sheet2.$C$10:$C$11) is simpler as a calculation for the same result (In this context Sheet2=OutputSheet Sheet1=DataSheet) Sondra has nailed the amendment needed to the "Withing groups" calculation - refer to the data input sheet in the formula references, rather than omitting it.

I have discovered that the Err508 problem has been addressed: https://bugs.documentfoundation.org/show_bug.cgi?id=80583 That still leaves the problem for output on a different sheet to the input data. The discussion at 80583 is helpful to someone seeking to resolve this bug, because it refers to the code location.

Relevant code is at /sc/source/ui/StatisticsDialogs/AnalysisOfVarianceDialog.cxx Link: https://gerrit.libreoffice.org/plugins/gitiles/core/+/8e1f198ae521c4b99ce5e8b03bf2c72849badbd2/sc/source/ui/StatisticsDialogs/AnalysisOfVarianceDialog.cxx

More specifically, line 309 here: https://gerrit.libreoffice.org/plugins/gitiles/core/+/8e1f198ae521c4b99ce5e8b03bf2c72849badbd2/sc/source/ui/StatisticsDialogs/AnalysisOfVarianceDialog.cxx#309

Dennis Francis, author of the new regression statistics interface to be released in 6.2, seems to think that his version (which also provides ANOVA results) works OK: https://dennisfrancis.wordpress.com/2018/07/03/multivariate-regression-in-calc/#comment-56 He has a slightly different approach to coding, as revealed here: https://cgit.freedesktop.org/libreoffice/core/commit/?id=b7a02f2bb66b990289eb7f4dc80069d1545179a4

Hi Dennis Francis, I thought you could be interested in this issue... Is it already implemented by your multivariate regression work ?

The problem is at https://opengrok.libreoffice.org/xref/core/sc/source/ui/StatisticsDialogs/AnalysisOfVarianceDialog.cxx?r=470a91cb#70 where the ScRefFlags flag is hard-coded to ScRefFlags::RANGE_ABS. Instead it should be decided based on whether the input and output ranges/addresses are in same sheet or not - if they are in the same sheet then ScRefFlags::RANGE_ABS should be fine, else ScRefFlags::RANGE_ABS_3D should be used. I think this bug can be marked as an easy-hack for the new contributors with the above code-pointer to help them.

(In reply to Xisco Faulí from comment #9) > Hi Dennis Francis, > I thought you could be interested in this issue... > Is it already implemented by your multivariate regression work ? Hi Xisco, In the regression dialog, it checks if result sheet is same as the input sheet and uses this info to format the addresses/ranges in all formulae written to the result sheet.

Hi, I'm quite new here, but if I understand it correctly, the only difference between the ScRefFlags::RANGE_ABS and ScRefFlags::RANGE_ABS_3D is just that the ScRefFlags::RANGE_ABS_3D adds $Sheet to the source cell address. So, wouldn't it be enough to just hard code the value to ScRefFlags::RANGE_ABS_3D and let it add the $Sheet to the address every time? The other ANOVA formulas also reference the sheet every time (even if the results are on the same sheet as the sources).

(In reply to Jaromír Wysoglad from comment #12) > Hi, I'm quite new here, but if I understand it correctly, the only > difference between the ScRefFlags::RANGE_ABS and ScRefFlags::RANGE_ABS_3D is > just that the ScRefFlags::RANGE_ABS_3D adds $Sheet to the source cell > address. So, wouldn't it be enough to just hard code the value to > ScRefFlags::RANGE_ABS_3D and let it add the $Sheet to the address every > time? The other ANOVA formulas also reference the sheet every time (even if > the results are on the same sheet as the sources). Yes sure, here it does not matter if you hardcode the *3D flags as you say. For the purposes of fixing the bug it is just fine. Generally it is better to use *3D sparingly when input and output sheets are same. Consider a very common practical use-case where the user wants to do same analysis (especially with a dialog with many controls) on say 10 sheets where in each sheet the input data is different but at the same location and has same shape. Using the dialog the user can get the results for sheet1 on sheet1 and then simply copy the output range to all other sheets at same relative location instead of using the dialog 10 times !

(In reply to Dennis Francis from comment #13) > Yes sure, here it does not matter if you hardcode the *3D flags as you say. > For the purposes of fixing the bug it is just fine. > > Generally it is better to use *3D sparingly when input and output sheets are > same. Consider a very common practical use-case where the user wants to do > same analysis (especially with a dialog with many controls) on say 10 sheets > where in each sheet the input data is different but at the same location and > has same shape. Using the dialog the user can get the results for sheet1 on > sheet1 and then simply copy the output range to all other sheets at same > relative location instead of using the dialog 10 times ! I now understand the benefits of not always using the *3D flag, so I made it to determine if it is needed as you originally sugested.

(In reply to Dennis Francis from comment #13) > Generally it is better to use *3D sparingly when input and output sheets are > same. Consider a very common practical use-case where the user wants to do > same analysis (especially with a dialog with many controls) on say 10 sheets > where in each sheet the input data is different but at the same location and > has same shape. Using the dialog the user can get the results for sheet1 on > sheet1 and then simply copy the output range to all other sheets at same > relative location instead of using the dialog 10 times ! As is, that wouldn't work anyway because all other generated formulas pointing into the ANOVA output written use 3D references. So if we wanted to be able to copy&paste to a different sheet all those references would need to be written without 3D.

Jaromir Wysoglad committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/+/66c08a0d308fd6119460546a906434bbc9d23c2e%5E%21 tdf#121103 ANOVA sum of squares formula correction It will be available in 6.3.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.

Pending review https://gerrit.libreoffice.org/70153 for 6-2

https://gerrit.libreoffice.org/70155 for 6-1

Jaromir Wysoglad committed a patch related to this issue. It has been pushed to "libreoffice-6-2": https://git.libreoffice.org/core/+/205f9756f70fe32db169d6f81795f4c66382d014%5E%21 tdf#121103 ANOVA sum of squares formula correction It will be available in 6.2.4. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.

Jaromir Wysoglad committed a patch related to this issue. It has been pushed to "libreoffice-6-1": https://git.libreoffice.org/core/+/1dc8088162a66a99217dae3258711cbfa8f9bc28%5E%21 tdf#121103 ANOVA sum of squares formula correction It will be available in 6.1.6. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.

So now that the actual bug is fixed, one could work on a non-3D vs 3D references change, i.e. using 3D references to data only when the calculation is on a different sheet, and within the ANOVA formulas use only non-3D references.

Jaromir Wysoglad committed a patch related to this issue. It has been pushed to "libreoffice-6-2-3": https://git.libreoffice.org/core/+/7708861765dae49a60386774970db21e598be7c1%5E%21 tdf#121103 ANOVA sum of squares formula correction It will be available in 6.2.3. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.