Description: After selecting x and y data for power regression (y = ax^b) the results for a and b are incorrect. Cells for "Intercept" and "LN(X1)" are erroneous. Steps to Reproduce: X Y ------------ 1 2.7 2 4 3 4.4 4 7.1 5 4.9 6 3.6 7 4 8 0.6 9 1 10 4.3 Actual Results: Confidence level 0.99 Coefficients Intercept 1.57125263268399 LN(X1) -0.310308529671727 Expected Results: a = 4.812672931 b = -0.3103085297 Reproducible: Always User Profile Reset: No Additional Info: a = 4.812672931 b = -0.3103085297
I can reproduce the reported result "Intercept = 1.57... LN(X1) = -0.310..." with both 7.0.6 and 7.2/master: Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community Build ID: 18e5e948dd66e41f17b0a63bf631d98aee84a03b CPU threads: 2; OS: Windows 10.0 Build 19041; UI render: Skia/Raster; VCL: win Locale: zh-CN (zh_CN); UI: zh-CN Calc: threaded However I know too little about regression analysis to judge whether this is wrong result.
Please can you attach the example file?
Top menu bar: Data > Statistics > Regression. Clicking on the Help button brings up the example tabular x and y data that I used. The help page also has the correct coefficients for linear, logarithmic and power regression. Also I think the cell with the label "LN(X1)" is mislabeled. In the same help page it states that: "Power regression: Find a power curve in the form of y = a.x^b, where a is the coefficient, b is the power that best fits the data." So there shouldn't be any logarithmic operation performed on X. And further, after a power regression has been performed, the result for "LN(X1)" is in fact b or the intercept. The result for "Intercept" on other hand is a mystery. I don't know how it fits into the equation y = a.x^b (In reply to m.a.riosv from comment #2) > Please can you attach the example file?
(In reply to m.a.riosv from comment #2) > Please can you attach the example file? I don't think an example file is really needed, as this regression analysis feature is based on UI, not formulas. Steps to reproduce: 1. Copy the data given in comment #0 to a sheet. Copy the data from Bugzilla page, in Calc choose menu Edit > Paste Special > Paste Unformatted Text, the Text Import dialog should automatically recognize the data as a ten-row, two-column range, and if not, adjust the "Separator Options" to use "Separated by" and "Tab"; 2. Select the data range, choose menu Data > Statistics > Regression. The first two data boxes, X-range and Y-range, should be preset as A1:A10 and B1:B10 (or similar, if you pasted to different place on the sheet), select a cell for the "Results to:" box, A12 for example, then choose "Power Regression" radio button; 3. Press "OK" button, the regression analysis result should be written to cells starting at A12. Find the Intercept and LN(X1) coefficients in cell B38 and B39, respectively.
The following is the dataset that I was working on and which first alerted me to the problem. X Y ---------- 240 57 280 48 320 40 360 35 400 30 500 25 600 20 700 17 800 14 1000 11.5 1200 9.5 1500 8 2000 6.7 2500 5.5 3000 4 4000 3 6000 2 8000 1.2 I used an online power regression calculator (https://keisan.casio.com/exec/system/14059931777261) to get the correct answers. Libre Calc results: a = -1.03774556216783 b = 9.66940331258679 Online calculator results: a = 15825.903 b = -1.03774556
EXP(1.57125263268399) = 4.81267293100676 And that's exactly what the trend line equation shows, when you insert a trend line in a chart of your data.
(In reply to Ming Hua from comment #4) > (In reply to m.a.riosv from comment #2) > > Please can you attach the example file? > I don't think an example file is really needed, as this regression analysis > feature is based on UI, not formulas. No - you can calculate from data and formulas, and that's why it is important to know, how reporter linearizes the problem.
(In reply to edwardsontan from comment #5) > The following is the dataset that I was working on and which first alerted > me to the problem. > > X Y > ---------- > 240 57 > 280 48 > 320 40 > 360 35 > 400 30 > 500 25 > 600 20 > 700 17 > 800 14 > 1000 11.5 > 1200 9.5 > 1500 8 > 2000 6.7 > 2500 5.5 > 3000 4 > 4000 3 > 6000 2 > 8000 1.2 > > I used an online power regression calculator > (https://keisan.casio.com/exec/system/14059931777261) to get the correct > answers. > > Libre Calc results: > a = -1.03774556216783 > b = 9.66940331258679 > > Online calculator results: > a = 15825.903 > b = -1.03774556 My Result: EXP(9.66940331258679) = 15825.9030884824
Just to make clear, how I calculate and understand the regression: f(x) = a * x^b -> ln (f(x)) = ln (a*x^b) = ln(a) + ln(x^b) = ln(a) + b*ln(x) Now you use function() LINEST on natural logarithm of both axis an you get two coefficients: [1] Slope - which is b [2] Intercept - which is ln(a) Hence you need to exp(ln(a)) to get a.
My conclusion - can't see any bug here.
(In reply to Uwe Auer from comment #9) > Just to make clear, how I calculate and understand the regression: > > f(x) = a * x^b > -> ln (f(x)) = ln (a*x^b) = ln(a) + ln(x^b) = ln(a) + b*ln(x) > > Now you use function() LINEST on natural logarithm of both axis an you get > two coefficients: > > [1] Slope - which is b > [2] Intercept - which is ln(a) > > Hence you need to exp(ln(a)) to get a. Thanks for the explanation. But all of these are not easily deductible from either the Data > Statistics > Regression... dialog or the help page [1]. Some improvement on documentation is probably in order. 1. https://help.libreoffice.org/latest/en-US/text/scalc/01/statistics_regression.html
(In reply to Uwe Auer from comment #9) > Just to make clear, how I calculate and understand the regression: > > f(x) = a * x^b > -> ln (f(x)) = ln (a*x^b) = ln(a) + ln(x^b) = ln(a) + b*ln(x) > > Now you use function() LINEST on natural logarithm of both axis an you get > two coefficients: > > [1] Slope - which is b > [2] Intercept - which is ln(a) > > Hence you need to exp(ln(a)) to get a. Thank you Uwe for all your posts and answers! That really clarified the matter. Though I wish libreoffice didn't have to let us do more computations which it could've done for us. Maybe a note in the help page as to how to use the results from the power reg would be helpful :-D
(In reply to Ming Hua from comment #11) > (In reply to Uwe Auer from comment #9) > > Just to make clear, how I calculate and understand the regression: > > > > f(x) = a * x^b > > -> ln (f(x)) = ln (a*x^b) = ln(a) + ln(x^b) = ln(a) + b*ln(x) > > > > Now you use function() LINEST on natural logarithm of both axis an you get > > two coefficients: > > > > [1] Slope - which is b > > [2] Intercept - which is ln(a) > > > > Hence you need to exp(ln(a)) to get a. > Thanks for the explanation. > > But all of these are not easily deductible from either the Data > Statistics > > Regression... dialog or the help page [1]. Some improvement on > documentation is probably in order. > > 1. > https://help.libreoffice.org/latest/en-US/text/scalc/01/ > statistics_regression.html Yes. Would be good to put more details in help page / documentation.
So let us change it to documentation.
Created attachment 172638 [details] Power Regresssion Methods - Comparision of Coefficients I'm not sure whether this is really a pure documentation issue only, but indeed a problem of the "Data -> Statistics -> Regression -> Type: Power Regression" functionality (at least from a users perspective). My comments are based on how I do such power regression, which essentially is based on using: [1] =SLOPE(LN(range_of_Y_values);LN(range_of_X_values)) [2] =EXP(INTERCEPT(LN(range_of_Y_values);LN(range_of_X_values))) according to function LN(f(x)) = LN(a) + b * LN(x) (derived from f(x)=ax^b) As a user I'd expect a "Power" regression analysis to yield the coefficients a and b directly (i.e. without any further exponentiation) just like a power trend line in a chart shows both coefficients correctly (without applying any further math) setting option "[x] Show Equation". So it least when comparing [1] Coefficients in trend line equation in chart --with-- [2] Coefficients in "Data -> Statistics -> Regression -> Type: Power Regression" (see cells J20 and J21 in the sample file) there is a discrepancy. See also attachment.