Bug 121396 - FORECAST.ETS.PI functions produces different results with identical data.
Summary: FORECAST.ETS.PI functions produces different results with identical data.
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
5.2.0.4 release
Hardware: All All
: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Calc-Function Calculate
  Show dependency treegraph
 
Reported: 2018-11-13 15:58 UTC by mikehaag314
Modified: 2020-03-27 11:51 UTC (History)
6 users (show)

See Also:
Crash report or crash signature: ["ScETSForecastCalculation::GetETSPredictionIntervals(boost::intrusive_ptr<ScMatrix> const &,boost::intrusive_ptr<ScMatrix> const &,double)"]


Attachments
Calc spreadsheet showing test case. (9.77 KB, application/vnd.oasis.opendocument.spreadsheet)
2018-11-13 16:02 UTC, mikehaag314
Details
document used for testing (15.74 KB, application/vnd.oasis.opendocument.spreadsheet)
2018-11-14 16:59 UTC, Winfried Donkers (retired)
Details

Note You need to log in before you can comment on or make changes to this bug.
Description mikehaag314 2018-11-13 15:58:16 UTC
Description:
Using common timeline dates argument, a common target date argument, and several rows of identical values arguments, several cells with the FORECAST.ETS.PI.ADD() function produce different results. See Steps to Reproduce and the Actual Results.

Steps to Reproduce:
1. Open a new Calc spreadsheet.
2. In cells A1:E1 enter dates 05/10/18 through 05/14/18. These will be the Timeline argument.
3. In cell F1, enter the date 05/20/18. This will be the Target argument.
4. In cells A2:E2, enter 45, 10, 20, 55, and 65. These will be the Values argument.
5. I don't know if it's related to the problem, but on my computer, the Function Wizard crashes Calc if I try to use it for the FORECAST.ETS.PI.ADD function. So, in cell F2, manually enter the function: =FORECAST.ETS.PI.ADD($F$1,A2:E2,$A$1:$E$1,0.95,1,1)
6. Select cells A2:F10, then Control-D to fill several rows.
7. Inspect function output cells F2:F10. The common timeline ($A$1:$E$1) cells and common target cell ($F$1) should be the same, while the values cells should have been automatically adjusted. As a check, cell F10 should be: =FORECAST.ETS.PI.ADD($F$1,A10:E10,$A$1:$E$1,0.95,1,1)
8. Observe that the output is different in cells F2:F10.

Actual Results:
The Steps to Reproduce produce the following output, copied and pasted from my spreadsheet. Dates 05/10/18 through 05/14/18 (Cells $A$1:$E$1) is the common timeline argument, date 05/20/18 ($F$1) is the common target argument, the values arguments are cells A2:E10. and the outputs are cells F2:F10:

05/10/18	05/11/18	05/12/18	05/13/18	05/14/18	05/20/18
45	10	20	55	65	61.2437482284469
45	10	20	55	65	57.4837267613985
45	10	20	55	65	56.8951291510018
45	10	20	55	65	60.7092724189456
45	10	20	55	65	56.5521954423025
45	10	20	55	65	51.0461378492329
45	10	20	55	65	59.3321872557923
45	10	20	55	65	54.4217599183968
45	10	20	55	65	53.2928985999517



Expected Results:
Output in Cells F2:F10 should be identical.


Reproducible: Always


User Profile Reset: No



Additional Info:
Version: 5.2.7.2
Build ID: 1:5.2.7-1+deb9u4
CPU Threads: 4; OS Version: Linux 4.9; UI Render: default; VCL: gtk3; 
Locale: en-US (en_US.UTF-8); Calc: group

[Information automatically included from LibreOffice]
Locale: en-US
Module: SpreadsheetDocument
[Information guessed from browser]
OS: Linux (All)
OS is 64bit: yes
Comment 1 mikehaag314 2018-11-13 16:02:02 UTC
Created attachment 146596 [details]
Calc spreadsheet showing test case.
Comment 2 m.a.riosv 2018-11-13 16:40:16 UTC
I think it's a prediction base on historical data, which I guess it's affected for time when it's calculated.

But in any case I have had a crash doing continuous hard recalc on the sample file.

Please Winfred can you take a look.
Comment 3 Winfried Donkers (retired) 2018-11-13 18:07:31 UTC
The description of the behaviour is correct. The behaviour is intentional.
The Prediction Interval (or better accuracy indicator) is calculated with random numbers and multiple (I think it is 2000) samples.

I fully agree that the current behaviour is not perfect and can be misleading.
(On the other hand it underlines that the accuracy indicator is an approximation.)

I can also inform you that I have been working on a calculation that produces consistent results that are a good approximation of the accuracy ever since the FORECAST.ETS functions were introduced early in 2016. However, it is a very difficult task. I have consulted several mathematicians at a number of universities, but so far no practical (any solution must be suitable to be implemented without severe penalties with respect to performance of Calc) solution has come up.
Currently, I am conferring with a Statistics Section of the Mathematics Faculty that I hadn't contacted before. I keep on looking and hope to succeed eventually.

I have changed the importance of this bug report to enhancement as there is no bug. I will use this bug report/enhancement request to submit improvements to the FORECAST.ETS.PI functions (the Additive and Multiplicative share the same principle).
Comment 4 m.a.riosv 2018-11-13 23:09:39 UTC
Hi Winfried, thanks for the quick answer, there is no any comment about the crash, is it a consequence in relation with your comments?.
Comment 5 Winfried Donkers (retired) 2018-11-14 15:27:17 UTC
(In reply to m.a.riosv from comment #4)
> Hi Winfried, thanks for the quick answer, there is no any comment about the
> crash, is it a consequence in relation with your comments?.

I missed the bit about crashes.
I don't use the function wizard, so have no experience with FORECAST.ETS functions in the wizard. I will investigate.
Comment 6 m.a.riosv 2018-11-14 15:34:49 UTC
Sorry in my case it was not with the wizard but maintaining hard recalc [Ctrl+Shift+F9] pushed.
Comment 7 Xisco Faulí 2018-11-14 16:12:01 UTC
(In reply to m.a.riosv from comment #6)
> Sorry in my case it was not with the wizard but maintaining hard recalc
> [Ctrl+Shift+F9] pushed.

Hi Miguel Angel,
Would you mind reporting the crash in a different issue?
Comment 8 m.a.riosv 2018-11-14 16:55:15 UTC
Ok I'll try later.
Comment 9 Winfried Donkers (retired) 2018-11-14 16:59:36 UTC
Created attachment 146625 [details]
document used for testing

I have not yet been able to reproduce a crash with the attached document, neither with direct (re)calculation (single cells and groups of cells) nor with the function wizard. The attached document is an extended version of the original attachment.
I tried 2 computers, 1 with current master, 1 with version 6.0.5.2. Both computers have openSUSE Leap 15.
I will try tomorrow on a Windows computer.

A note on the document in the attachment: with a data set of 5 samples and a prediction that is 6 steps away from the latest sample (daily values from 10 to 14 May and a prediction for 20 May) the results are way out of reality from a statistical point of view. Also, the data set does not contain at least 2 complete periods, which is needed for ETS. This should not influence the stability of the calculations, it's just that the results are useless.
Comment 10 m.a.riosv 2018-11-14 21:59:24 UTC
I'm not able to reproduce again the crash. Maybe it was in a specific conditions.l

But the crash report link for when it happened it's on top-right of this bug.
Comment 11 Winfried Donkers (retired) 2018-11-15 08:26:44 UTC
(In reply to m.a.riosv from comment #10)
> I'm not able to reproduce again the crash. Maybe it was in a specific
> conditions.l
> 
> But the crash report link for when it happened it's on top-right of this bug.

I have looked at the crash report, but when I can't reproduce the crash, that doesn't help.
I can't reproduce on Windows either.

I will check the source code if all possible values of the random numbers are handled correctly (e.g. no DIV/0 possible), but otherwise I think it is best to handle a crash separate from this bug report (this bug report being the unsatisfactory changing results).
Comment 12 Winfried Donkers (retired) 2018-11-22 11:10:13 UTC
FYI : I created a temporary unit test for FORECAST.ETS.XXX.PI and ran the function more than 630000 times, without a single unexpected result.
I ran the test on a Linux machine with current master, I am not able to run it on a Windows machine.

I will further concentrate purely on finding a way to calculate constant and reliable results for the FORECAST.ETS.XXX.PI-functions
Comment 13 b. 2019-12-28 04:06:46 UTC
as randomness in forcast function is intentional, and as it works that way in actual versions and as nobody is working on it shall we set it to wfm?
Comment 14 Xisco Faulí 2020-03-27 11:51:04 UTC
Hi Miguel,
I think the crash you reproduced in the past it fixed in bug 131380