Bug 109096 - GEOMEAN gives Err:502 when an argument is zero
Summary: GEOMEAN gives Err:502 when an argument is zero
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium normal
Assignee: Winfried Donkers
URL:
Whiteboard: target:6.0.0
Keywords:
Depends on:
Blocks:
 
Reported: 2017-07-13 11:19 UTC by Martin Drozdik
Modified: 2017-09-14 05:31 UTC (History)
2 users (show)

See Also:
Crash report or crash signature:


Attachments
Minimal working example (7.53 KB, application/vnd.oasis.opendocument.spreadsheet)
2017-07-13 11:19 UTC, Martin Drozdik
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Drozdik 2017-07-13 11:19:26 UTC
Created attachment 134611 [details]
Minimal working example

Dear developers,

When computing the geometric mean of numbers containing zero, the answer should be also a zero. Now the argument checking procedure is too strict and any zero in the arguments causes an Err:502.

While there is some discussion on whether zero arguments should be allowed:

https://math.stackexchange.com/questions/91443/geometric-mean-of-a-dataset-containing-0s/91445

I think they definitely should. This would be in accordance with the Wikipedia definition of geometric mean and also it would make things more consistent. One can always detect the presence of a zero in arguments, because this is the only situation in which GEOMEAN should produce a zero.

Thank you for your support!
Comment 1 Julien Nabet 2017-07-24 17:44:23 UTC
I suppose since Excel doesn't accept <= 0 values (see https://support.office.com/en-us/article/GEOMEAN-function-db1ac48d-25a5-40a0-ab83-0b38980e40d5), it won't be too on LO for compatibility.

Winfried: thought you might be interested in this one.
Comment 2 Winfried Donkers 2017-07-25 07:24:27 UTC
(In reply to Julien Nabet from comment #1)
> I suppose since Excel doesn't accept <= 0 values (see
> https://support.office.com/en-us/article/GEOMEAN-function-db1ac48d-25a5-40a0-
> ab83-0b38980e40d5), it won't be too on LO for compatibility.
> 
> Winfried: thought you might be interested in this one.

Well, this requires some mathematical explanation.
The geometric mean is defined as 
  ROOTn( a1*a2*...*an )
This definition clearly allows 0 as value for any a.
Unfortunately, on computers the product may well rise above the maximum value that can be stored, particularly with large values and a large number of values.

There as also another definition for geometric mean, which does not have this overflow problem: 
  EXP( ( ln(a1)+ln(a2)+...+ln(an) ) / n )  (with all values > 0)
or 
  -1^(m/n) * EXP( ( ln(|a1|)+ln(|a2|)+...+ln(|an|) ) / n )  (with m values < 0)
This definition is generally used in computer applications, but has the disadvantage that it cannot accept any value of 0.

However, when looking at the first definition, it is clear that with any value being 0 the geometric mean will be 0, regardless of n and other values.
This means that when computing the second algorithm, a result of 0 can be returned at the moment a value of 0 is read.

reference: https://en.wikipedia.org/wiki/Geometric_mean

GEOMEAN is defined in ODFF1.2 without constraints, so changing the behaviour as described directly above will comply with ODFF. The interoperability with Excel is reduced in so far that Calc accepts more than Excel, with Calc being mathematically correct.

It'll be easy to fix.
At the same time I will fix another issue (not yet reported): negative values are not accepted yet.
Comment 3 Winfried Donkers 2017-07-26 05:05:38 UTC
(In reply to Winfried Donkers from comment #2)

My reply in comment #2 was too fast and is not entirely correct:

[...]
> The geometric mean is defined as 
>   ROOTn( a1*a2*...*an )
> This definition clearly allows 0 as value for any a.
Although some data sets containing negative values do produce a result, AFAIK there is no geometric meaning of the result.

[...]
>   -1^(m/n) * EXP( ( ln(|a1|)+ln(|a2|)+...+ln(|an|) ) / n )  (with m values <
> 0)
This definition is not correct. (m/n) is between 0 and 1 and -1^k is only defined for integers.

[...]
> At the same time I will fix another issue (not yet reported): negative
> values are not accepted yet.
I won't given the above.

The fix is easy, but ODFF1.2 needs to be amended too as the current specification in ODFF {EXP( ( ln(a1)+ln(a2)+...+ln(an) ) / n )} can't handle arguments with value 0.
I will submit the fix once the ODFF amendment request has been submitted.
Comment 4 Winfried Donkers 2017-09-08 14:29:01 UTC
(In reply to Winfried Donkers from comment #3)
> 
> The fix is easy, but ODFF1.2 needs to be amended too as the current
> specification in ODFF {EXP( ( ln(a1)+ln(a2)+...+ln(an) ) / n )} can't handle
> arguments with value 0.
> I will submit the fix once the ODFF amendment request has been submitted.

It must have been a bad couple of days when I entered comment #2 and comment #3...

ODFF1.2 mentions ROOTn( a1*a2*...*an ) with a Є N. There is no need at all to amend ODFF1.2.

It's just the dilemma of fully adhering to ODFF (and the mathematic definition) for GEOMEAN or opting for maximum interoperability. Personally I tend to adhering to ODFF, especially as Excel's behaviour can be interpreted as a (minor) bug.

Doing some last tests now on the fix before submitting.
Comment 5 Commit Notification 2017-09-13 16:43:20 UTC
Winfried Donkers committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=65ea925d173db0f319a8ca78855587d089e5270d

tdf#109096 Allow 0 as argument value(s) to GEOMEAN()

It will be available in 6.0.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.