Bug 37923 - Improve Calc precision when subtracting large integers to parity with Excel
Summary: Improve Calc precision when subtracting large integers to parity with Excel
Status: CLOSED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 3.3.2 release
Hardware: x86 (IA32) Linux (All)
Importance: lowest enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Duplicates: 39293
Depends on:
Blocks:
 
Reported: 2011-06-04 07:57 UTC by Christopher M. Penalver
Modified: 2020-09-28 23:00 UTC (History)
7 users

See Also:
Crash report or crash signature:


Attachments
openoffice.summation.bug.ods (9.41 KB, application/vnd.oasis.opendocument.spreadsheet)
2011-06-04 07:57 UTC, Christopher M. Penalver
Details

Description Christopher M. Penalver 2011-06-04 07:57:28 UTC
Created attachment 47525 [details]
openoffice.summation.bug.ods

Downstream bug may be found at:
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/340051

1) lsb_release -rd
Description: Ubuntu 11.04
Release: 11.04

2) apt-cache policy libreoffice-calc
libreoffice-calc:
  Installed: 1:3.3.2-1ubuntu5
  Candidate: 1:3.3.2-1ubuntu5
  Version table:
 *** 1:3.3.2-1ubuntu5 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty-updates/main i386 Packages
        100 /var/lib/dpkg/status
     1:3.3.2-1ubuntu4 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty/main i386 Packages

3) What is expected to happen in LibreOffice Calc via the Terminal:

cd ~/Desktop && wget https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/340051/+attachment/486207/+files/openoffice.summation.bug.ods && localc openoffice.summation.bug.ods

is that cell A1 = 806515533049393, cell B1 = 1, cell C1 = 806515533049393, and cell D1 = A1-B1-C1 = -1.

4) What happens instead is cell D1 is 0.
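
For reference, the expected value can be checked against plain IEEE 754 double arithmetic. The Python sketch below is an illustration only, not Calc's code, and assumes the cell values are held as doubles. All three integers are below 2^53, so each is exactly representable and the raw subtraction is exact.

```python
# Illustration only: plain IEEE 754 double arithmetic, not Calc's implementation.
a1 = 806515533049393.0
b1 = 1.0
c1 = 806515533049393.0

# Integers below 2**53 are exactly representable as doubles.
assert a1 < 2**53

d1 = a1 - b1 - c1
print(d1)  # -1.0
```

Under that assumption, a displayed 0 would suggest some extra rounding or near-equality handling on top of the raw arithmetic, rather than a limit of the double format itself.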
Comment 1 Markus Mohrhard 2011-06-04 08:40:19 UTC
Hello Christopher,

this is not a bug but a mathematical problem. Subtracting two nearly identical numbers is an ill-conditioned problem: http://en.wikipedia.org/wiki/Condition_number

It's a problem that every program using floating-point numbers has, and it can't really be solved.
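
As a rough illustration of the ill-conditioning mentioned here, the relative condition number of a subtraction x - y can be taken as (|x| + |y|) / |x - y|. The Python sketch below evaluates it for the values from the description; the helper name `subtraction_condition` is made up for this example.

```python
from fractions import Fraction
import sys

def subtraction_condition(x, y):
    """Relative condition number of x - y: (|x| + |y|) / |x - y|."""
    x, y = Fraction(x), Fraction(y)
    return (abs(x) + abs(y)) / abs(x - y)

# (A1 - B1) and C1 from the description, as exact integers.
kappa = subtraction_condition(806515533049393 - 1, 806515533049393)
print(float(kappa))  # about 1.6e15

# Rough estimate: a relative input error of one unit roundoff (2**-53)
# could be amplified to about this relative error in the difference.
unit_roundoff = sys.float_info.epsilon / 2
amplified = float(kappa) * unit_roundoff
print(amplified)
```

In this particular report the inputs are exact integers, so no such input error is present to be amplified.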
Comment 2 Christopher M. Penalver 2011-06-04 09:47:02 UTC
(In reply to comment #1)
> Hello Christopher,
> 
> this is not a bug but a mathematical problem. Subtracting two nearly identical
> numbers is an ill-conditioned problem:
> http://en.wikipedia.org/wiki/Condition_number
> 
> It's a problem that every program using floating-point numbers has, and it
> can't really be solved.

Markus, thank you for quickly taking a look at this. It is agreed this issue is not an easy one to deal with, and more an issue of "how accurate is it?" versus "is it 100% accurate or not?". However, this was marked as a bug because both Excel and Gnumeric correctly yield -1. Looks like a great opportunity to improve LO! :)
Comment 3 Markus Mohrhard 2011-06-04 14:04:53 UTC
Hello Christopher,

Sure, it's annoying, but if you try with Excel, for example, the limit is just a bit higher. Try with 1126515533121300 and 1. I'm really surprised that we hit this problem a bit earlier, but I don't think it is something we should worry about.

There is no real possibility to resolve the general problem. So in my opinion this is annoying but we should not try to fix it.

The only solution would be to use exact numbers instead of floating-point numbers, but this would have an even bigger performance impact.
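
To make the "exact numbers" alternative concrete, here is a small Python sketch using the standard-library `fractions.Fraction` type. It only illustrates exact rational arithmetic in general; it says nothing about how Calc would implement it or what the performance cost would be there. The value 1.2 is an arbitrary non-integer picked for this example.

```python
from fractions import Fraction

a1 = Fraction(806515533049393)
c1 = Fraction(806515533049393)

# Integer case from the description: the result is exactly -1.
d1_integer = a1 - Fraction(1) - c1

# A decimal fraction stays exact too: "1.2" is parsed as 6/5.
d1_decimal = a1 - Fraction("1.2") - c1

print(d1_integer, d1_decimal)  # -1 -6/5
```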
Comment 4 Christopher M. Penalver 2011-06-04 23:10:51 UTC
(In reply to comment #3)
> Hello Christopher,
> 
> sure it's annoying but if you try with for example Excel the limit is just a
> bit higher. Try with 1126515533121300 and 1. I'm really surprised that we have
> this problem a bit earlier but I don't think it is something we should worry
> about.

Markus, thank you for your discussion on this issue. As I have been focused on upstreaming Ubuntu bugs, my standard for downstream triaging has been "What occurs in Excel in the same circumstance?" Viewing it this way, one's expectation for this issue is that LO, at least, matches the precision of Excel. Anything more precise than Excel, at best, is an ENHANCEMENT request.

> There is no real possibility to resolve the general problem. So in my opinion
> this is annoying but we should not try to fix it.
> 
> The only solution would be to use exact numbers instead of floating point
> numbers, but this will have an even bigger performance impact.

Regarding exact numbers vs. floating point, this is outside my area of expertise, so I defer to your and the community's position on if, how, and when to move forward on that.
Comment 5 wope 2011-06-18 14:17:19 UTC
@christopher: this is not a bug, as markus wrote

@markus: before Excel performs the calculation, it sorts the numbers from highest to lowest. This may be an enhancement.
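
Whether Excel really reorders operands is not verified anywhere in this report. Independently of that, the Python sketch below shows why evaluation order can matter for doubles. The values are chosen for illustration and are not taken from the attachment.

```python
import math

big = 1e16  # above 2**53, where adjacent doubles are 2 apart

# Adding the small term first loses it to rounding.
left_to_right = (big + 1.0) - big

# Cancelling the large terms first keeps it.
reordered = (big - big) + 1.0

# math.fsum tracks partial sums and returns the correctly rounded total.
exact_sum = math.fsum([big, 1.0, -big])

print(left_to_right, reordered, exact_sum)  # 0.0 1.0 1.0
```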
Comment 6 Christopher M. Penalver 2011-06-18 21:33:09 UTC
wope, as per my last comment, this is considered a bug for Excel calculation expectation compatibility.
Comment 7 Markus Mohrhard 2011-06-23 19:58:14 UTC
@wope: then you run into the same problem in other situations. There is no solution to this problem. It's still strange that it happens a bit earlier than in Excel, but that may be due to some conversions in UNO.

@christopher I still don't think that this is really a bug; we just behave a bit differently than Excel. I suppose there will always be some differences between Calc and Excel, and not all are bugs. I lowered the importance a bit, but you might try to convince Kohei or Rainer that it's more important.
I won't close it but I don't think that we should do anything here.
Comment 8 Owen Genat (retired) 2014-02-04 01:55:51 UTC
Summary edited for clarity.
Comment 9 Owen Genat (retired) 2014-02-04 01:56:31 UTC
*** Bug 39293 has been marked as a duplicate of this bug. ***
Comment 10 m.a.riosv 2014-05-09 23:31:26 UTC
*** Bug 78447 has been marked as a duplicate of this bug. ***
Comment 11 Joel Madero 2014-11-04 03:40:51 UTC
Never confirmed by QA team - moving to UNCONFIRMED.
Comment 12 Joel Madero 2014-11-06 00:44:05 UTC
*** Bug 67026 has been marked as a duplicate of this bug. ***
Comment 13 Robinson Tryon (qubit) 2014-11-11 22:31:50 UTC
(In reply to Markus Mohrhard from comment #7)
> @christopher I still don't think that this is really a bug, we just behave a
> bit different than excel. I suppose there will always be some differences
> between calc and excel and not all are bugs.

If we're not going to regard this as a bug, I think that loss of precision & different behavior from Excel => a good candidate for documentation.

*** This bug has been marked as a duplicate of bug 67026 ***
Comment 14 Christopher M. Penalver 2014-11-11 23:23:11 UTC
Robinson Tryon, thank you for your comment.

I'm fine with this remaining open as a lowest priority enhancement request, given the scope of this report is narrow and well defined, in that it's not "increase precision on everything", but an Excel calculation parity request in a well defined case as noted in the Description, and downstream.

As well, this not being a high priority issue is both expected and understandable. However, being able to seamlessly exchange documents between colleagues using Excel, without the hassle of having to WORKAROUND compatibility issues would be fair here. Especially in light of how compatibility is a focus of the project -> http://www.libreoffice.org/discover/libreoffice/ :
"LibreOffice is compatible with many document formats such as Microsoft® Word, Excel..."

Unfortunately, NEW is not an available Status, and UNCONFIRMED doesn't apply as it's more than confirmed up and downstream, so it's REOPENED.

Despite this, I've placed myself as the QA contact if you would have further questions on the scope of this report.

Thank you for your understanding.
Comment 15 Robinson Tryon (qubit) 2014-11-12 00:10:19 UTC
(In reply to Christopher M. Penalver from comment #14)
> Unfortunately, NEW is not an available Status

NEW makes sense if it's possible/likely to get fixed. My guess is that most Calc devs are going to punt on it, at best.

> UNCONFIRMED doesn't apply

I don't like to see bugs sitting in UNCONFIRMED permanently, but I haven't seen the case made for this bug yet. It's unclear to me that (aside from docs) there is some code to write here.

> as it's more than confirmed up and downstream, so it's REOPENED.

REOPENED is a very specific state that covers bugs that have been patched/marked as FIXED by a dev, and then have been reopened because the fix didn't work or was incomplete. That's not the case here.

> I'm fine with this remaining open as a lowest priority enhancement request,
> given the scope of this report is narrow and well defined, in that it's not
> increase precision on everything, but an Excel calculation parity request in
> a well defined case as noted in the Description, and downstream.

As Markus noted, this is a pretty small component of compatibility. We're not talking about the difference between, say, 10k and 500k rows, we're talking about some nuances of floating-point math when operating on HUGE (or incredibly *small*) numbers.

> As well, this not being a high priority issue is both expected and
> understandable. However, being able to seamlessly exchange documents between
> colleagues using Excel, without the hassle of having to WORKAROUND
> compatibility issues would be fair here. Especially in light of how
> compatibility is a focus of the project ->
> http://www.libreoffice.org/discover/libreoffice/ :
> "LibreOffice is compatible with many document formats such as Microsoft®
> Word, Excel..."

LibreOffice and MS-Office will never be 100% perfectly compatible. Things like 'seamless' compatibility will be difficult when there are fonts that ship with MS-Office that we aren't legally allowed to distribute, let alone nuances in the implementation of floating-point arithmetic ;-)

If you're looking for precision regarding big or small number arithmetic like this, I think that something like Sage or Octave would be an appropriate software package to use.

> Despite this, I've placed myself as the QA contact if you would have further
> questions on the scope of this report.

Making yourself the QA Contact is great, but I remain skeptical that any dev will pick this up. In fact, Markus explained the problem pretty well:

----
The only solution would be to use exact numbers instead of floating point numbers, but this will have an even bigger performance impact.
----

It's pretty clear to me that Markus doesn't think that we should trade performance for increased decimal place precision, and I'm inclined to trust his judgment. I still think this bug should be marked as a dupe and become a documented limitation of LO.

Question: What's your goal here? Do you want to match the behavior of Excel, or just increase the precision of these calculations? The former seems more doable than the latter.

(please change status back to 'UNCONFIRMED' after you leave a comment)

Status -> NEEDINFO
Comment 16 Christopher M. Penalver 2014-11-12 00:17:40 UTC
Robinson Tryon, thanks for your comment.

The scope here is precision parity with Excel for this one situation.

BTW, now the report allows one to set the Status to NEW.
Comment 17 Eike Rathke 2018-10-02 12:01:59 UTC
However, since at least 5.3 the result in both D1 and D4 is now -1.
Comment 18 b. 2020-09-28 23:00:30 UTC
Fine so far, but now try 1,2 instead of 1 in B1 and see '0' as the result.
This happens for all values from -2,8125 to +2,8125 except the integers -2, -1, 0, 1, 2.

This may result from different handling of integers (16 significant digits) versus floats (rounded / truncated to 15 significant decimal digits).
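
For comparison, this is what plain IEEE 754 double arithmetic gives for the B1 = 1,2 case (written 1.2 in Python). The sketch is not Calc's code and does not model any rounding Calc may apply; it only shows the spacing of doubles near A1 and the raw result. `math.ulp` needs Python 3.9 or later.

```python
import math

a1 = 806515533049393.0
b1 = 1.2
c1 = 806515533049393.0

# Spacing between adjacent doubles near A1 (A1 lies between 2**49 and 2**50).
spacing = math.ulp(a1)
print(spacing)  # 0.125

# A1 - B1 is rounded to the nearest multiple of 0.125 before C1 is subtracted.
d1 = a1 - b1 - c1
print(d1)  # -1.25
```

So the raw double result is neither 0 nor exactly -1.2; a displayed 0 would have to come from additional handling on top of this arithmetic.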

That rounding is too aggressive; maybe it's not necessary anymore. @Mike Kaganski recently fixed an error in the string evaluation (tdf#130725); maybe the rounding was an attempt to hide such errors? If so, integers and floats could now be handled with equal accuracy, which would avoid many irritations ...

Or shall I file an enhancement request for that: 'integers and floating point values (decimal fractions) should be displayed and evaluated with equal accuracy to avoid irritation'?

Or does anybody know a reason why rounding floats to less precision than possible is necessary?

besides ... ex$el does better ... :-(