37923 – Improve Calc precision when subtracting large integers to parity with Excel

Summary: Improve Calc precision when subtracting large integers to parity with Excel

Status:	RESOLVED WORKSFORME

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	Calc
Version: (earliest affected)	3.3.2 release
Hardware:	x86 (IA32) Linux (All)

Importance:	lowest enhancement
Assignee:	Not Assigned

URL:
Whiteboard:
Keywords:

Duplicates (1):	39293
Depends on:
Blocks:

Reported:	2011-06-04 07:57 UTC by Chris Peñalver
Modified:	2022-03-01 21:00 UTC
CC List:	8 users

See Also:	67026 https://launchpad.net/bugs/340051
Crash report or crash signature:

Attachments
openoffice.summation.bug.ods (9.41 KB, application/vnd.oasis.opendocument.spreadsheet) 2011-06-04 07:57 UTC, Chris Peñalver

Description Chris Peñalver 2011-06-04 07:57:28 UTC

Created attachment 47525
openoffice.summation.bug.ods

Downstream bug may be found at:
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/340051

1) lsb_release -rd
Description: Ubuntu 11.04
Release: 11.04

2) apt-cache policy libreoffice-calc
libreoffice-calc:
  Installed: 1:3.3.2-1ubuntu5
  Candidate: 1:3.3.2-1ubuntu5
  Version table:
 *** 1:3.3.2-1ubuntu5 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty-updates/main i386 Packages
        100 /var/lib/dpkg/status
     1:3.3.2-1ubuntu4 0
        500 http://us.archive.ubuntu.com/ubuntu/ natty/main i386 Packages

3) What is expected to happen in LibreOffice Calc via the Terminal:

cd ~/Desktop && wget https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/340051/+attachment/486207/+files/openoffice.summation.bug.ods && localc openoffice.summation.bug.ods

is that cell A1=806515533049393, cell B1=1, cell C1=806515533049393, and cell D1=A1-B1-C1=-1.

4) What happens instead is cell D1 is 0.
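
For reference, the arithmetic in step 3 can be checked outside Calc. The sketch below uses Python floats, which are IEEE 754 binary64 values on common platforms. It only illustrates the expected value and does not reproduce anything Calc does internally. All three operands are integers below 2**53, so each one is exactly representable.

```python
# Values from the report: cells A1, B1 and C1.
a1 = 806515533049393.0
b1 = 1.0
c1 = 806515533049393.0

# Every operand and every intermediate result is an integer below 2**53,
# so binary64 arithmetic introduces no rounding in this expression.
d1 = a1 - b1 - c1
print(d1)  # -1.0
```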

Comment 1 Markus Mohrhard 2011-06-04 08:40:19 UTC

Hello Christopher,

this is not a bug but a mathematical problem. Subtracting two nearly identical numbers is an ill-conditioned problem: http://en.wikipedia.org/wiki/Condition_number

It's a problem that every program using floating point numbers has, and it can't really be solved.
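
As a rough illustration of the conditioning argument (a sketch, not part of the original comment): the relative condition number of the subtraction x - y is (|x| + |y|) / |x - y|, which grows very large when x and y nearly cancel. The helper name `subtraction_condition` is made up for this example.

```python
from fractions import Fraction

def subtraction_condition(x: int, y: int) -> Fraction:
    """Relative condition number of x - y: (|x| + |y|) / |x - y|."""
    return Fraction(abs(x) + abs(y), abs(x - y))

# Operands of the second subtraction in D1 = A1 - B1 - C1.
kappa = subtraction_condition(806515533049392, 806515533049393)
print(float(kappa))  # about 1.6e15
```

A condition number of this size means that relative errors already present in the inputs can be amplified by roughly that factor in the result. It does not by itself imply that a binary64 subtraction of these two particular integers is inexact.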

Comment 2 Chris Peñalver 2011-06-04 09:47:02 UTC

(In reply to comment #1)
> Hello Christopher,
> 
> this is not a bug but a mathematical problem. Subtracting two nearly identical
> numbers is an ill-conditioned
> problem: http://en.wikipedia.org/wiki/Condition_number
> 
> It's a problem that every program using floating point numbers has and that can't
> really be solved.

Markus, thank you for quickly taking a look at this. It is agreed this issue is not an easy one to deal with, and more an issue of "how accurate is it?" versus "is it 100% accurate or not?". However, this was marked as a bug because both Excel and Gnumeric correctly yield -1. Looks like a great opportunity to improve LO! :)

Comment 3 Markus Mohrhard 2011-06-04 14:04:53 UTC

Hello Christopher,

Sure, it's annoying, but if you try it in Excel, for example, the limit is just a bit higher. Try with 1126515533121300 and 1. I'm really surprised that we hit this problem a bit earlier, but I don't think it is something we should worry about.

There is no real possibility to resolve the general problem. So in my opinion this is annoying but we should not try to fix it.

The only solution would be to use exact numbers instead of floating point numbers, but this would have an even bigger performance impact.
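
To make the trade-off concrete, here is a minimal sketch contrasting exact rational arithmetic with binary64 arithmetic. Python's `fractions.Fraction` stands in for "exact numbers" purely as an illustration; nothing in this thread says Calc would use that representation. The fractional operand 1.2 is chosen for the example.

```python
from fractions import Fraction

big = 806515533049393

# Exact rational arithmetic: no rounding at any step.
exact = Fraction(big) + Fraction("1.2") - Fraction(big)
print(exact)  # 6/5

# Binary64 arithmetic: big + 1.2 is rounded to the nearest multiple of 0.125,
# so the fractional part cannot be recovered after subtracting big again.
approx = float(big) + 1.2 - float(big)
print(approx)  # 1.25
```

Exact types avoid the rounding but allocate and normalise arbitrary-size integers on every operation, which is the performance cost the comment refers to.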

Comment 4 Chris Peñalver 2011-06-04 23:10:51 UTC

(In reply to comment #3)
> Hello Christopher,
> 
> sure it's annoying but if you try with for example Excel the limit is just a
> bit higher. Try with 1126515533121300 and 1. I'm really surprised that we have
> this problem a bit earlier but I don't think it is something we should worry
> about.

Markus, thank you for your discussion on this issue. As I have been focused on upstreaming Ubuntu bugs, my standard for downstream triaging has been "What occurs in Excel in the same circumstance?" Viewed this way, one's expectation for this issue is that LO at least matches the precision of Excel. Anything more precise than Excel is, at best, an ENHANCEMENT request.

> There is no real possibility to resolve the general problem. So in my opinion
> this is annoying but we should not try to fix it.
> 
> The only solution would be to use exact numbers instead of floating point
> numbers, but this will have even bigger performance impact.

Regarding exact numbers vs. floating point, this is outside my area of expertise, so I defer to your and the community's position as to if, how, and when to move forward on that.

Comment 5 wope 2011-06-18 14:17:19 UTC

@christopher: this is not a bug, as Markus wrote.

@markus: before Excel performs the calculation, it sorts the numbers from highest to lowest. This may be an enhancement.
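
Whether Excel actually reorders operands is not confirmed anywhere in this thread. The sketch below only shows the general point that evaluation order can change a floating-point result. The values 1e16 and 1.0 are chosen for illustration and are not taken from the report.

```python
import math

big, small = 1e16, 1.0

left_to_right = (big + small) - big  # small is lost when added to big
reordered = (big - big) + small      # cancelling first preserves small

print(left_to_right)  # 0.0
print(reordered)      # 1.0

# math.fsum tracks the lost low-order bits and returns the correctly rounded sum.
print(math.fsum([big, small, -big]))  # 1.0
```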

Comment 6 Chris Peñalver 2011-06-18 21:33:09 UTC

wope, as per my last comment, this is considered a bug for Excel calculation expectation compatibility.

Comment 7 Markus Mohrhard 2011-06-23 19:58:14 UTC

@wope: then you run into the same problem in other situations. There is no solution to this problem. It's still strange that it happens a bit earlier than in Excel, but that may be due to some conversions in UNO.

@christopher: I still don't think that this is really a bug; we just behave a bit differently than Excel. I suppose there will always be some differences between Calc and Excel, and not all of them are bugs. I lowered the importance a bit, but you might try to convince Kohei or Rainer that it's more important.
I won't close it but I don't think that we should do anything here.

Comment 8 Owen Genat (retired) 2014-02-04 01:55:51 UTC

Summary edited for clarity.

Comment 9 Owen Genat (retired) 2014-02-04 01:56:31 UTC

*** Bug 39293 has been marked as a duplicate of this bug. ***

Comment 10 m_a_riosv 2014-05-09 23:31:26 UTC

*** Bug 78447 has been marked as a duplicate of this bug. ***

Comment 11 Joel Madero 2014-11-04 03:40:51 UTC

Never confirmed by QA team - moving to UNCONFIRMED.

Comment 12 Joel Madero 2014-11-06 00:44:05 UTC

*** Bug 67026 has been marked as a duplicate of this bug. ***

Comment 13 Robinson Tryon (qubit) 2014-11-11 22:31:50 UTC

(In reply to Markus Mohrhard from comment #7)
> @christopher I still don't think that this is really a bug, we just behave a
> bit different than excel. I suppose there will always be some differences
> between calc and excel and not all are bugs.

If we're not going to regard this as a bug, I think that loss of precision & different behavior from Excel => a good candidate for documentation.

*** This bug has been marked as a duplicate of bug 67026 ***

Comment 14 Chris Peñalver 2014-11-11 23:23:11 UTC

Robinson Tryon, thank you for your comment.

I'm fine with this remaining open as a lowest priority enhancement request, given that the scope of this report is narrow and well defined: it's not about increasing precision on everything, but an Excel calculation parity request in a well defined case, as noted in the Description and downstream.

As well, this not being a high priority issue is both expected and understandable. However, being able to seamlessly exchange documents between colleagues using Excel, without the hassle of having to WORKAROUND compatibility issues, would be fair here. Especially in light of how compatibility is a focus of the project -> http://www.libreoffice.org/discover/libreoffice/ :
"LibreOffice is compatible with many document formats such as Microsoft® Word, Excel..."

Unfortunately, NEW is not an available Status, and UNCONFIRMED doesn't apply as it's more than confirmed up and downstream, so it's REOPENED.

Despite this, I've placed myself as the QA contact if you would have further questions on the scope of this report.

Thank you for your understanding.

Comment 15 Robinson Tryon (qubit) 2014-11-12 00:10:19 UTC

(In reply to Christopher M. Penalver from comment #14)
> Unfortunately, NEW is not an available Status

NEW makes sense if it's possible/likely to get fixed. My guess is that most Calc devs are going to punt on it, at best.

> UNCONFIRMED doesn't apply

I don't like to see bugs sitting in UNCONFIRMED permanently, but I haven't seen the case made for this bug yet. It's unclear to me that (aside from docs) there is some code to write here.

> as it's more than confirmed up and downstream, so it's REOPENED.

REOPENED is a very specific state that covers bugs that have been patched/marked as FIXED by a dev, and then have been reopened because the fix didn't work or was incomplete. That's not the case here.

> I'm fine with this remaining open as a lowest priority enhancement request,
> given the scope of this report is narrow and well defined, in that it's not
> increase precision on everything, but an Excel calculation parity request in
> a well defined case as noted in the Description, and downstream.

As Markus noted, this is a pretty small component of compatibility. We're not talking about the difference between, say, 10k and 500k rows, we're talking about some nuances of floating-point math when operating on HUGE (or incredibly *small*) numbers.

> As well, this not being a high priority issue is both expected and
> understandable. However, being able to seamlessly exchange documents between
> colleagues using Excel, without the hassle of having to WORKAROUND
> compatibility issues would be fair here. Especially in light of how
> compatibility is a focus of the project ->
> http://www.libreoffice.org/discover/libreoffice/ :
> "LibreOffice is compatible with many document formats such as Microsoft®
> Word, Excel..."

LibreOffice and MS-Office will never be 100% perfectly compatible. Things like 'seamless' compatibility will be difficult when there are fonts that ship with MS-Office that we aren't legally allowed to distribute, let alone nuances in the implementation of floating-point arithmetic ;-)

If you're looking for precision regarding big or small number arithmetic like this, I think that something like Sage or Octave would be an appropriate software package to use.

> Despite this, I've placed myself as the QA contact if you would have further
> questions on the scope of this report.

Making yourself the QA Contact is great, but I remain skeptical that any dev will pick this up. In fact, Markus explained the problem pretty well:

----
The only solution would be to use exact numbers instead of floating point numbers, but this will have even bigger performance impact.
----

It's pretty clear to me that Markus doesn't think that we should trade performance for increased decimal place precision, and I'm inclined to trust his judgment. I still think this bug should be marked as a dupe and become a documented limitation of LO.

Question: What's your goal here? Do you want to match the behavior of Excel, or just increase the precision of these calculations? The former seems more doable than the latter.

(please change status back to 'UNCONFIRMED' after you leave a comment)

Status -> NEEDINFO

Comment 16 Chris Peñalver 2014-11-12 00:17:40 UTC

Robinson Tryon, thanks for your comment.

The scope here is precision parity with Excel for this one situation.

BTW, now the report allows one to set the Status to NEW.

Comment 17 Eike Rathke (retired, only occasionally showing up) 2018-10-02 12:01:59 UTC

However, since at least version 5.3, the result in both D1 and D4 is -1.

Comment 18 b. 2020-09-28 23:00:30 UTC

fine so far, but now try 1,2 instead of 1 in B1 and see '0' as result, 
happens for all values from -2,8125 to +2,8125 except the integers -2, -1, 0, 1, 2
  
may result from different handling of integers (16 significant digits) against floats (rounded / truncated to 15 significant decimal digits), 

that rounding is too hard, maybe it's not necessary anymore, @Mike Kaganski recently fixed an error in the string evaluation, tdf#130725, maybe rounding was a try to hide such errors? then you could now handle integers and floats with equal accuracy and avoid many irritations ...  

or shall i file an enhancement request for that, 'integers and floating point values (decimal fractions) should be displayed and evaluated with equal accuracy to avoid irritation', 

or does anybody know a reason why rounding of floats to less precision than possible is necessary?

besides ... ex$el does better ... :-(
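
The pattern described in this comment (small non-integer differences collapsing to 0) is what one would see if nearly equal operands were treated as equal before subtracting. The sketch below is a hypothetical model, not code from Calc: the function name `snap_sub` and the relative tolerance 2**-48 are assumptions, chosen because they give a cutoff near 2.8 for this operand.

```python
TOLERANCE = 2.0 ** -48  # assumed relative tolerance, not taken from Calc's source

def snap_sub(a: float, b: float) -> float:
    """Hypothetical subtraction that returns 0 for nearly equal operands."""
    diff = a - b
    if abs(diff) < abs(a) * TOLERANCE:
        return 0.0
    return diff

big = 806515533049393.0
print(big * TOLERANCE)           # about 2.865, the cutoff for this operand
print((big + 1.2) - big)         # 1.25 with plain binary64 subtraction
print(snap_sub(big + 1.2, big))  # 0.0 under the assumed rule
print(snap_sub(big + 3.0, big))  # 3.0, above the cutoff
```

This simple model would also return 0 for the integer differences 1 and 2, which the comment reports as handled correctly, so any real rule would have to treat integer operands separately.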

Comment 19 b. 2021-04-17 18:54:23 UTC

'=806515533049393 +x -806515533049393' with sheet cells or formula with distinct values: 

Excel 2010:       x=0,00 .. 0,93 -> 0, 
                  x=0,94 .. ...  -> values with 0,125 precision, 

calc 7.2.0.0.a0+: x=0,00 .. 0,93 -> 0, 
                  x=0,94 .. 1,06 -> 1, 
            ***   x=1,07 .. 1,93 -> 0,   ***
                  x=1,94 .. 2,06 -> 2, 
            ***   x=2,07 .. 2,81 -> 0,   ***  
                  x=2,82 .. ...  -> values with 0,125 precision, 

besides, excel's precision is already weak, and i'd not say calc is 'to parity' ... reopening,
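
The 0,125 granularity in the table above is consistent with the spacing of binary64 values at this magnitude. A quick check, assuming IEEE 754 doubles; the helper `spacing` is written for this example and is not part of any library:

```python
import math

def spacing(x: float) -> float:
    """Gap between adjacent binary64 values near a normal, nonzero x."""
    _, exponent = math.frexp(x)    # x = m * 2**exponent with 0.5 <= |m| < 1
    return 2.0 ** (exponent - 53)  # binary64 has a 53-bit significand

big = 806515533049393.0
print(spacing(big))        # 0.125
print((big + 1.07) - big)  # 1.125, the nearest representable step
```

Results that are not multiples of 0.125, such as the 0 rows in the table, therefore cannot come from plain binary64 rounding alone and point to additional handling in the application.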

Comment 20 Roman Kuznetsov 2022-03-01 21:00:50 UTC

in 

Version: 7.4.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 2c6f5ebfe69c3031af7b4903637226bd8a3dde62
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: CL Jumbo

I see -1 result as it should be in the file

Closed as WFM now

b., please retest *this problem* yourself, and if you still have calculation problems of your own, please file *a separate* report.