In Calc, put "0.0042" into A1, "0.0043" into A2, and "=A2-A1" into A3; format A3 to show 19 decimal places.

Expected result: 0.0001000000000000000
Actual result: 0.0000999999999999994

The problem is that the entered string in A1 has been converted to the double 0.0042000000000000006, which is not the nearest double-precision value for decimal "0.0042": 0.0041999999999999997 is. The conversion of the entered string to double happens in ImpSvNumberInputScan::StringToDouble, which handles the integer and fractional parts of the string separately and then scales the fractional part by a power of ten. As shown, this does not guarantee the closest possible representation, resulting in immediately observable inaccuracies in subsequent calculations. A robust conversion is needed here.
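For illustration, a minimal standalone sketch - assuming IEEE 754 doubles and a correctly rounded C-library strtod (as in glibc), and not the actual ImpSvNumberInputScan code - of how a split integer/fraction conversion can land on the wrong neighbouring double:

    #include <cstdio>
    #include <cstdlib>

    int main()
    {
        // Naive two-part conversion of "0.0042": integer part 0, fraction
        // digits "0042" read as the integer 42, then scaled by 10^-4.
        // Two roundings happen (the 1e-4 literal, then the multiply), so
        // the product lands on the wrong neighbouring double.
        double naive = 0.0 + 42.0 * 1e-4;

        // Reference: converting the whole string in one step.
        double reference = std::strtod("0.0042", nullptr);

        std::printf("naive:     %.19f\n", naive);     // 0.0042000000000000006
        std::printf("reference: %.19f\n", reference); // 0.0041999999999999997
        return 0;
    }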
Can't std::from_chars (locale-independent, taking an expected-format specification, and designed for fastest-possible throughput) be used here instead, given that the string is expected to be positive and to consist only of plain decimal digits and a dot? (ImpSvNumberInputScan::StringToDouble would need to take/convert the string to 8-bit then.)
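A sketch of how that could look; parsePlainDecimal is a hypothetical helper name, and this requires a standard library implementing C++17 floating-point from_chars, which is exactly what the baseline turned out to lack (see below):

    #include <charconv>
    #include <cstdio>
    #include <string>
    #include <system_error>

    // Hypothetical helper: parse a positive "digits.digits" string,
    // already converted to 8-bit as suggested above.
    double parsePlainDecimal(const std::string& rStr)
    {
        double fVal = 0.0;
        // chars_format::fixed: plain decimal only - no exponent, no hex.
        auto [ptr, ec] = std::from_chars(rStr.data(),
                                         rStr.data() + rStr.size(),
                                         fVal, std::chars_format::fixed);
        (void)ptr;
        if (ec != std::errc())
            return 0.0; // error handling elided for the sketch
        return fVal;
    }

    int main()
    {
        std::printf("%.19f\n", parsePlainDecimal("0.0042")); // nearest double
        return 0;
    }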
https://gerrit.libreoffice.org/c/core/+/88854
Unfortunately, std::from_chars is still not available in our baseline, so the change from comment 2 is abandoned. Eike: do you have some idea for the time being?
Reproduced in Version: 7.0.0.0.alpha0+
Build ID: 28d844a589e52abfe62dc66b888e78665221ba28
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3;
Locale: en-US (en_US.UTF-8); UI-Language: en-US
Calc: threaded

Also reproduced in LibreOffice 3.3.0 OOO330m19 (Build:6), tag libreoffice-3.3.0.4
*** Bug 130728 has been marked as a duplicate of this bug. ***
Good to see my test case has made it to the bug tracker. It would be better if the fraction of the decimal after the "." were represented as a uint64; then we wouldn't have this problem. A spreadsheet is for accounts and invoices, and accounts work in base 10. The default should be as such. Shoot me down if you don't agree! I'm sure another spreadsheet will come along and do this eventually.
Would using rtl::math::stringToDouble() there yield better results? If applicable...
(In reply to Eike Rathke from comment #7)
> Would using rtl::math::stringToDouble() there yield better results? If
> applicable...

Unfortunately no: https://ci.libreoffice.org/job/gerrit_linux_clang_dbgutil/53635/console shows that the unit test fails with it - see the "Value must be the nearest representation of decimal 0.0042" message in the log. The method uses the same technique, so no wonder :-( Accurate and performant conversion of string to double is *not* a trivial task, unfortunately.
[1] does not use floating-point operations. It is MIT-licensed, so it might be used. It is not guaranteed to give the same result as the library functions, but my understanding is that the difference should only appear in cases where the value is strictly in the middle between two possible representations, i.e. both are closest possible. Might be worth a try, used as o3tl::from_chars with the same signature, to allow an easy transition to the std:: version when it becomes available?

[1] https://github.com/grzegorz-kraszewski/stringtofloat
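A sketch of what such an o3tl::from_chars shim could look like; the names and the strtod-based interim body are assumptions for illustration, not the actual change - the point is only that mirroring the std:: signature makes the eventual switch trivial:

    #include <cerrno>
    #include <cstdlib>
    #include <string>
    #include <system_error>

    namespace o3tl
    {
        struct from_chars_result
        {
            const char* ptr;
            std::errc ec;
        };

        // Interim body only; a correctly rounded backend (e.g. the
        // MIT-licensed stringtofloat above) would replace it.
        inline from_chars_result from_chars(const char* first,
                                            const char* last, double& value)
        {
            std::string aTmp(first, last); // strtod needs NUL termination
            char* pEnd = nullptr;
            errno = 0;
            double fVal = std::strtod(aTmp.c_str(), &pEnd);
            if (pEnd == aTmp.c_str())
                return { first, std::errc::invalid_argument };
            if (errno == ERANGE)
                return { first + (pEnd - aTmp.c_str()),
                         std::errc::result_out_of_range };
            value = fVal;
            return { first + (pEnd - aTmp.c_str()), std::errc() };
        }
    }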
https://github.com/achan001/dtoa-fast
Hello. Is the actual bug going to be fixed?

I'm not sure if you are familiar with Google Docs Spreadsheet; the bug is not reproducible there (tried today, 21 Feb 2020). It is easy to reproduce in LibreOffice Calc - just type: =0.0043 - 0.0042

There should be no necessity for any user to apply formatting to the field; it should show the result appropriately: 0.0001

If you use any Casio calculator, or even Windows Calc or GNOME Calculator, it displays the result correctly. Calculators have defaulted to displaying up to 8 accurate decimal places since the 1970s, as I recall.
Oh we love bug comments that just chime in to tell a second time that something is wrong. Maybe we should limit Calc's precision to 8 decimal digits then.
(In reply to Eike Rathke from comment #12)
> Oh we love bug comments that just chime in to tell a second time that
> something is wrong. Maybe we should limit Calc's precision to 8 decimal
> digits then.

I guess I still love bug comments that chime in with a clear understanding of the issue. Would you have this argument with an accountant? An auditor? An actuary?

The difficulty you face is that, like me, you're a comp-sci or software engineering graduate, so to IT people like us it makes sense... It's a shame, as we're designing software for users, not comp-sci grads. It simply doesn't make logical sense to any user of your product, a spreadsheet; it's not meant to be like Fortran, it's meant to be for users! Casio, or any other accounting software or calculator, does not display a number like 0.0001000 to 8 decimal digits.

There are many ways this bug could be fixed, e.g. this cell A3, which is empty and on a blank new sheet has no formatting, could simply be formatted to display the result of the sum as 0.0001.

Speaking frankly, I give up. Google Calc doesn't suffer this bug.
(In reply to Jonny Grant from comment #13)
> Speaking frankly, I give up. Google Calc doesn't suffer this bug.

Are you sure?

https://webapps.stackexchange.com/questions/80125/why-does-google-spreadsheets-says-zero-is-not-equals-zero
https://webapps.stackexchange.com/questions/75073/google-sheets-rounds-numbers-where-not-wanted
(In reply to Oliver Brinzing from comment #14)
> (In reply to Jonny Grant from comment #13)
> > Speaking frankly, I give up. Google Calc doesn't suffer this bug.
>
> Are you sure?

Jonny Grant mixes two problems here in his crusade to achieve Casio precision. The real problem solved in this specific bug is the incorrect conversion between a string containing a decimal number and its closest IEEE 754 double-precision representation. This problem is indeed LibreOffice-specific. But Jonny Grant tries to expand the scope of the bug, which is the wrong thing to do; and then, indeed, the statement that GSheets or Excel or whatever do things differently is wrong.
(In reply to Mike Kaganski from comment #15)
> Jonny Grant mixes two problems here in his crusade to achieve Casio
> precision. The real problem solved in this specific bug is the incorrect
> conversion between a string containing a decimal number and its closest
> IEEE 754 double-precision representation. This problem is indeed
> LibreOffice-specific. But Jonny Grant tries to expand the scope of the
> bug, which is the wrong thing to do; and then, indeed, the statement that
> GSheets or Excel or whatever do things differently is wrong.

There may be a misunderstanding. I don't seek Casio. I seek for my accountant, my broker, my auditor to be able to see 0.0001 without suffering because LibreOffice isn't catering for users. It's a very simple problem, and a simple fix?

Mike, many thanks for filing my bug before I could. It's good you filed it. But the accurate bug description is on the one I filed: https://bugs.documentfoundation.org/show_bug.cgi?id=130728

The screenshot is attached to that one; I'll attach it here for completeness. Easy steps to reproduce: just open any new Google Docs Spreadsheet and type the cells as per the screengrab.
Created attachment 158147 [details]
Google Docs doesn't have this problem

Easy steps to reproduce. Use the latest Chrome browser.
(In reply to Jonny Grant from comment #16)
> Mike, many thanks for filing my bug before I could. It's good you filed it.
> But the accurate bug description is on the one I filed.

I didn't file "your bug". I filed the real problem that I found when I checked your test case; and *this* bug is not about what you would like it to be, but about what I filed it for.

If you want to reduce the maximum precision of the default number format, please file *that* request separately; and in that request, this specific test case (0.0043-0.0042) is *not* suitable, because it demonstrates a *different* problem (the one handled here).
(In reply to Mike Kaganski from comment #18)
> I didn't file "your bug". I filed the real problem that I found when I
> checked your test case; and *this* bug is not about what you would like it
> to be, but about what I filed it for.
>
> If you want to reduce the maximum precision of the default number format,
> please file *that* request separately; and in that request, this specific
> test case (0.0043-0.0042) is *not* suitable, because it demonstrates a
> *different* problem (the one handled here).

What do you think of the screenshot?
(In reply to Jonny Grant from comment #19)
> What do you think of the screenshot?

As I mentioned in the AskLibO discussion, when a person claims some level of expertise, the expectations of the other side of the discussion change. I hoped that, being "a comp sci", you could understand the essence of the problem here, where LibreOffice incorrectly converts the entered text "0.042" into the corresponding closest IEEE 754 double-precision number. LibreOffice has a bug here, which has been identified, and a solution is needed and is being developed. This has been described right in comment 0, so hopefully you can understand that.

Taking that into account: if we fix this bug, i.e. convert "0.042" into the *closest* binary representation (which is still not exactly equal to 0.042: see [1]), then LibreOffice will automatically start showing the *same* result in this example. Which doesn't mean that it will also show expected results everywhere: e.g., see the example I put into the FAQ [2].

So here comes *your* PoV (which *we* understand clearly): that either Calc must calculate that precisely (which requires software calculations with infinite precision, and is a clear WONTFIX), or that it at least should not show that to the user automatically (unless explicitly told so), i.e. to limit automatic number format precision - which is what I recommended you to file separately.

I *hope* that now you realize that we do understand you, and it's time for you to make an effort to understand *us*.

[1] https://www.binaryconvert.com/result_double.html?decimal=048046048052050
[2] https://wiki.documentfoundation.org/Faq/Calc/Accuracy
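The point is easy to check directly; a couple of lines assuming IEEE 754 doubles and a compliant compiler, which converts literals to the nearest double:

    #include <cstdio>

    int main()
    {
        // Neither decimal has an exact binary representation; printing
        // with enough digits exposes the nearest doubles actually stored.
        std::printf("%.19f\n", 0.0042); // 0.0041999999999999997
        std::printf("%.21f\n", 0.042);  // approx. 0.042000000000000000003
        return 0;
    }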
Thank you for your reply, Mike. I appreciate your links.

For users, personally I don't feel infinite precision is needed - just accurate decimal precision; up to 8 decimal places would be ideal for most accountants etc. we work with. If coded in a soft decimal lib, that's only 27 bits; but, I appreciate, you'd have to code this in software, as compilers just generate code for double precision.

Could I ask, does your precision patch also resolve this test case? Anyway, it's good to have a second test case.

30.9
30.1
0.799999999999997
(In reply to Jonny Grant from comment #21)
> just accurate decimal precision; up to 8 decimal places would be ideal for
> most accountants etc. we work with. If coded in a soft decimal lib, that's
> only 27 bits; but, I appreciate, you'd have to code this in software, as
> compilers just generate code for double precision.

Oh. LibreOffice uses IEEE 754 double precision, which has a 53-bit mantissa, and so no less than 15 decimal digits of precision. That is much higher than the 8 decimal digits that you look for. You seem to not understand precision; those ...99999... are confusing you.
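The standard library states the same numbers directly, for anyone who wants to verify the "53 bits / 15 digits" claim:

    #include <cstdio>
    #include <limits>

    int main()
    {
        // 53 mantissa bits; any decimal with up to 15 significant digits
        // survives a round-trip through double; 17 digits are needed to
        // identify a particular double uniquely.
        std::printf("mantissa bits: %d\n",
                    std::numeric_limits<double>::digits);        // 53
        std::printf("digits10:      %d\n",
                    std::numeric_limits<double>::digits10);      // 15
        std::printf("max_digits10:  %d\n",
                    std::numeric_limits<double>::max_digits10);  // 17
        return 0;
    }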
(In reply to Jonny Grant from comment #21)
> Could I ask, does your precision patch also resolve this test case? Anyway,
> it's good to have a second test case.
>
> 30.9
> 30.1
> 0.799999999999997

No, this one will continue to give the same result as Google Sheets and MS Excel (if they are set to show enough decimals).
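To make the distinction concrete: even with a perfect string-to-double conversion, both operands below are already the nearest doubles to 30.9 and 30.1, and the subtraction merely exposes their inherited error - a sketch assuming IEEE 754 doubles:

    #include <cstdio>

    int main()
    {
        // Both literals convert to the nearest doubles; their exact
        // difference is representable, so no new error is added by "-".
        double a = 30.9; // stored as 30.899999999999998578...
        double b = 30.1; // stored as 30.100000000000001421...
        std::printf("%.17g\n", a - b); // 0.79999999999999716,
                                       // i.e. 0.799999999999997 at 15 digits
        return 0;
    }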
Created attachment 158157 [details] google sheets doesn't have the bug
Created attachment 158160 [details]
Google Sheets indeed does not have this bug, because this is not a bug

Please stop spamming, and learn to read (I wrote that you need to set it to show enough digits); learn to use the tool (discover for yourself how to control the shown precision); learn what precision is (the equivalence of 0.50000... and 0.49999..., studied in intermediate school, is useful here); and finally, learn to use the bug tracker - and stop abusing this issue with unrelated messages. Thanks!
(In reply to Mike Kaganski from comment #23)
> No, this one will continue to give the same result as Google Sheets and MS
> Excel (if they are set to show enough decimals).

(In reply to Mike Kaganski from comment #25)
> Please stop spamming, and learn to read (I wrote that you need to set it to
> show enough digits); learn to use the tool (discover for yourself how to
> control the shown precision); learn what precision is (the equivalence of
> 0.50000... and 0.49999..., studied in intermediate school, is useful here);
> and finally, learn to use the bug tracker - and stop abusing this issue
> with unrelated messages. Thanks!

I can lead a horse to water, but I can't make it drink. If you force Google Docs to show the wrong number of decimal places, it's not surprising that the result shown will be the actual double-precision number stored in those bits, Mike. Of course we all know how to set the decimal places.

You're right that there is a bug about displaying an illogical and wrong number of decimal places in LibreOffice Calc: https://bugs.documentfoundation.org/show_bug.cgi?id=130728

It feels like this issue won't be fixed, and actually it feels like I've been wasting my time here the whole while. I don't see the need to comment further at this stage.
Mike Kaganski committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/f3e7004794eded346d98264d3061f4e4aa80ee0a tdf#130725: use strtod by David M. Gay to make sure we get the nearest It will be available in 7.0.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Mike Kaganski committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/1782810f886acd26db211d8fdd7ae8796d203c57 Related: tdf#130725: use strtod also in rtl::math::stringToDouble It will be available in 7.0.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
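A sketch of the behaviour the two commits establish, assuming the rtl::math::stringToDouble signature from rtl/math.hxx; this is an illustration (it only builds inside the LibreOffice tree), not the actual unit test from the commits:

    #include <rtl/math.hxx>
    #include <rtl/ustring.hxx>
    #include <cassert>

    void checkNearestConversion()
    {
        rtl_math_ConversionStatus eStatus;
        sal_Int32 nParsedEnd;
        double fVal = rtl::math::stringToDouble(
            OUString("0.0042"), '.', ',', &eStatus, &nParsedEnd);
        // With Gay's correctly rounded strtod underneath, this now matches
        // the compiler's own (nearest) conversion of the literal.
        assert(fVal == 0.0042);
    }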
hello @Mike Kaganski,

while digging into Calc's fp-math problems, I'd say your comment (and view?) has some weak points:

> see the example I put into the FAQ [2].

The '31000,99' sample there is less a problem of the weak precision with which values are represented as doubles, and more one of 'catastrophic cancellation' when subtracting two operands of similar magnitude, and of the lack of appropriate handling of that in LO Calc, C++, or wherever. Just filling the 'tail' (which is empty after bit shifting) with zeroes is insufficient. It is not possible to get a better representation of the operand values with doubles, but it *is* possible to get a better result for the calculation. It requires some effort, but it is manageable compared to the effort spent handling hundreds of display attributes or constructing gradients for backgrounds.

The same applies to the time example: the result is only as exact as the precision of its operands; neither Calc nor 'Weitz' - which calculates a somewhat different result - has a meaningful compensation for the cancellation, thus the result is useless at the increased precision requested by formatting to 0.000000 seconds. Similar things can and will occur in decimal: two pieces measured at 1,05 and 1,04 meters (which implies a precision of +/- 0,5 centimeters) may be equal, or may differ by up to two centimeters.

> So here comes *your* PoV (which *we* understand clearly): that either Calc
> must calculate that precisely (which requires software calculations with
> infinite precision, and is a clear WONTFIX),

The effort to get 16 or 17 correct 'significant digits' is much less than 'infinite precision'.

> or that it at least should not show that to the user automatically (unless
> explicitly told so), i.e. to limit automatic number format precision -
> which is what I recommended you to file separately.

To introduce an even bigger difference between the displayed result accessible to the user and the 'real value' used for downstream calculation (Calc already hides one to two digits from the user) will produce more 'user irritation' - not a good idea; the opposite would fit. Or take the real 0,0001 (0,00010000000000000000) as the result of '=0,0043 - 0,0042', and not the wrong 0,00010000000000000026 that Calc actually produces - check with '=RAWSUBTRACT(0,0043-0,0042;0,0001)'. That's what I'd recommend; all other ways will end in irritations at other places.

0,00010000000000000026 is not a problem of the limited precision available to represent 0,0001 as a double; it's a problem of how to calculate 0,0043 - 0,0042, and it is worse for e.g. 4.000000000000003E-7 - 4.000000000000002E-7: a relative error of nearly 6 percent. If you want to hide such errors from the user, you are close to reducing each value and calculation result to '1'. 1,0000000000000007 - 1,0000000000000004 calculated as 2,22044604925031E-16 instead of 3E-16 is nearly 26% off; thus in decimal you can't even trust the first digit.
(In reply to b. from comment #29)
> The '31000,99' sample there is less a problem of the weak precision with
> which values are represented as doubles, and more one of 'catastrophic
> cancellation' when subtracting two operands of similar magnitude, and of
> the lack of appropriate handling of that in LO Calc, C++, or wherever.
> Just filling the 'tail' (which is empty after bit shifting) with zeroes is
> insufficient. It is not possible to get a better representation of the
> operand values with doubles, but it *is* possible to get a better result
> for the calculation. It requires some effort, but it is manageable
> compared to the effort spent handling hundreds of display attributes or
> constructing gradients for backgrounds.

I am tired of people who declare things they don't understand (and try to substitute others' points, creating the impression that they raise valid arguments). Just stop spamming already, and send us the patch.
@Mike Kaganski's comment is not 'off topic'; for the cause itself it's good when even two stubborn opposites like him and me discuss things with each other. He asks for a patch; I'm not a coder, therefore I can only provide ideas or recipes. Here is one of plenty of possible solutions (to propose the best one, I'd need Calc to calculate with fewer inaccuracies in e.g. rounding):

Calculating '=1,0000000000000007 - 1,0000000000000004' is 'difficult' with fp-math and spreadsheets; the result '2,22044604925031E-16' is useless. Calculating '=1 - 1 + 0,0000000000000007 - 0,0000000000000004' instead is some more effort, but the result '3.0000000000000004E-16' is much better; rounded to 16 digits it's almost 'decimal accurate'. Such things don't require 'infinite ... in software ...' but only a change of thinking, from 'fp-math is inaccurate, it has to stay that way' to 'ok, there are some difficulties, what is the best way to deal with them'.

The above sample is solvable with a simple formula in a sheet (a C++ transcription follows below):

B2: '1,0000000000000007 (text here, and 'VALUE()' in the formula, to get 17 digits visible)
B3: '1,0000000000000004
B4: '=INT(VALUE(B2))-INT(VALUE(B3))+ROUND(RAWSUBTRACT(VALUE(B2);INT(VALUE(B2)));16)-ROUND(RAWSUBTRACT(VALUE(B3);INT(VALUE(B3)));16)'
-> 3,00000000000000000000E-16

If you'd like the formula more general:

'=INT(VALUE(B2))-INT(VALUE(B3))+ROUND(RAWSUBTRACT(VALUE(B2);INT(VALUE(B2)));16-LOG10(B2))-ROUND(RAWSUBTRACT(VALUE(B3);INT(VALUE(B3)));16-LOG10(B3))'

This solves plenty of samples discussed in forums and 'ask' sites, such as '=1234,12 - 1234' or '=32000,12 - 31000,99'.
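For concreteness, a literal C++ transcription of that recipe - shown only to make the proposal testable, not as a validated or endorsed fix (note that the 16-digit rounding via multiply/divide is itself a floating-point approximation of decimal ROUND):

    #include <cmath>
    #include <cstdio>

    // Mirror of INT(x) and ROUND(RAWSUBTRACT(x; INT(x)); 16) from the
    // sheet formula above.
    double subtractCompensated(double a, double b)
    {
        auto round16 = [](double x) { return std::round(x * 1e16) / 1e16; };
        double ai = std::floor(a), bi = std::floor(b);
        return (ai - bi) + round16(a - ai) - round16(b - bi);
    }

    int main()
    {
        // Plain subtraction yields 2.2204460492503131e-16 here; the
        // compensated version lands near the decimal-expected 3e-16.
        std::printf("%.17g\n",
                    subtractCompensated(1.0000000000000007,
                                        1.0000000000000004));
        return 0;
    }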
(In reply to b. from comment #31)

1. I myself marked my response as off-topic, so as not to clutter the issue with further unrelated comments. Of course, that doesn't stop you from keeping doing just that here, as you do in multiple other places. OK, I will repeat the labeling again. I reserve the right to keep hoping that someday you might finally realize that your random spam in random places is not welcome.

2. It doesn't matter whether you are a programmer or not. You are wasting everyone's time doing what you do. It's *your* duty to provide at least a proof of correctness of your ideas if they don't immediately meet support from developers. So no, I'm not going to invent and run systematic tests to see if some random stuff you suggest has flaws or actually improves things.

You are using logical fallacies right and left, like declaring a necessity of "a change of thinking ... to 'ok, there are some difficulties, what is the best way to deal with them'", which falsely paints your opponents as lacking a positive attitude, despite them (me and Eike specifically) having implemented multiple improvements in exactly that area, and Eike having pointed to other things like the *actually proven correct* Kahan algorithm. But your phrases may create the perception in a random reader that you suggest some brilliant revolutionary concept but fall victim to inert developers. Nice way of arguing, bravo!

If you believe your "ideas or recipes" have any value, then just do the thorough tests. And that is not "I take 5 or 50 chosen problematic cases and see them improved", but "I get 4 billion floats (32-bit IEEE 754 floating-point numbers) and simulate the calculations according to my formula on them, and compare to the raw calculations as Calc does them; then I use 64-bit doubles to perform the same calculations, and compare the error from the two calculations above, to prove that my idea actually gives a better result over the whole range of tested numbers. That took some days/weeks of programming, and hours of runtime." That would be a reasonable effort, giving some value to checking your testing procedure and then checking the viability of the result. Not "I give you something that you can see is useless and wrong, and repeatedly expect you to spend your time thinking it over and either testing it or describing why it is wrong, which I will happily ignore again."
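A sketch of that testing harness, under stated assumptions: one fixed second operand instead of the full 2^64 cross product, and the 'candidate' left as plain subtraction for the reader to replace with the formula under test:

    #include <cmath>
    #include <cstdint>
    #include <cstdio>
    #include <cstring>

    static float bitsToFloat(uint32_t u)
    {
        float f;
        std::memcpy(&f, &u, sizeof f); // portable bit reinterpretation
        return f;
    }

    int main()
    {
        const float b = 0.1f; // one fixed operand; a full test samples both
        uint64_t plainWins = 0, candidateWins = 0;
        for (uint64_t u = 0; u <= 0xFFFFFFFFull; ++u) // all float patterns
        {
            float a = bitsToFloat(static_cast<uint32_t>(u));
            if (!std::isfinite(a))
                continue;
            // Double serves as the higher-precision reference for float math.
            double ref = static_cast<double>(a) - static_cast<double>(b);
            double errPlain = std::fabs(static_cast<double>(a - b) - ref);
            float cand = a - b; // <- replace with the candidate formula
            double errCand = std::fabs(static_cast<double>(cand) - ref);
            if (errPlain < errCand)
                ++plainWins;
            else if (errCand < errPlain)
                ++candidateWins;
        }
        std::printf("plain better: %llu, candidate better: %llu\n",
                    static_cast<unsigned long long>(plainWins),
                    static_cast<unsigned long long>(candidateWins));
        return 0;
    }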
Created attachment 169318 [details]
testing_subtraction_fails_with_random_values

hello @Mike Kaganski,

I did the testing - see the attached sheet; green: good, yellow: ok, red: bad. Not 4 billion numbers - that wouldn't have been enough, as I would have to work through the square of those possibilities for subtractions alone - but a significant sample: about a million problematic cases, randomly assembled. ~96 percent fail in Calc (all tasks except the binarily trivial ones); 100 percent correct with some 'auxiliary calculating'. I think that's meaningful.

Preliminary note: I appreciate your talents as a programmer and your knowledge of LO Calc very much and would like to praise you in the highest terms; to the same extent, your condescending, arrogant attitude towards others gets on my nerves, to such an extent that I would like to shake you over the head until it falls off. Please excuse me for trying relatively persistently to contribute to the correctness of Calc's calculation results; they are still a little too weak after 35 years of development.

Point of contention 1: Calc calculates errors; we agree on that. They are caused by the floats and doubles - I think we also agree. On which aspect of them, we can already argue: you think more 'conversion and representation', and thus unavoidable; I see mostly 'cancellation' as *the* problem, and one can work on that.

Point 2: are better results possible? You say 'rather not'; I say 'yes, see attachment'. With a formula within the IEEE scope (15 significant digits): 100 percent correct results; Calc 'standard': about 96 percent fails. Outside that scope (sheet2, 16 digits) the improvements are not yet fully worked out.

Point 3: you say 'such things, effort, rounding and performance impact? NO!'; I say that once this is neatly settled, it saves so many stupid questions that the effort is worth it, and the performance of the hardware has increased so much that we can afford some computing time for mathematically correct results (we also waste some on background shading and calculating the positioning of comments).

Point 4: performance: what do we tell users stumbling over fp inaccuracies? Do appropriate rounding. And what does that do to performance? Kill ...

Point 5: performance II: I have something in the quiver / in development that is much better than these rounding crutches, i.e. something that achieves the same but is much faster; but while working on it I keep stumbling over things like rounding errors, 4-bit truncation, and inaccurate comparisons that hinder the elaboration and verification, so it's not yet ready.

Besides: IEEE 754 defines calculations with 15-significant-digit figures? Then your sample '=0.0043 - 0.0042' means '=0.00430000000000000 - 0.00420000000000000', and for that a result of 0.0000999999999999994 is correct once you round it to the appropriate number of digits and replace the deviating double representation with that of 0.00010000000000000.

Besides: I spotted another irritation: '=4/46' is displayed as 0,0869565217391304 with 16 decimal digits, but as 0,086956521739131 (!) with 15 decimal digits - wrong rounding? Do you know how that comes about? A new bug?

regards, b.
I've reported many easy-to-reproduce issues; I don't recall any being fixed. The reality is that we are probably all software engineers, but we use Calc for accounts, bookkeeping etc. But Calc doesn't work well for that purpose, because it doesn't calculate decimal places correctly and round appropriately (because it uses floating point).

Usually these kinds of issues trigger either a fork or a fresh start with new spreadsheet software - the latter feels more likely; it's not complicated to create spreadsheet software. Just consider how LLVM was spawned, and even the Linux kernel; it's all scratching an itch. So hopefully a competitor will arrive, with better features and design architecture. Unfortunately Gnumeric suffers from similar issues - already tried that. Google Sheets works well though.

I don't think this reply or others should be marked as spam. Please don't do this to users who are contributing; thank you for your understanding.
(In reply to Jonny Grant from comment #34)

It doesn't matter whether someone is contributing or not. It does matter whether what they are doing is correct or not. Musings in an issue are off-topic. When I do something unrelated, I mark my own "contributions" appropriately (as with this reply). I don't mark the spam in comment 34 as spam, though, because of the explicit request; I just mark it no-value.
(In reply to Mike Kaganski from comment #35)
> It doesn't matter whether someone is contributing or not. It does matter
> whether what they are doing is correct or not. Musings in an issue are
> off-topic. When I do something unrelated, I mark my own "contributions"
> appropriately (as with this reply). I don't mark the spam in comment 34 as
> spam, though, because of the explicit request; I just mark it no-value.

Feels like passive-aggressive behaviour.
Fwiw, this bug was about the conversion from string to double, which was fixed with the commits mentioned. All later comments are completely off-topic and hijack this bug for unrelated discussion. This helps nothing and no one. Au contraire: repeating those things over and over again in several unrelated places, and interspersing information all over the place so that one can't even follow things, is annoying. Please stop it. Thank you.

Jonny's issue is likely bug 138920, about the double-to-string conversion, on which I've been working for a while now, trying to get around the quirks, quirk by quirk.