Bug 168673 - COUNTIFS() returns incorrect count with decimal numbers when matching on last item of area
Summary: COUNTIFS() returns incorrect count with decimal numbers when matching on last...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
25.8.1.1 release
Hardware: All All
: medium normal
Assignee: Mike Kaganski
URL:
Whiteboard: target:26.2.0
Keywords:
Depends on:
Blocks:
 
Reported: 2025-10-03 06:21 UTC by Dominik Stadler
Modified: 2025-10-09 19:25 UTC (History)
1 user (show)

See Also:
Crash report or crash signature:


Attachments
Spreadsheet which shows the problem (25.26 KB, application/vnd.oasis.opendocument.spreadsheet)
2025-10-03 06:21 UTC, Dominik Stadler
Details
Update reproducer showing which cases result in incorrect "0" results from COUNTIFS (25.57 KB, application/vnd.oasis.opendocument.spreadsheet)
2025-10-03 11:18 UTC, Dominik Stadler
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dominik Stadler 2025-10-03 06:21:35 UTC
Created attachment 203123 [details]
Spreadsheet which shows the problem

When building a spreadsheet which uses "COUNTIFS()" on areas with cells of type time or decimal numbers, COUNTIFS() misses to count an item in some cases.

See attached file for a reproducer with explanations.

In short, if at least the following are true, a matching cell is not counted correctly:
* The cells contain time-values or decimal numbers
* COUNTIFS() contains a "=" condition
* The condition points to the last cell of the area

Expected results: 1, as the compared cell itself should be matched
Seen results: 0

When using a slightly larger area, it works again. Having the item as last in the area is required for the issue to be visible.



Tested on the following three versions:


Version: 24.8.7.2 (X86_64) / LibreOffice Community
Build ID: 480(Build:2)
CPU threads: 8; OS: Linux 6.8; UI render: default; VCL: x11
Locale: de-AT (C); UI: en-US
Ubuntu package version: 4:24.8.7-0ubuntu0.24.10.1~bpo24.04.1
Calc: threaded

and 

Version: 25.2.3.2 (X86_64) / LibreOffice Community
Build ID: 520(Build:2)
CPU threads: 2; OS: Linux 6.12; UI render: default; VCL: x11
Locale: de-AT (de_AT.UTF-8); UI: en-US
Debian package version: 4:25.2.3-2+deb13u2
Calc: threaded

and 

Version: 25.8.1.1 (X86_64) / LibreOffice Community
Build ID: 580(Build:1)
CPU threads: 2; OS: Linux 6.12; UI render: default; VCL: x11
Locale: en-US (C); UI: en-US
Debian package version: 4:25.8.1-1~bpo13+1
Calc: threaded
Comment 1 Mike Kaganski 2025-10-03 07:11:51 UTC
In Version: 25.8.2.2 (X86_64)
Build ID: d401f2107ccab8f924a8e2df40f573aab7605b6f
CPU threads: 24; OS: Windows 11 X86_64 (build 26200); UI render: Skia/Vulkan; VCL: win
Locale: de-DE (en_US); UI: en-US
Calc: CL threaded

the "Using floating point fails" section works for me (shows 1, as expected).

For the "Using floating point with Time-Format fails" section, it indeed shows 0. However:

1. I can't understand what "Seems to happen if the item is at the end, with a slightly larger area, it works" means specifically. I tried to change the formula in C11 to =COUNTIFS($A$7:$A$12;"="&$A11), and it fails likewise. The formula in C12 is comparing an empty cell to an empty cell, so is unrelated.

2. It is not actually about "being at the end" - it seems specific to the concrete time value of '01:26,47' - which fails, wherever in the range it is (e.g., extending the formula in C11 to C7:C11, and copying the value of '01:26,47' to A9, makes C9 also show 0).

3. Even changing number format of A11 to Standard doesn't change the recalculated result of C11.

4. It seems to be very unusual. The following formula also shows FALSE:

=A11=VALUE(TEXT(A11;"Standard"))

and the following formula shows -2,31481481481481E-14:

=A11-VALUE(TEXT(A11;"Standard"))

Note that formatting A11 as Standard shows this number:

0,00100081018518519

and a formula like =TEXT(A11;"Standard") shows this text:

0,00100081018518519

both look identical, and - what's important - have 17 digits after decimal; so the reported difference of 2E-14 is completely unexpected.
Comment 2 Mike Kaganski 2025-10-03 07:22:53 UTC
(In reply to Mike Kaganski from comment #1)
> =A11=VALUE(TEXT(A11;"Standard"))

A note: the "Standard" string is unfortunately locale-specific; it may need to be "General", or whatever the format is called in the specific locale.
Comment 3 Mike Kaganski 2025-10-03 07:58:28 UTC
Hmm.

The problem with inequality itself seems NOTABUG, and related to the usual floating-point calculation imprecision [1]. Specifically:

1. Entering '01:26,47' time converts to 0,0010008101851851851;
2. In accordance to widely accepted policy of spreadsheets, when showing the value, only 15 significands are shown: 0,00100081018518519;
3. The same number of significands is used to convert to strings, so the '"="&A11' produces a Criterion "=0,00100081018518519";
4. When converting the "0,00100081018518519" to a number, the result is 0.0010008101851851901;
5. the comparison of these values in steps 1 and 4 produces false, even with our normal approximate equality (the difference is ~5E-18, which is greater than ~1E-18, which is the threshold used for approximate equality, calculated as the original values * 1E-15).

I am not sure if we can improve this specific corner case - the usual "for any floating-point calculations, use rounding" seems not fitting this well, but I don't see a robust way out here. Increase threshold to original value * 5E-15 ? I know for sure, that I tested 1E-14, and it was problematic; but I didn't test 5E-15.

===

However, the problem with subtraction, noticed in comment 1, is separate and different. The formula

=A11-VALUE(TEXT(A11;"Standard"))

indeed produces -2,31481481481481E-14; whereas the formula

=RAWSUBTRACT(A1;VALUE(TEXT(A1;"General")))

produces the expected -4,98732999343332E-18.

Needs more investigation.

[1] https://wiki.documentfoundation.org/Faq/Calc/Accuracy
Comment 4 Mike Kaganski 2025-10-03 08:16:56 UTC
The difference problem is likely related to the changes introduced in tdf#127334. Indeed, in v.7.6, the difference gives the expected ~5E-18, while in v.24.2, it is ~2E-14.

A note: to avoid the locale-specific nature of TEXT function, the testing formulas can be

=A1-VALUE(CONCATENATE(A1))
=VALUE(A1)-VALUE(CONCATENATE(A1))
=RAWSUBTRACT(A1;VALUE(CONCATENATE(A1)))
Comment 5 Mike Kaganski 2025-10-03 08:29:35 UTC
(In reply to Mike Kaganski from comment #3)
> 5. the comparison of these values in steps 1 and 4 produces false, even with
> our normal approximate equality (the difference is ~5E-18, which is greater
> than ~1E-18, which is the threshold used for approximate equality,
> calculated as the original values * 1E-15).
> 
> I am not sure if we can improve this specific corner case - the usual "for
> any floating-point calculations, use rounding" seems not fitting this well,
> but I don't see a robust way out here. Increase threshold to original value
> * 5E-15 ? I know for sure, that I tested 1E-14, and it was problematic; but
> I didn't test 5E-15.

FTR: the threshold is calculated using 0x1p-48 [1], which is ~3.5E-15; so it is greater than 1E-15, but less than 5E-15, which would be necessary for this case.

Excel 2016 gives the expected 0 for the difference.

[1] https://opengrok.libreoffice.org/xref/core/sal/rtl/math.cxx?r=c5c8e28959cc379452b15aa250ca7151a9e3d370&mo=23757&fi=604#604
Comment 6 Dominik Stadler 2025-10-03 11:18:02 UTC
Created attachment 203125 [details]
Update reproducer showing which cases result in incorrect "0" results from COUNTIFS

Thanks for the discussion, in the reproducer, the reproducer was slightly wrong, the updated content now comes out as

Time-Formatted:
0	=COUNTIFS($A$7:$A$11;"="&$A11)
0	'=COUNTIFS($A$7:$A$12;"="&$A11)

Decimal-Formatted:
0	'=COUNTIFS($A$17:$A$21;"="&$A21)
1	'=COUNTIFS($A$17:$A$22;"="&$A21)


Whereas processing in Excel results in "1" for all of them.


Having an "equals" fail due to precision is strange, are there two different ways of transforming the data to string before comparing?


I have updated the attachment with the updated content.
Comment 7 Mike Kaganski 2025-10-03 13:19:24 UTC
(In reply to Dominik Stadler from comment #6)
> Thanks for the discussion, in the reproducer, the reproducer was slightly
> wrong, the updated content now comes out as
> 
> Time-Formatted:
> 0	=COUNTIFS($A$7:$A$11;"="&$A11)
> 0	'=COUNTIFS($A$7:$A$12;"="&$A11)
> 
> Decimal-Formatted:
> 0	'=COUNTIFS($A$17:$A$21;"="&$A21)
> 1	'=COUNTIFS($A$17:$A$22;"="&$A21)
> 
> 
> Whereas processing in Excel results in "1" for all of them.

I still see
 Decimal-Formatted:
 1	'=COUNTIFS($A$17:$A$21;"="&$A21)
 1	'=COUNTIFS($A$17:$A$22;"="&$A21)

and that may be because when you copy the date from A11, and paste to A21, it keeps the exact value; but when it goes through save-and-reload, it is rounded to 0,00100081018518519.

> Having an "equals" fail due to precision is strange, are there two different
> ways of transforming the data to string before comparing?

Not quite understand the question. Which two ways are meant? We only discussed one way of transforming the data to string.
Comment 8 Dominik Stadler 2025-10-03 14:00:49 UTC
Yes, saving, editing or changing the precision sometimes affects this, although a simple safe-and-reload did not change it for me, adjusting precision of the cell-formats did. For time-format I did not find a way to make it work.

With "two ways" I meant that the cell is compared with itself as part of COUNTIFS() and somehow this comparison ends up being "not equal", which indicates that the value is handled in two different ways, otherwise "precision" or "conversion to text" couldn't have an impact.
Comment 9 Mike Kaganski 2025-10-03 14:19:13 UTC
(In reply to Dominik Stadler from comment #8)
> With "two ways" I meant that the cell is compared with itself as part of
> COUNTIFS() and somehow this comparison ends up being "not equal", which
> indicates that the value is handled in two different ways, otherwise
> "precision" or "conversion to text" couldn't have an impact.

When you use a comparison like '=A11=A10', the comparison doesn't use any conversion to strings. But when you use COUNTIFS with a Criterion like "="&A11, that criterion is a string - and A11 gets converted to string automatically there. The resulting criterion is a string "=0,00100081018518519", which has a string which is *not* a precise representation of the value in A11, but rounded to 15 significands. The *precise* criterion would be "=0,001000810185185185", but there is no method in Calc to convert the number to string with that precision. Then the criterion is evaluated, and the string "0,00100081018518519" gets converted to number again - to perform comparison to the actual cell value (numerical). Since cell A11 contains 0,001000810185185185, and the string "0,00100081018518519" gets converted to number 0,00100081018518519 - their comparison gives false.

Of course, in this specific case, it's unreasonable to use the string as criterion: you could rewrite your formula to be simply

  =COUNTIFS($A$7:$A$12; $A11)

and avoid the problem completely: if you only need an equality, it's better, more robust, and also faster to avoid that "convert to string, then convert back to number" dance. But if you would need a criterion like "<>"&$A11, that problem would appear again, so that is not a solution to this bug.
Comment 10 Dominik Stadler 2025-10-03 16:07:14 UTC
Thanks for the workaround of avoiding strings for equals-condition, it helps in one case. 

But in another place we are using "<" and as expected that has similar issues:

01:21,44	1	=COUNTIFS($A$1:$A$15;"<"&$A1)	
01:28,99	4	=COUNTIFS($A$1:$A$15;"<"&$A2)	
01:03,19	0	=COUNTIFS($A$1:$A$15;"<"&$A3)	
01:27,99	3	=COUNTIFS($A$1:$A$15;"<"&$A4)	<= Expecting „2“, but getting "3"!
01:26,47	3	=COUNTIFS($A$1:$A$15;"<"&$A5)
Comment 11 Mike Kaganski 2025-10-05 10:56:07 UTC
https://gerrit.libreoffice.org/c/core/+/191883
Comment 12 Commit Notification 2025-10-05 12:16:17 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/d32c664c3df302b7107a70858aa55b11601225cf

tdf#168673: approx equality threshold no less than 1/2 15th significand

It will be available in 26.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 13 Dominik Stadler 2025-10-07 19:16:09 UTC
I verified via the daily Linux-rpm_deb-x86_64@tb99-TDF2025-10-06_17.27.27/ that the fix works. After editing the formulas in the reproducer-file, the computation is now correct in all cases.

Thanks for the quick fix!
Comment 14 Albrecht Müller 2025-10-09 15:01:56 UTC
Unfortunately I cannot test the fix but I played a little bit with the example file using Version: 24.2.7.2 (X86_64) / LibreOffice Community
Build ID: ee3885777aa7032db5a9b65deec9457448a91162
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: de-DE
Calc: threaded

As a result I think I discovered three independent problems: Some strange behavior when loading the document, a problem with the default representation of numbers and one with the specification of the equality operator.

When I open Reproducer_LibreOffice_COUNTIF_Equals_FloatingPoint.ods then Cell C21 shows 0 and C22 shows 1. When I select Cell C21 and press F9 (Recalc) then the correct value 1 is shown. This behavior has nothing to do with any properties of the equality operator and this is probably not fixed.

For every floating point number (except a few special cases such as NaNs) there is a string representation that allows to reconstruct this number exactly. LibreOffice seems not use this kind of representation by default. As a result you may loose information whenever you use a round trip from floating point to string representation back to floating point. This may affect operations such as storing and reloading documents.

As the fix affects some very basic functionality I had a look into the specification of the equality operator. Section 6.4.7 Infix Operator "="  [1] says:

“Evaluators may approximate and test equality of two numeric values with an accuracy of the magnitude of the given values scaled by the number of available bits in the mantissa, ignoring some least significant bits and thus providing compensation for not exactly representable values.”

A few questions came into my mind:

1) Does the fix conform to this specification? The specification allows to ignore a few bits but does not mention the use of more complex algorithms. On the other hand the specification seems not to require that the “=”-Operator has the mathematical properties of an equivalence relation.

2) Does the fix guarantee the transitivity requirement of equivalence relations? Can you exclude the possibility that three floating point numbers A, B and C exist such that the implemented algorithm says A is equal to B, and B is equal to C but A is not equal to C? 

3) Is the reported behavior a bug? Maybe there are expectations that Calc as it is specified cannot meet. The specification allows to ignore some least significant bits. How many? That is left to the implementation. This means that the behavior of the equality operator and thus the behavior of many other functions that depend on it may differ between implementations. If you want reliable results you probably should not use the equality operator or any functionality that depends on it. But what is the alternative?

4) What happens when you have two floating point numbers A and B that differ only in the bits that the equality operator chooses to ignore? Does this imply you have A = B but A - B is not equal to zero?

5) How are the “<” and “>” operators related to equality? I did not find any consistency requirements in the specification of these operators (Section 7.4.9, below the specification of the Operator “=”). Thus maybe these operators use all bits in the mantissa. A consequence would be: If two number A and B differ only in the ignored bits you may have A < B and also A = B.

6) Is NOT (A < B) the same as A >= B? Maybe not. The first expression may compare all mantissa bits but the second may ignore some least significant bits. And of course this behavior may be implementation dependent.

So I think the comparison operators are specified to show some pretty strange behavior that you cannot fix by improving the implementation. And I did not even touch the peculiarities of date and time arithmetic… Thus additional questions surface:

How do you explain these things to the user, e.g. in the help information? What are the recommendations for getting reliable and reproducible results that you can exchange with other tools such as Excel?
and 
Why not simply use a floating point arithmetic that conforms to the IEEE 754 – Standard as probably most programming languages do?

------
[1] https://oasis-tcs.github.io/odf-tc/odf1.4/part4-formula/OpenDocument-v1.4-part4-formula.html#__RefHeading__1018024_715980110
Comment 15 Mike Kaganski 2025-10-09 15:10:51 UTC
(In reply to Albrecht Müller from comment #14)
> 1) Does the fix conform to this specification?

Yes.

> 2) Does the fix guarantee the transitivity requirement of equivalence
> relations?

No.

> 3) Is the reported behavior a bug?

Yes.

> 4) What happens when you have two floating point numbers A and B that differ
> only in the bits that the equality operator chooses to ignore? Does this
> imply you have A = B but A - B is not equal to zero?

No. But if you replace "-" with "RAWSUBTRACT", then yes.
Comment 16 Albrecht Müller 2025-10-09 15:52:20 UTC
(In reply to Mike Kaganski from comment #15)
> No. But if you replace "-" with "RAWSUBTRACT", then yes.

Thanks for pointing to RAWSUBTRACT. Are there other functions that offer IEEE 754 conforming functionality such as equality and comparison operators?
Comment 17 Albrecht Müller 2025-10-09 18:39:13 UTC
(In reply to Mike Kaganski from comment #15)

> > 2) Does the fix guarantee the transitivity requirement of equivalence
> > relations?
> 
> No.
> 

Expected behavior: “=” is transitive
Actual behavior: “=” is not transitive

A new bug?
Comment 18 Mike Kaganski 2025-10-09 18:49:32 UTC
(In reply to Albrecht Müller from comment #17)
> Expected behavior: “=” is transitive

The expectation is wrong.
Comment 19 Albrecht Müller 2025-10-09 19:18:20 UTC
(In reply to Mike Kaganski from comment #18)
> > Expected behavior: “=” is transitive
> 
> The expectation is wrong.


From https://en.wikipedia.org/wiki/Equals_sign:

“The equals sign (British English) or equal sign (American English), also known as the equality sign, is the mathematical symbol =, which is used to indicate equality.” 

From https://en.wikipedia.org/wiki/Equality_(mathematics):

“Basic properties about equality like reflexivity, symmetry, and transitivity have been understood intuitively since at least the ancient Greeks, but were not symbolically stated as general properties of relations until the late 19th century by Giuseppe Peano.”
Comment 20 Mike Kaganski 2025-10-09 19:25:00 UTC
Please:

1. Realize that Calc is not an algebra system;
2. Realize that "Equality" and "Equality (mathematics)" is different; and even that "mathematics" is broad, including very different notions of "equality", including non-transitive variants;
3. Stop spamming in this issue with unrelated posts. The situation you try to discuss here is not related to the issue here, and in not new. If needed, file a separate bug.