Bug 128503 - Document the "numeric first" behaviour of Criteria in COUNTIF() and the like (see comment 14 ff.)
Summary: Document the "numeric first" behaviour of Criteria in COUNTIF() and the like ...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Documentation (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:6.4.0
Keywords:
Depends on:
Blocks:
 
Reported: 2019-10-31 13:03 UTC by David Lynch
Modified: 2019-11-16 09:48 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments
Incorrect results from regular expresions from COUNTIF (14.24 KB, application/vnd.oasis.opendocument.spreadsheet)
2019-10-31 13:04 UTC, David Lynch
Details
Screen shot showing bug (20.01 KB, image/gif)
2019-11-01 08:39 UTC, David Lynch
Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Lynch 2019-10-31 13:03:21 UTC
Description:
With regular expression ealuation enabled, I get incorrect results from COUNTIF but not REGEX with regular expression involving a digit and a .
Note the errors are different if the digit is zero or non-zero. Attachment has details.

Steps to Reproduce:
Attachment has details.

Actual Results:
Attachment has details.

Expected Results:
Attachment has details.


Reproducible: Always


User Profile Reset: Yes



Additional Info:
REGEX gets it right
Comment 1 David Lynch 2019-10-31 13:04:39 UTC
Created attachment 155426 [details]
Incorrect results from regular expresions from COUNTIF
Comment 2 m_a_riosv 2019-10-31 23:24:14 UTC
Please detail at least some of supposed wrong values, COUNTIF does the same as REGEX for me. 

50	1	=COUNTIF(A1;".0")	wrong	50	=REGEX(A1;".0")right
0	0	=COUNTIF(A2;".0")	wrong	#N/A	=REGEX(A2;".0")right
a0	1	=COUNTIF(A3;".0")	wrong	a0	=REGEX(A3;".0")right
57	1	=COUNTIF(A4;".7")	wrong	57	=REGEX(A4;".7")right
57	1	=COUNTIF(A5;".7")	wrong	57	=REGEX(A5;".7")right
7	0	=COUNTIF(A6;".7")	right	#N/A	=REGEX(A6;".7")right
a7	1	=COUNTIF(A7;".7")	wrong	a7	=REGEX(A7;".7")right
7a	1	=COUNTIF(A8;".a")	right	7a	=REGEX(A8;".a")right
ab	1	=COUNTIF(A9;".b")	right	ab	=REGEX(A9;".b")right
50	1	=COUNTIF(A10;"5.")	wrong	50	=REGEX(A10;"5.")right
50	1	=COUNTIF(A11;"5.")	wrong	50	=REGEX(A11;"5.")right
a0	1	=COUNTIF(A12;"a.")	right	a0	=REGEX(A12;"a.")right

Version: 6.3.3.1 (x64)
Build ID: f41f4c7f9507aeca13cb9df51f34d80e8ba30a99
CPU threads: 4; OS: Windows 10.0; UI render: GL; VCL: win; 
Locale: es-ES (es_ES); UI-Language: en-US Calc: 
Version: 6.4.0.0.alpha0+ (x64)
Build ID: 758516295e5f69393bd78bb4af6e7214d48ece0b
CPU threads: 4; OS: Windows 10.0 Build 18362; UI render: GL; VCL: win; 
Locale: es-ES (es_ES); UI-Language: en-US Calc: CL
Comment 3 David Lynch 2019-11-01 08:38:40 UTC
Your results are indeed correct. I attach a screenshot of my results (in safe mode). I get the same results on another (Windows) computer.

Version: 6.3.2.2 (x64)
Build ID: 98b30e735bda24bc04ab42594c85f7fd8be07b9c
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: en-GB (en_GB); UI-Language: en-US
Calc: threaded

I can't think of any more detail to give you. Let me know if you want more.
Comment 4 David Lynch 2019-11-01 08:39:19 UTC
Created attachment 155441 [details]
Screen shot showing bug
Comment 5 Oliver Brinzing 2019-11-01 14:00:10 UTC
with 

Version: 6.3.3.2 (x64)
Build ID: a64200df03143b798afd1ec74a12ab50359878ed
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: de-DE (de_DE); UI-Language: en-US
Calc: 

i have following results:

50	1	=COUNTIF(A1;".0")	wrong	50	=REGEX(A1;".0")	right
0	0	=COUNTIF(A2;".0")	wrong	#N/A	=REGEX(A2;".0")	right
a0	1	=COUNTIF(A3;".0")	wrong	a0	=REGEX(A3;".0")	right
57	1	=COUNTIF(A4;".7")	wrong	57	=REGEX(A4;".7")	right
57	1	=COUNTIF(A5;".7")	wrong	57	=REGEX(A5;".7")	right
7	0	=COUNTIF(A6;".7")	right	#N/A	=REGEX(A6;".7")	right
a7	1	=COUNTIF(A7;".7")	wrong	a7	=REGEX(A7;".7")	right
7a	1	=COUNTIF(A8;".a")	right	7a	=REGEX(A8;".a")	right
ab	1	=COUNTIF(A9;".b")	right	ab	=REGEX(A9;".b")	right
50	1	=COUNTIF(A10;"5.")	wrong	50	=REGEX(A10;"5.")right
50	1	=COUNTIF(A11;"5.")	wrong	50	=REGEX(A11;"5.")right
a0	1	=COUNTIF(A12;"a.")	right	a0	=REGEX(A12;"a.")right
Comment 6 m_a_riosv 2019-11-01 21:15:49 UTC
Please test playing with 'Calc: threaded'  Menu/Tools/Options/LIbreOffice calc/Calculate.
Comment 7 Oliver Brinzing 2019-11-01 21:24:49 UTC
(In reply to m.a.riosv from comment #6)
> Please test playing with 'Calc: threaded'  Menu/Tools/Options/LIbreOffice
> calc/Calculate.

this seems to make no difference, i have always correct results
Comment 8 David Lynch 2019-11-01 21:31:44 UTC
This makes no diffrence I always have wrong results:				
						
no thread						
						
50	0	=COUNTIF(A1,".0")	wrong	50	=REGEX(A1,".0")	right
0	1	=COUNTIF(A2,".0")	wrong	#N/A	=REGEX(A2,".0")	right
a0	0	=COUNTIF(A3,".0")	wrong	a0	=REGEX(A3,".0")	right
57	0	=COUNTIF(A4,".7")	wrong	57	=REGEX(A4,".7")	right
57	0	=COUNTIF(A5,".7")	wrong	57	=REGEX(A5,".7")	right
7	0	=COUNTIF(A6,".7")	right	#N/A	=REGEX(A6,".7")	right
a7	0	=COUNTIF(A7,".7")	wrong	a7	=REGEX(A7,".7")	right
7a	1	=COUNTIF(A8,".a")	right	7a	=REGEX(A8,".a")	right
ab	1	=COUNTIF(A9,".b")	right	ab	=REGEX(A9,".b")	right
50	0	=COUNTIF(A10,"5.")	wrong	50	=REGEX(A10,"5.")	right
50	0	=COUNTIF(A11,"5.")	wrong	50	=REGEX(A11,"5.")	right
a0	1	=COUNTIF(A12,"a.")	right	a0	=REGEX(A12,"a.")	right
						
threaded						
50	0	=COUNTIF(A1,".0")	wrong	50	=REGEX(A1,".0")	right
0	1	=COUNTIF(A2,".0")	wrong	#N/A	=REGEX(A2,".0")	right
a0	0	=COUNTIF(A3,".0")	wrong	a0	=REGEX(A3,".0")	right
57	0	=COUNTIF(A4,".7")	wrong	57	=REGEX(A4,".7")	right
57	0	=COUNTIF(A5,".7")	wrong	57	=REGEX(A5,".7")	right
7	0	=COUNTIF(A6,".7")	right	#N/A	=REGEX(A6,".7")	right
a7	0	=COUNTIF(A7,".7")	wrong	a7	=REGEX(A7,".7")	right
7a	1	=COUNTIF(A8,".a")	right	7a	=REGEX(A8,".a")	right
ab	1	=COUNTIF(A9,".b")	right	ab	=REGEX(A9,".b")	right
50	0	=COUNTIF(A10,"5.")	wrong	50	=REGEX(A10,"5.")	right
50	0	=COUNTIF(A11,"5.")	wrong	50	=REGEX(A11,"5.")	right
a0	1	=COUNTIF(A12,"a.")	right	a0	=REGEX(A12,"a.")	right
Comment 9 m_a_riosv 2019-11-01 23:35:13 UTC
Well I'm not able  to figure out what can be, no changes in calculation or formula options for calc makes any difference.
Comment 10 Oliver Brinzing 2019-11-02 07:56:24 UTC
(In reply to m.a.riosv from comment #9)
> Well I'm not able  to figure out what can be, no changes in calculation or
> formula options for calc makes any difference.

It's the "Locale setting" (Menu "Tools/Options.../Language Settings/Languages")

with:

Version: 6.4.0.0.alpha1+ (x64)
Build ID: 885ae558d34dd76955c727b90eb9ae52ce85df7f
CPU threads: 4; OS: Windows 10.0 Build 18362; UI render: default; VCL: win; 
Locale: en-US (de_DE); UI-Language: en-US
Calc: 

i have wrong results too...
Comment 11 Oliver Brinzing 2019-11-02 08:04:32 UTC
already reproducible with:
Version 3.6.7.2 (Build ID: e183d5b) and AOO 4.1.5
Comment 12 David Lynch 2019-11-02 12:24:47 UTC
Confirmed. If I set my locale to Spanish(Spain), the answers are correct.

Version: 6.3.2.2 (x64)
Build ID: 98b30e735bda24bc04ab42594c85f7fd8be07b9c
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: es-ES (en_GB); UI-Language: en-US
Calc: threaded
Comment 13 m_a_riosv 2019-11-02 19:29:34 UTC
Looks it works fine only with languages having comma as decimal separator.

Maybe is there something else affecting the issue, I can't believe it wasn't reported.

There is also a bug https://bugs.documentfoundation.org/show_bug.cgi?id=93510, about 'Regular expressions do not works when enabled automatically find columns and rows labels', but changed that option doesn't affect this one.

Hi @Eike can you take a look, seems critical for the affected people, SUMIF and SUMIFS are also affected.
Comment 14 Eike Rathke 2019-11-04 13:45:35 UTC
Hard stuff. Though IMHO there is nothing wrong. In general, COUNTIF() and others for a string criterion follow the setting Tools -> Options -> Calc -> Calculate, General Calculations, either wildcards, regular expressions, or none.

However, the first attempt is to convert the string criterion to number, ".0" => 0.0 and so on, and if successful the match will not be a regular expression match but a numeric match. Which when switching to a locale where the decimal separator is not dot makes it suddenly work. To force regex instead of numeric use some expression that can not be "misread" as numeric, for example ".[0]" or ".\0" or "(?i).0" (for case insensitive match of the entire expression, which the default is anyway).

There's not much we can do about this "numeric first" behaviour, it is expected. And yes, in the different locales context it is awkward. Blame Excel.
Comment 15 David Lynch 2019-11-04 18:27:03 UTC
If it needs to be like this, could the workaround be put in the documentation please?
Comment 16 Eike Rathke 2019-11-05 13:15:35 UTC
Probably a good idea..

Btw, in these test cases, as they want to match from the beginning, the best (i.e. simplest and fastest) forcing to regex would be using the ^ anchor, so "^.0"
Comment 17 Olivier Hallot 2019-11-06 10:52:42 UTC
Help patch submitted

https://gerrit.libreoffice.org/#/c/82094/
Comment 18 Commit Notification 2019-11-08 18:41:48 UTC
Olivier Hallot committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/help/commit/6f107ecda99e339d61517de33c0d452c18dfac3e

tdf#128503 Clarify numeric x regex conversion