Bug 149950 - Long date (d mmm yyyy) in en-ZA locale not recognized.
Summary: Long date (d mmm yyyy) in en-ZA locale not recognized.
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: framework (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: All All
: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:7.5.0 target:7.4.0.2
Keywords:
Depends on:
Blocks:
 
Reported: 2022-07-11 13:39 UTC by roland
Modified: 2022-07-14 08:35 UTC (History)
1 user (show)

See Also:
Crash report or crash signature:
Regression By:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description roland 2022-07-11 13:39:15 UTC
The datevalue function does not honour the data formats set in the Date Acceptance Patterns setting.

This is best explained by show the following examples:

datevalue("02 Mar 2020) returns "Err:502"
datevalue("02 03 2020") returns "43892", which is correct.

From the above examples, it appears as if there is only a problem with recognising the month name abbreviation, but there's more to it.

My language setting is "English (South Africa)" and the default data recognition strings are "Y-M-D;Y/M/D;M-D;M/D".  This does not include "D M Y", but LibreOffice seems to recognise the format anyway.  This is nonsensical, since the patterns are explicitly showing what will be recognised and not recognising other formats not listed. I expect to have to add "M D Y" to that string to make it "Y-M-D;Y/M/D;M-D;M/D;D M Y".

As an aside: I do understand that the official format for South Africa is the default ""Y-M-D;Y/M/D;M-D;M/D", but that is not what is actually being used. "dd mmm yyyy" is probably the most used format here.  By comparison, the US is supposed to us metric measurement (officially), but they don't, and LibreOffice honours that.  There doesn't seem to be a way for a user to fix the recognition strings.

If I change the language to English (USA) or English (UK), it makes no difference.  The date recognition patterns are even less and still "D M Y" is recognised and "Mar" not. Actually no month name is recognised.  

This is not correct and it seriously affects usability as well as compatibility with other formats like Excel for example.  In Excel the above is interpreted correctly, but when a sheet is imported into LO abd the values recalculated, it gives the error 502.
Comment 1 Mike Kaganski 2022-07-11 13:55:36 UTC
I believe that date acceptance pattern is not meant to affect function results. (I may be wrong; I hope that Eike corrects me.) If I am correct, then the only purpose of the date acceptance pattern is just to allow date recognition at the user input - e.g., in a cell, so that strings that are *easy* for user to type would turn into proper dates (and their look would change after recognition into standard or explicitly specified cell format). But result of Calc's =DATEVALUE("..."), as well as Basic DateValue function, should not depend on such things.
Comment 2 roland 2022-07-11 14:51:21 UTC
From what you are indicating, it then seems that datevalue() should know all data formats.  Is that realistic?  Maybe it's time then to integrate the datevalue() code with the code that does the validation of input? Or to let them both use the same logic at least.  

After digging through the many posts about problems with datevalue() that search engines return and apparet lack of understanding for dates in general in LO in many of these, this would probably be a sensible move?

I seems logical to me that having a user editable list of formats that datavalue() would recognise, should be based on a couple of universal formats (as it is now) and furthermore on whatever a user specifies in the Date Acceptance Patterns settings.  

For now, at least an update to the documentation about this would help to clarify the current state of affairs.

(I wish I were a proficient dev, then I would make these changes myself and submit them :-) )
Comment 3 Eike Rathke 2022-07-12 20:42:12 UTC
(In reply to Mike Kaganski from comment #1)
> I believe that date acceptance pattern is not meant to affect function
> results.
Date acceptance patterns do affect DATEVALUE(), as the argument is just passed to the number formatter/scanner's IsNumberFormat() with all its quirks and bells and whistles, that parses it according to the current locale and settings.


(In reply to roland from comment #0)
> datevalue("02 Mar 2020) returns "Err:502"
> datevalue("02 03 2020") returns "43892", which is correct.
I can not reproduce. I tried in an en-ZA locale with its default date acceptance patterns
Y-M-D;Y/M/D;M-D;M/D
and both formula expressions return Err:502 and the same strings as cell inputs are also not accepted as dates but stay text.
What LibreOffice version are you using anyway?


(In reply to roland from comment #2)
> Maybe it's time then to integrate the
> datevalue() code with the code that does the validation of input? Or to let
> them both use the same logic at least.  
It does exactly that.
Comment 4 Eike Rathke 2022-07-12 22:22:52 UTC
However, I think that "02 Mar 2020" should be accepted because it matches the DMY order of the locale's default long date format
NNNND MMMM YYYY
Comment 5 roland 2022-07-13 06:47:10 UTC
(In reply to Eike Rathke from comment #3)
> (In reply to roland from comment #0)
> > datevalue("02 Mar 2020) returns "Err:502"
> > datevalue("02 03 2020") returns "43892", which is correct.
> I can not reproduce. I tried in an en-ZA locale with its default date
> acceptance patterns
> Y-M-D;Y/M/D;M-D;M/D
> and both formula expressions return Err:502 and the same strings as cell
> inputs are also not accepted as dates but stay text.

But I add "D M Y" to the acceptance patterns so that is does recognise
it.  Without that datevalue("02 03 2020") return Err:502, but with it,
it returns the date value 43892.

The problem is that 2 Mar 2020 is a proper date and it should be
recognised as such.

If I switch to en-US, then "Mar 2, 2020" is recognised.  Therefor
under en-ZA "2 Mar 2020" should be recognised too, but it isn't,
regardless of the acceptance pattern added.

> What LibreOffice version are you using anyway?
Version: 7.3.4.2 / LibreOffice Community
Build ID: 30(Build:2)
CPU threads: 12; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-US (en_ZA.UTF-8); UI: en-US
Ubuntu package version: 1:7.3.4~rc2-0ubuntu0.20.04.1~lo1
Comment 6 Eike Rathke 2022-07-13 09:51:54 UTC
In the description you wrote
> My language setting is "English (South Africa)" and the default data recognition strings are "Y-M-D;Y/M/D;M-D;M/D".  This does not include "D M Y", but LibreOffice seems to recognise the format anyway.

which I could not confirm. With 'D M Y' added of course "02 03 2020" is recognized.

Anyway, the date acceptance patterns determine which numeric date inputs are to be accepted and adding 'D M Y' should not be necessary. Long date inputs with month names have additional logic respecting the locale's long date separators and spaces. That so far works but then feeding the detected values to the calendar goes wrong because the long date DMY order does not match the numeric date YMD order and determining the difference apparently has a bug, so the calendar is fed with year:=2, month:=3, day:=2020; which of course is invalid.

This is no problem in the en-US locale because both, numeric and long date, have a MDY order, or other locales that for long date and numeric date have the same DMY or YMD order.
Comment 7 Commit Notification 2022-07-13 15:48:07 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/d818c341206895a6dda1c19fc8b32f04b5b7c520

Resolves: tdf#149950 Handle LongDateOrder vs DateOrder for middle month name

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 8 Eike Rathke 2022-07-13 15:49:49 UTC
Pending review https://gerrit.libreoffice.org/c/core/+/136990 for 7-4
Comment 9 Commit Notification 2022-07-13 18:40:53 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/126da357f4e9683073cca0b0794398ad7914e483

tdf#149950: svl_qa_cppunit: Add unittest

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 10 Commit Notification 2022-07-14 08:35:12 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/90c4d2fe555a52b7bcbda063684ae712207e9b4d

Resolves: tdf#149950 Handle LongDateOrder vs DateOrder for middle month name

It will be available in 7.4.0.2.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.