Bug 132106 - CELL() function "format" should include more date formats to return "D1" (or a new "D0")
Summary: CELL() function "format" should include more date formats to return "D1" (or ...
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.0.0
Keywords:
Depends on:
Blocks: Calc-Function
  Show dependency treegraph
 
Reported: 2020-04-14 19:39 UTC by Eike Rathke
Modified: 2020-08-17 10:20 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
Sample with CELL("format") (9.50 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2020-04-15 05:15 UTC, Mike Kaganski
Details
A sample with custom format in A1 (9.05 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2020-04-18 17:12 UTC, Mike Kaganski
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Eike Rathke 2020-04-14 19:39:18 UTC
The CELL() function for argument "format" lacks at least the ISO 8601 YYYY-MM-DD date format to return "D1" (and for other formats including day, month and year format codes).

Probably YYYY-MM-DD HH:MM:SS and other full date+time formats should return "D4", though there is no "and similar formats" mentioned like for D1 in the ODFF spec.

See https://help.libreoffice.org/6.4/en-GB/text/scalc/01/04060104.html?DbPAR=CALC#bm_id3155509 and https://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part2.html#CELL
Comment 1 Eike Rathke 2020-04-14 19:55:42 UTC
@Mike:
What does Excel return for YYYY-MM-DD and YYYY-MM-DD HH:MM:SS for example?
Comment 2 Mike Kaganski 2020-04-15 05:15:06 UTC
Created attachment 159569 [details]
Sample with CELL("format")

This gives "G" in both cases in Excel 2016.
Comment 3 Eike Rathke 2020-04-15 12:01:49 UTC
Question remains, do we continue to follow that nonsense? I suggest at least for the clear cases of ISO 8601 date and date+time we do not.

The current implementation may vary anyway between locales because it does not use the format code but the semantic index of the format for predefined formats, which most times is correct but in the cases of MM-DD vs DD-MM may not. That could be improved as well.
Comment 4 Mike Kaganski 2020-04-15 12:38:25 UTC
I thought that D1 is for three-element dates with MDY order, D2 for DM, D3 for MY, etc. My perception was based on D2 vs D5, but I see that I must be wrong, because three-part dates then would only have one order (D1).

I don't know which D category is the correct one for ISO dates, but definitely G is not right here.
Comment 5 Eike Rathke 2020-04-16 16:04:54 UTC
What an utter mess..

So, ODFF in https://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part2.html#CELL tried to follow https://support.office.com/en-US/article/CELL-function-51BD39A5-F338-4DBE-A33F-955D67C2B2CF (or an older version thereof) but got one example for D1 wrong, namely MM-D-YY should not be there and the "and similar formats" is misleading in that context.

Inspecting the MS-Excel examples (d-mmm-yy or dd-mmm-yy) I deduce that only date formats containing all three DMY and using an abbreviated month name should return D1. The long month names MMMM are nowhere mentioned. Maybe it's also even only for those en-US D-MMM-YY formats, which are contrary to the "normal" en-US MDY order, as the examples (m/d/yy or m/d/yy h:mm or mm/dd/yy) for D4 are all en-US centric. I tried to compare with the German page https://support.office.com/de-de/article/zelle-funktion-51bd39a5-f338-4dbe-a33f-955d67c2b2cf but that stupidly lists the same formats just with the localized format code keywords.

Seeing D4 (m/d/yy or m/d/yy h:mm or mm/dd/yy) I'd say that *all* formats using all three DMY numeric in any order with or without time should be D4, so also YYYY-MM-DD or YYYY-MM-DD HH:MM and the like.
Comment 6 Eike Rathke 2020-04-16 16:20:02 UTC
@Mike:
Can you please try what Excel returns for your usual Russian date and date+time formats, also with month names (DD-MMM-YY, DD-MMM-YYYY), and maybe long date formats (DD MMMM YYYY or similar)?
Comment 7 Mike Kaganski 2020-04-16 21:29:25 UTC
Д is for D, М for M, Г for Y, ч for h, м for m; decimal separator is comma

ДД.ММ.ГГГГ ч:мм => D1
ДД.ММ.ГГГГ => D1
ДД.МММ.ГГ => D1
Д.М.ГГ;@ => D1
ДД.ММ.ГГ;@ => D1
[$-ru-RU]ММММ ГГГГ;@ => D1
ДД.ММ.ГГ ч:мм;@ => D1
Д.М.ГГГГ;@ => D1
ДД/ММ/ГГ => D1
ДД/МММ/ГГ => D1
ДД/МММ/ГГГГ => D1
ДД-МММ-ГГ => D1
ДД-МММ-ГГГГ => D1
ДД ММММ ГГГГ => D1

ДД.МММ => D2
Д.М;@ => D2
[$-ru-RU]ММММ;@ => D2

МММ.ГГ => D3

МММ-ДД/ГГГГ => D4
М-Д/ГГ => D4

ч:мм:сс AM/PM => D6

ч:мм AM/PM => D7

ч:мм:сс => D8
ч:мм:сс;@ => D8

ч:мм => D9
ч:мм;@ => D9

ГГГГ-ММ-ДД;@ => G
ГГ-М-Д => G
[$-x-sysdate]ДДДД, ММММ ДД, ГГГГ => G
[$-ru-RU]Д МММ;@ => G
[$-ru-RU]Д МММ ГГ;@ => G
[$-ru-RU]ДД МММ ГГ;@ => G
[$-ru-RU-x-nomlower]ГГГГ, ММММ;@ => G
[$-ru-RU-x-genlower]ГГГГ, ДД ММММ;@ => G
[$-ru-RU-x-genlower]Д ММММ ГГГГ "г." => G
[$-ru-RU-x-genlower]ДД ММММ ГГГГ \г\.;@ => G
[$-ru-RU]Д-МММ-ГГГГ;@ => G
[$-en-US]ДД.ММ.ГГ ч:мм AM/PM;@ => G
[$-x-systime]ч:мм:сс AM/PM => G
[$-en-US]ч:мм AM/PM;@ => G
[$-en-US]ч:мм:сс AM/PM;@ => G
мм:сс => G
мм:сс,0 => G
мм:сс,0;@ => G
[ч]:мм:сс => G
[ч]:мм:сс;@ => G
Comment 8 Mike Kaganski 2020-04-16 21:31:47 UTC
М-Д => D5
Comment 9 Eike Rathke 2020-04-17 17:38:21 UTC
Thanks a lot!
So the Excel documentation is incomplete and it's more like ODFF says for D1 (with example and "or similar") and what we implemented so far, but still the differentiation between D1 and D4 is somewhat a mystery to me. Maybe D4 is *only* for formats where M precedes D in a MDY date(+time) format, and D1 is *only* for formats where D precedes M in DMY. That would make sense with your list. And the D4 example would be wrong in the ODFF. And leaves no choice for YMD.

Maybe we should just introduce D0 for the YMD order (D10 isn't a good choice as it is 3 characters instead of the other 2), and rework using the simplified predefined index returns. Actually the D1 vs D4 (and then D0) and other Dx returns could be used to determine how to parse a date display string if those actually follow the DMY vs MDY vs YMD order.

I'll try to take a stab at this.

Btw, it's news to me that Excel allows these [$-x-...] and [$-ru-RU] and [$-ru-RU-x-...] language tag modifiers. Are they saved the same to document storage?
Comment 10 Commit Notification 2020-04-18 01:09:40 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/dbda099d198a2236bafea2209cb5e3fc58ee8741

Move implementation of CELL("format";...) to SvNumberFormatter, tdf#132106 prep

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 11 Mike Kaganski 2020-04-18 17:12:01 UTC
Created attachment 159686 [details]
A sample with custom format in A1

(In reply to Eike Rathke from comment #9)
> Btw, it's news to me that Excel allows these [$-x-...] and [$-ru-RU] and
> [$-ru-RU-x-...] language tag modifiers. Are they saved the same to document
> storage?

A1 has `[$-ru-RU-x-genlower]ДД ММММ ГГГГ \г\.;@` format.
Comment 12 Xisco Faulí 2020-07-22 14:57:43 UTC
Dear Eike Rathke,
This bug has been in ASSIGNED status for more than 3 months without any
activity. Resetting it to NEW.
Please assign it back to yourself if you're still working on this.
Comment 13 Roman Kuznetsov 2020-08-13 19:42:05 UTC
Eike, possibly it already fixed?
Comment 14 Eike Rathke 2020-08-17 10:20:03 UTC
No, it's not. Only some ground laying work done.