Bug 136615 - Re-consider date/time parts calculation for functions and formatting
Summary: Re-consider date/time parts calculation for functions and formatting
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
(earliest affected)
Hardware: All All
: medium normal
Assignee: Not Assigned
Depends on:
Reported: 2020-09-09 17:06 UTC by Mike Kaganski
Modified: 2020-09-16 12:10 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:

A sample of wrong round-tripping of date/time value (10.02 KB, application/vnd.oasis.opendocument.spreadsheet)
2020-09-09 17:06 UTC, Mike Kaganski

Note You need to log in before you can comment on or make changes to this bug.
Description Mike Kaganski 2020-09-09 17:06:08 UTC
Created attachment 165328 [details]
A sample of wrong round-tripping of date/time value

As discussed in Bug 127334 comment 8, bug 127170 (attachment 165304 [details]; bug 127170 comment 26) etc., there are some serious problems around date and time parts as calculated using spreadsheet functions YEAR/MONTH/DAY/HOUR/MINUTE/SECOND, and displayed using date/time format strings like "YYYY-MM-DD HH:MM:SS" (and any variations, including decimal parts of second); both in old versions of Calc, and in current versions, that had numerous tweaks related to how dates/times and durations are displayed.

Current implementation of YEAR/MONTH/DAY/HOUR/MINUTE/SECOND spreadsheet functions is consistent with ODF [1], but the standard itself is obviously wrong in its wording regarding these functions. E.g., MINUTE is defined there to be calculated discarding minute fraction from (hour fraction multiplied by 60), while SECOND is defined as rounding (minute fraction multiplied by 60) to nearest whole number, but at the same time, being in [0..59]. This necessarily produces wrong set of YEAR/MONTH/DAY/HOUR/MINUTE/SECOND values for any datetime with seconds between 59.5 and 60.0: result of MINUTE would be as if the next minute is not yet started, while result of SECOND would be as if it has already started. See attachment for an example of the problem.

At the same time, there is much confusion and disagreement around how date/times are displayed, where users don't realize that underlying data format for datetime is double, or where users disagree to account for the tiny floating-point calculation errors. Given the current distinction between display of points in time vs duration formats, the inherent imprecision of floating point operations may result in showing e.g. 9 minutes where users expect 10 (just because the closest double happened to be slightly less than exact value). It also looks natural to expect that a date string (using a date/time format string) represents the same date parts as returned from corresponding spreadsheet functions on the same value.

To create a consistent and robust set of rules for calculation of date/time parts from a serial date (or a duration), the proposal is the following:

1. In OpenFormula specification, change definition of SECOND to mention explicitly that the result is taken modulo 60, like this: Second=MOD(ROUND(T*86400);60).

2. In OpenFormula specification, change definition of YEAR/MONTH/DAY/HOUR/MINUTE (and others that might be affected) to account for the seconds rounding. Example for MINUTE: Minute=(MOD(ROUND(T*86400);3600)-SECOND(T))/60

3. Make date/time format strings follow the same rules, *unless* the format string includes decimal seconds part.

4. In case a date/time format string includes decimal seconds part, it should follow similar rules, with the exception that rounding should be performed not to seconds, but to 1/1000th of a second. This should give enough precision to have robustness against displaying rounding errors after tens/hundreds of arithmetical operations on times/durations in the representable date/time range. For the excess decimal second digits, zeroes must be output. The idea is that normally rounding to a second is enough for widest range of cases; only when explicitly requested, the sub-second precision should be used, still with precautions to provide robust results, not easily affected by inherent imprecision of floating-point operations and representation.

5. Duration formats should follow same or similar set of rules; whether they should round least significant displayed part other than second (as opposed to truncation) is not covered by the proposal, only that the values should be calculated based on original double converted to whole numbers of seconds or milliseconds (using rounding), as required by the format string and rules above.

This proposal is aimed to be compatible with rules used for the same operations in MS Excel, be predictable and sane, address most expectations, and allow creating clear documentation about implementation.

[1] https://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part2.html#SECOND
Comment 1 Albrecht Müller 2020-09-09 19:42:43 UTC
I think it is pretty clear that there is a need to correct the specification and that the defects in the specification block quite a few bugs.

Just for information: After pressing Shift+Control+F9 the round-tripping example shows correct values with LibreOffice Version Did not try with a current version yet.
Comment 2 Regina Henschel 2020-09-14 19:33:19 UTC
@Mike: Please write a proposal to the ODF TC:
Comment 3 Mike Kaganski 2020-09-16 12:10:23 UTC
(In reply to Regina Henschel from comment #2)