Bug 136615 - Re-consider date/time parts calculation for functions and formatting
Summary: Re-consider date/time parts calculation for functions and formatting
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium normal
Assignee: Not Assigned
URL: https://issues.oasis-open.org/browse/...
Whiteboard: target:7.5.0
Keywords:
: 98443 159421 (view as bug list)
Depends on:
Blocks: ODF-spec
  Show dependency treegraph
 
Reported: 2020-09-09 17:06 UTC by Mike Kaganski
Modified: 2024-01-29 04:51 UTC (History)
10 users (show)

See Also:
Crash report or crash signature:


Attachments
A sample of wrong round-tripping of date/time value (10.02 KB, application/vnd.oasis.opendocument.spreadsheet)
2020-09-09 17:06 UTC, Mike Kaganski
Details
Errors in date/time calculations (29.08 KB, application/vnd.oasis.opendocument.spreadsheet)
2022-07-28 07:25 UTC, Nenad Antic
Details
Showing date calc results including between different dates (90.81 KB, image/png)
2022-10-11 14:35 UTC, Nenad Antic
Details
Example with decimal seconds (25.84 KB, application/vnd.oasis.opendocument.spreadsheet)
2023-08-02 12:01 UTC, Regina Henschel
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mike Kaganski 2020-09-09 17:06:08 UTC
Created attachment 165328 [details]
A sample of wrong round-tripping of date/time value

As discussed in Bug 127334 comment 8, bug 127170 (attachment 165304 [details]; bug 127170 comment 26) etc., there are some serious problems around date and time parts as calculated using spreadsheet functions YEAR/MONTH/DAY/HOUR/MINUTE/SECOND, and displayed using date/time format strings like "YYYY-MM-DD HH:MM:SS" (and any variations, including decimal parts of second); both in old versions of Calc, and in current versions, that had numerous tweaks related to how dates/times and durations are displayed.

Current implementation of YEAR/MONTH/DAY/HOUR/MINUTE/SECOND spreadsheet functions is consistent with ODF [1], but the standard itself is obviously wrong in its wording regarding these functions. E.g., MINUTE is defined there to be calculated discarding minute fraction from (hour fraction multiplied by 60), while SECOND is defined as rounding (minute fraction multiplied by 60) to nearest whole number, but at the same time, being in [0..59]. This necessarily produces wrong set of YEAR/MONTH/DAY/HOUR/MINUTE/SECOND values for any datetime with seconds between 59.5 and 60.0: result of MINUTE would be as if the next minute is not yet started, while result of SECOND would be as if it has already started. See attachment for an example of the problem.

At the same time, there is much confusion and disagreement around how date/times are displayed, where users don't realize that underlying data format for datetime is double, or where users disagree to account for the tiny floating-point calculation errors. Given the current distinction between display of points in time vs duration formats, the inherent imprecision of floating point operations may result in showing e.g. 9 minutes where users expect 10 (just because the closest double happened to be slightly less than exact value). It also looks natural to expect that a date string (using a date/time format string) represents the same date parts as returned from corresponding spreadsheet functions on the same value.

To create a consistent and robust set of rules for calculation of date/time parts from a serial date (or a duration), the proposal is the following:

1. In OpenFormula specification, change definition of SECOND to mention explicitly that the result is taken modulo 60, like this: Second=MOD(ROUND(T*86400);60).

2. In OpenFormula specification, change definition of YEAR/MONTH/DAY/HOUR/MINUTE (and others that might be affected) to account for the seconds rounding. Example for MINUTE: Minute=(MOD(ROUND(T*86400);3600)-SECOND(T))/60

3. Make date/time format strings follow the same rules, *unless* the format string includes decimal seconds part.

4. In case a date/time format string includes decimal seconds part, it should follow similar rules, with the exception that rounding should be performed not to seconds, but to 1/1000th of a second. This should give enough precision to have robustness against displaying rounding errors after tens/hundreds of arithmetical operations on times/durations in the representable date/time range. For the excess decimal second digits, zeroes must be output. The idea is that normally rounding to a second is enough for widest range of cases; only when explicitly requested, the sub-second precision should be used, still with precautions to provide robust results, not easily affected by inherent imprecision of floating-point operations and representation.

5. Duration formats should follow same or similar set of rules; whether they should round least significant displayed part other than second (as opposed to truncation) is not covered by the proposal, only that the values should be calculated based on original double converted to whole numbers of seconds or milliseconds (using rounding), as required by the format string and rules above.

This proposal is aimed to be compatible with rules used for the same operations in MS Excel, be predictable and sane, address most expectations, and allow creating clear documentation about implementation.

[1] https://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part2.html#SECOND
Comment 1 Albrecht Müller 2020-09-09 19:42:43 UTC
I think it is pretty clear that there is a need to correct the specification and that the defects in the specification block quite a few bugs.

Just for information: After pressing Shift+Control+F9 the round-tripping example shows correct values with LibreOffice Version 6.0.4.2. Did not try with a current version yet.
Comment 2 Regina Henschel 2020-09-14 19:33:19 UTC
@Mike: Please write a proposal to the ODF TC:
https://www.oasis-open.org/committees/comments/index.php?wg_abbrev=office
Comment 3 Mike Kaganski 2020-09-16 12:10:23 UTC
(In reply to Regina Henschel from comment #2)

https://lists.oasis-open.org/archives/office-comment/202009/msg00001.html
Comment 4 Regina Henschel 2021-12-09 17:10:16 UTC
The ODF TC has improved the specification for ODF 1.4, see OFFICE-4094.
Comment 5 Regina Henschel 2021-12-09 17:11:50 UTC
*** Bug 98443 has been marked as a duplicate of this bug. ***
Comment 6 Albrecht Müller 2021-12-10 21:46:39 UTC
Some remarks to OFFICE-4094

a) Backwards compatibility. OFFICE-4094 introduces substantial changes to a couple of date and time functions. The previous specification has been around for about 10 years. According to the German Wikipedia LibreOffice has about 200 million users. Therefore I think there are many spreadsheet documents that were built according to the previous specification. They may deliver quite different results if the changes in the specification will be implemented. Therefore it should be specified how these functions should work on legacy spreadsheets.

b) Consistency of various LibreOffice date and time systems. Did someone check if the specifications for date and time calculations used in LibreOffice are consistent? I think the problem addressed in OFFICE-4094 is a special case of a more general problem: LibreOffice uses different representations of date and time. Some of them are able to represent common values exactly, such as 1/24 of a day corresponding to 1 hour. A date and time value formatted as a string is able to represent such a value exactly. On the other hand, the floating point number representation of date and time cannot represent 1/24 exactly and therefore has to use a value that best approximates 1/24. Problems arise if it is necessary to convert between different representations. This may affect:

Formatting of date and time values. I think the current version of LibreOffice uses different rounding strategies for e.g. [HH]:MM:SS and HH:MM:SS which may or may not correspond to the rounding strategy used for SECOND, MINUTE etc. functions. This is due to a distinction made between durations and wall clock time.

Calc functions: DAYS, DAYS360, DAYSINMONTH, DAYSINYEAR, EDATE, EOMONTH, ISLEAPYEAR, ISOWEEKNUM, MONTH, NETWORKDAYS,  NETWORKDAYS_EXCEL2003,NETWORKDAYS.INTL, WEEKDAY, WEEKNUM, WEEKNUM_EXCEL2003, WEEKS, WEEKSINYEAR,WORKDAYS, WORKDAY.INTL, YEARFRAC, YEARS

StarBasic-Functions related to date and time: Format, CDate, DateSerial, DateValue, Day, Month, WeekDay, Year, Hour, Minute, Second, TimeSerial, TimeValue, CDateToIso, CDateFromIso, CDateToUnoDate, CDateFromUnoDate, CDateFromUnoTime, CDateToUnoTime, CDateToUnoDateTime, CDateFromUnoDateTime, DateAdd, DateDiff, DatePart, Date, Now, Time, Timer

UNO-Interface: Interface com.sun.star.i18n.XCalendar (especially the functions setDateTime, getDateTime, setValue, getValue ) and functions that deal with struct com.sun.star.util.DateTime, struct com.sun.star.util.Date, struct com.sun.star.util.Time, struct com.sun.star.util.DateTimeRange , struct com.sun.star.util.DateTimeWithTimezone , struct com.sun.star.util.TimeWithTimezone and struct com.sun.star.util.Duration

Maybe there are other dependencies which I did not notice yet.

c) Compliance with other standards and interoperability:
The Standard ISO 8601 (see https://en.wikipedia.org/wiki/ISO_8601) allows fractional parts on any lowest order time element, not just on seconds. This standard also has a duration format that you cannot represent by floating point numbers alone, e.g. a duration of one month. Where does LibreOffice follow or deviate from the standard? Does the specification of LibreOffice date and time arithmetic make sure that they are compatible with other Programs, e.g. Excel? How should LibreOffice's date and time system interact with date and time representations used in other programming languages such as Python, Java and others?
Comment 7 Albrecht Müller 2021-12-12 22:31:30 UTC
In short: The proposed changes to spreadsheet functions handle the "YEAR, MONTH, DAY, HOUR, MINUTE, SECOND" but not the "and other related" part.
Comment 8 Albrecht Müller 2021-12-15 13:17:39 UTC
I think the problem is definitely not fixed yet as - without stating this explicitly - the specification now uses two different conventions that contradict each other.

There is a truncating convention where the transition from one day to the next occurs exactly at midnight. This convention is used for example in 4.11.7 Basis (https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part4-formula/OpenDocument-v1.3-os-part4-formula.html#__RefHeading__1017922_715980110)

The other convention is the rounding convention where this transition occurs half a second earlier. The proposed definitions of the YEAR, MONTH, DAY, HOUR, MINUTE and SECOND functions use that convention.

So there is half a second each day where the conventions disagree to which day this time belongs. The bad thing is that the serial numbers that specify dates represent points in time that are exactly at boundaries of these disagreement regions.

There may be use cases where the serial numbers are the results of some calculations that include round off errors, e.g. because they use time values that don't have exact representations. Due to these round off errors up to 50% of the results may fall into the region where the conventions disagree.

Users may not be aware that they mix functions that use different conventions. Therefore they may see results that seem to be clearly wrong at random in up to 50% of the cases. Actually they observe a behaviour that follows from the specification. There is nothing an implementer of the functions can do about. The user has to use clumsy workarounds to get the expected results.

I tried to find out how OOXML (https://www.ecma-international.org/publications-and-standards/standards/ecma-376/) handles this rounding problem. As I was not not able to find relevant rules I think the OOXML specification left this point unspecified. This way it does not prescribe a behaviour that users may consider buggy.
Comment 9 Mike Kaganski 2021-12-15 13:35:19 UTC
(In reply to Albrecht Müller from comment #8)
> There is a truncating convention where the transition from one day to the
> next occurs exactly at midnight. This convention is used for example in
> 4.11.7 Basis
> (https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part4-formula/
> OpenDocument-v1.3-os-part4-formula.html#__RefHeading__1017922_715980110)
> 
> The other convention is the rounding convention where this transition occurs
> half a second earlier. The proposed definitions of the YEAR, MONTH, DAY,
> HOUR, MINUTE and SECOND functions use that convention.

You put it not as the standard puts it. Specifically, there is a possible ambiguity, but not a clear conflict.

Namely, the "truncate" at 4.11.7 is defined as:

> truncate(date) truncates any fractional (hours, minutes, seconds) of a date
> value and returns the whole date portion.

It is not defined in terms of floating-point number, but in terms of fractional part of a date, so the question remains, how the fractions of a date are calculated here, to be discarded.

Anyway, everything in comment 8 is unrelated to a specific problem that this issue is dedicated: inconsistency in the *related* set of functions. The interrelations with other functions, if documented, may be arbitrarily complex and still be acceptable. The change in standard is the step toward consistency in one specific part, and improvements must be done step by step. Trying to create noise in each and every issue which mentions "Calc" and "time", just because a fix to that issue does not make the world the ideal place, is not constructive.

But yes, this problem is definitely not fixed yet - as the change was only at standard level, not in LibreOffice code.
Comment 10 Albrecht Müller 2021-12-15 23:02:32 UTC
(In reply to Mike Kaganski from comment #9)
> … so the question remains, how the fractions of a
> date are calculated here, to be discarded.

You just discovered another bug in the specification: The sentence you refer to should have a clear meaning. By the way, take it together with 

> 4.3.3 Date
> Date is a subtype of Number.
> Date is represented by an integer value.

This means a date value by definition has no hours, minutes or seconds to discard. So what is the intended meaning of "truncate(date)" anyway?

What are *related* functions?

Assume T = 1999-12-31 23:59:59,600. LibreOffice Version 7.0.4.2 returns 1999 for YEAR(T) and 0 (i.e. no) for ISLEAPYEAR(T). What will happen after you implemented the proposed changes? YEAR(T) = 2000 and ISLEAPYEAR(T) = no?

Or Assume T1 = 2021-12-15 23:59:59,600 and T2 = 2021-12-16 00:00:00,000. 
LibreOffice 7.0.4.2 delivers the following results:

DAY(T1) = 15, DAY(T2) = 16, NETWORKDAYS(T1;T2) = 2 and NETWORKDAYS(T2;T2) = 1.

After the changes I would expect DAY(T1) = DAY(T2) = 16, but what NETWORKDAYS should return? The same results as before despite the fact that - according to the YEAR, MONTH, DAY … functions  - the arguments appear to be identical?

The headline of OFFICE-4094 contains the clause "… and other related" but I did not see a list that names these functions. That's why I looked for things that I think are related to the intended changes to get some starting point. Unfortunately the resulting list became pretty long.

I also thought that having names for different semantics of date and time values makes it easier to talk about the differences and their consequences, e.g. when you have to trade consistency in one part for inconsistencies somewhere else. One example is interoperability: I tried NETWORKDAYS on Exel97 and got the impression, that NETWORKDAYS uses truncating semantics, while the YEAR, MONTH … functions use rounding semantics. As I don't have a current version available I cannot check if Microsoft changed this behaviour later. If they did not, you have to choose if you want to have internal consistency or compatibility with Excel.


> Trying to create noise … is not constructive.
OK, with these remarks I conclude the noise.
Comment 11 Regina Henschel 2021-12-15 23:39:54 UTC
The argument for functions DATE, MONTH and YEAR (and some others) has type 'DateParam', see line 'Syntax:'. The type 'DateParam' is specified in section 4.11.3 in part 4. The type 'DateParam' is different from type 'Date'.
Comment 12 SheetJS 2022-07-18 21:12:34 UTC
There are two internally consistent ways to render date-time formats:

1) always round to the most granular date token

2) render to a standardized format and extract parts.  Given the support for milliseconds, this format would be "yyyy-mm-dd hh:mm:ss.000"

Right now, as described in https://bugs.documentfoundation.org/show_bug.cgi?id=150049 , LO is inconsistent.  Either interpretation ("rounding" / "truncation: respectively) is fine, but it should be consistent and it should be clearly documented (especially if the decision is to clash with Excel)


OOXML (Excel) definitely uses some sort of rounding.  There is a test case in 18.17.4.2:

> The serial date-time 0.0000115... represents 00:00:01

0.0000115 is 0.9936 / 86400, just shy of 1/86400 .  Truncation would imply "00:00:00" and rounding would imply "00:00:01".  The reasonable conclusion is that OOXML examples seems to imply rounding.


Needless to say, LO is clashing with Excel:

=TEXT(0.0000115, "hh:mm:ss")

Excel and OOXML both yield 00:00:01 while LO yields 00:00:00
Comment 13 Nenad Antic 2022-07-28 07:25:12 UTC
Created attachment 181467 [details]
Errors in date/time calculations

I was going to file a new bug for this error.  But it seems the error could be related to this bug report.

Please look in cells G11, G13 and G20. Compare to other cells in column 'G'.

I have noticed that it usually occurs when time is 8:00 or 14:00 in the cell with start time.

If the date part, YYYY-MM-DD, is removed in the start time cell (column 'D'), leaving only HH:MM, then calculation is correct in column 'G' for the offending cells.
Comment 14 Eike Rathke 2022-10-11 13:19:30 UTC
(In reply to SheetJS from comment #12)
> > The serial date-time 0.0000115... represents 00:00:01
> 
> 0.0000115 is 0.9936 / 86400, just shy of 1/86400 .  Truncation would imply
> "00:00:00" and rounding would imply "00:00:01".  The reasonable conclusion
> is that OOXML examples seems to imply rounding.
> 
> Needless to say, LO is clashing with Excel:
> =TEXT(0.0000115, "hh:mm:ss")
> Excel and OOXML both yield 00:00:01 while LO yields 00:00:00
Yes, and if Excel is rounding 0.999999930555556 (=86399.994/86400) which is 23:59:59.994 to 00:00:00 that's obviously wrong.
Comment 15 Eike Rathke 2022-10-11 13:51:42 UTC
(In reply to Nenad Antic from comment #13)
> Please look in cells G11, G13 and G20. Compare to other cells in column 'G'.
> 
> I have noticed that it usually occurs when time is 8:00 or 14:00 in the cell
> with start time.
(9:00 instead of 14:00?) Apart from that it is related to the underlying values being IEEE 754 double floating point values (2022-07-04 08:00:00 = 44746.3333333333 and 2022-07-04 16:00:00 = 44746.6666666667 and 2022-07-06 09:00:00 = 44748.375), which calculating with does not deliver exact decimal results, using the HOUR() and MINUTE() functions that both are *defined* to truncate the value on such calculations (see https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part4-formula/OpenDocument-v1.3-os-part4-formula.html#MINUTE ) is counter-productive. Instead of

> =IF(HOUR(E11-D11)+(MINUTE(E11-D11)/60)-(F11/60)=0,"",HOUR(E11-D11)+(MINUTE(E11-D11)/60)-(F11/60))

simply use

=IF(AND(ISBLANK(D11),ISBLANK(E11)),"",ROUND((E11-D11)*24,2))

to subtract start from end date+time and express the difference as hours of duration.

You'll also notice that most of your results will be unexpected because starting at row 14 date+time in column D and E are of different days and the result is a duration of 33 hours instead of 9 hours.
Comment 16 Nenad Antic 2022-10-11 14:33:41 UTC
(In reply to Eike Rathke from comment #15)

> simply use
> 
> =IF(AND(ISBLANK(D11),ISBLANK(E11)),"",ROUND((E11-D11)*24,2))
> 
> to subtract start from end date+time and express the difference as hours of
> duration.

Thank you for this tip. I'll try it.

> 
> You'll also notice that most of your results will be unexpected because
> starting at row 14 date+time in column D and E are of different days and the
> result is a duration of 33 hours instead of 9 hours.

This is strange. I have used it like this for years and it works. Please see my new attachment. If it's showing 33 hours at your end it seems like it's platform dependent.
Comment 17 Nenad Antic 2022-10-11 14:35:42 UTC
Created attachment 182973 [details]
Showing date calc results including between different dates

In response to comment 15.
Comment 18 Eike Rathke 2022-10-12 18:39:25 UTC
(In reply to Nenad Antic from comment #16)
> > You'll also notice that most of your results will be unexpected because
> > starting at row 14 date+time in column D and E are of different days and the
> > result is a duration of 33 hours instead of 9 hours.
> 
> This is strange. I have used it like this for years and it works. Please see
> my new attachment. If it's showing 33 hours at your end it seems like it's
> platform dependent.
No. It works in your document because using only HOUR() and MINUTE() of the difference ignores the date part. The time span between (E14) 2022-07-07 07:00:00 and (F14) 2022-07-08 16:00:00 is certainly not 9 hours but 33. Which becomes visible when using my formula to calculate. If the 9 hours is expected then the data is not correct (which in this case is likely).
Comment 19 Nenad Antic 2022-10-18 13:40:59 UTC
> If the 9 hours is expected then the data is not correct (which in this case is likely).

The 9 hours is expected. The reason for this is that the source data is coming from a calendar export, and the time duration over several days indicates the same hour interval on each day. That's why the formulas ignore the date parts.
Comment 20 Nenad Antic 2022-10-18 13:44:45 UTC
(In reply to Eike Rathke from comment #18)

> Which becomes visible when using my formula to calculate. If the 9 hours is
> expected then the data is not correct (which in this case is likely).

The 9 hours is expected. The reason for this is that the source data is coming from a calendar export, and the time duration over several days indicates the same hour interval on each day. That's why the formulas ignore the date parts.
Comment 21 Eike Rathke 2022-10-19 16:18:08 UTC
In that case use

=IF(AND(ISBLANK(D11);ISBLANK(E11));"";ROUND(MOD(E11-D11;1)*24;2))
Comment 22 SheetJS 2022-10-22 05:45:27 UTC
> Yes, and if Excel is rounding 0.999999930555556 (=86399.994/86400) which is
> 23:59:59.994 to 00:00:00 that's obviously wrong.

The lowest time unit is seconds, so Excel rounds to the nearest second.  The `hh` format is the time in hours, so when rounding you get `00`.  If you want to know the duration, Excel's format token is `[hh]` (hh wrapped in square brackets).

It would help to look at these formats in Excel:

```
=TEXT(0.999999930555556,"hh:mm:ss")     ## No msec, relative time -> "00:00:00"
=TEXT(0.999999930555556,"[hh]:mm:ss")   ## No msec, absolute time -> "24:00:00"
=TEXT(0.999999930555556,"hh:mm:ss.000") ## round to milliseconds ->  "23:59:59.994"
```

This is internally consistent rounding logic, specifically "1) always round to the most granular date token"

.

LibreOffice, by contrast, appears to be inconsistent.  Testing the same examples:

```
=TEXT(0.999999930555556,"hh:mm:ss")     ## 23:59:59     (truncate)
=TEXT(0.999999930555556,"[hh]:mm:ss")   ## 24:00:00     (round to  sec)
=TEXT(0.999999930555556,"hh:mm:ss.000") ## 23:59:59.994 (round to msec)
```
Comment 23 Nenad Antic 2022-10-22 14:37:52 UTC
(In reply to Eike Rathke from comment #21)
> In that case use
> 
> =IF(AND(ISBLANK(D11);ISBLANK(E11));"";ROUND(MOD(E11-D11;1)*24;2))

Fantastic! Thanks! Solves my problem.
Comment 24 Eike Rathke 2022-10-24 12:46:44 UTC
(In reply to SheetJS from comment #22)
> LibreOffice, by contrast, appears to be inconsistent.  Testing the same
> examples:
> 
> =TEXT(0.999999930555556,"hh:mm:ss")     ## 23:59:59     (truncate)
> =TEXT(0.999999930555556,"[hh]:mm:ss")   ## 24:00:00     (round to  sec)
> =TEXT(0.999999930555556,"hh:mm:ss.000") ## 23:59:59.994 (round to msec)

Au contraire, that's very consistent. "[hh]:mm:ss" is a duration format that rounds, "hh:mm:ss" and "hh:mm:ss.000" are clock formats that don't round to sec/min/hour. Your clock doesn't display 00:00:00 if it's 23:59:59.9

However, what is still inconsistent is that
=TEXT(44858.9999999306;"yyyy-mm-dd hh:mm:ss")
results in  2022-10-25 00:00:00  instead of  2022-10-24 23:59:59
whereas
=TEXT(44858.9999999306;"yyyy-mm-dd hh:mm:ss.000")
correctly results in  2022-10-24 23:59:59.994
Comment 25 Commit Notification 2022-10-24 15:08:35 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5ce6de864380f1eabbd78656ff6cc31920c534d2

Related: tdf#136615 Do not round a DateTime clock format into the next day

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 26 SheetJS 2022-10-24 18:12:51 UTC
> Au contraire, that's very consistent. "[hh]:mm:ss" is a duration format that
> rounds, "hh:mm:ss" and "hh:mm:ss.000" are clock formats that don't round to
> sec/min/hour. Your clock doesn't display 00:00:00 if it's 23:59:59.9

If the stated interpretation is correct, the inconsistency is that "hh:mm:ss" and "hh:mm:ss.000" currently round.  The example was discussed in issue https://bugs.documentfoundation.org/show_bug.cgi?id=150049 :

```
TEXT(0.00069, "hh:mm:ss.000") ## 00:00:59.616
TEXT(0.00069, "hh:mm:ss.00")  ## 00:00:59.62  rounding?
```

"Clock formats that don't round" suggests that the results should be:

"hh:mm:ss"     => 00:00:59
"hh:mm:ss.0"   => 00:00:59.6
"hh:mm:ss.00"  => 00:00:59.61
"hh:mm:ss.000" => 00:00:59.616
Comment 27 Regina Henschel 2023-07-30 15:23:11 UTC
Hi Eike, you have worked on it. Do the results in LibreOffice match the definitions in upcoming ODF 1.4?

The draft of the upcoming ODF 1.4 is in
https://github.com/oasis-tcs/odf-tc/tree/master/docs/odf1.4/part4-formula

Here an extract:
Second = MOD(ROUND(T * 86400) ; 60)
Minute = (MOD(ROUND(T * 86400) ; 3600) - Second) / 60
Hour = (MOD(ROUND(T * 86400) ; 86400) - Minute * 60 - Second) / 3600
DAY( DateParam D ); Returns the day portion of D after first rounding (using ROUND()) to the nearest second.
MONTH( DateParam D ); Returns the month portion of D after first rounding (using ROUND()) to the nearest second.
YEAR( DateParam D ); Returns the year portion of D after first rounding (using ROUND()) to the nearest second.
Comment 28 Eike Rathke 2023-07-31 09:27:34 UTC
No, and to me the definition makes no sense, as it rounds any day's 23:59:59.6 into the next day's 00:00:00. Yes that is what Excel does, and yes it's what people expect _for that reason_, but yes, it's wrong.
Comment 29 Albrecht Müller 2023-07-31 15:29:21 UTC
@Regina Henschel:
You know the Open Document Format Specification better than I do: Does it define the semantics of format strings, especially those related to date and time? Does it specify how the interpretation of these strings is related to the specification of the date and time functions you mention in Comment #27?

I fear that these things are not specified and that this is a big hole in the specification: Compliant applications are free to implement different and incompatible interpretations of format strings. The discussion shows that there are many ways how to do this. In an environment where working time is paid for it may be cheaper to pay license fees to Microsoft than to pay for the time necessary to handle the undocumented subtle but nevertheless important differences between these ways.

@Eike Rathke:
Rounding any day's 23:59:59.6 into the next day's 00:00:00 is related to another thing: Adding exactly one minute to another exact minute should always give two minutes – not sometimes one minute and sometimes two minutes. You may want to have a look at the document “Trying to clarify clock vs duration time, version 0.0.1” (see Bug 127170, attachment 165304 [details]) to read about the mathematical background. I think there is simply a design decision between having the change of day exactly at midnight and getting elementary arithmetic right; you cannot have both. If I had a wall clock showing time units in the order of magnitude of round off errors I would not be able to read the values anyway. Therefore I would prefer correct elementary arithmetic on hours, minutes and seconds. On the other hand I understand that there are good reasons do put the day change exactly at midnight: There are a lot of functions other than SECOND, MINUTE etc. that probably rely on a day change exactly at midnight. So there is the ugly situation that LibreOffice uses two different interpretations of time values and that this is an undocumented feature.
Comment 30 Regina Henschel 2023-08-02 12:01:38 UTC
Created attachment 188724 [details]
Example with decimal seconds

@Albrecht Müller:  The standard does not use 'format strings' in this context but provides several elements and attributes. You find these searching for prefix 'number:' in the standard. In regard to 'format string', such are used in the TEXT function. The meaning is implementation-defined. The definition for LibreOffice is in https://help.libreoffice.org/latest/en-US/text/shared/01/05020301.html. However, the formatting of the functions result is not relevant here. 

@Eike: The attachment lists the results from several application. These results show, that a standardization is indeed needed. Since ODF 1.4 has the 'round' variant, LibreOffice needs to provide a function according this standard. Eike, do you have an idea how we can do it? The problem I see is, that existing documents would have a different result in ODF 1.4 than in ODF 1.3, if we simply change the algorithms. Background: I have started writing tender proposals for transition to ODF 1.4 for budget 2024.
Comment 31 Eike Rathke 2023-08-02 17:57:09 UTC
(In reply to Albrecht Müller from comment #29)
> @Eike Rathke:
> Rounding any day's 23:59:59.6 into the next day's 00:00:00 is related to
> another thing: Adding exactly one minute to another exact minute should
> always give two minutes – not sometimes one minute and sometimes two
> minutes.
I think you are mixing up things. Or at least the context and conclusion/example doesn't make sense to me.

> You may want to have a look at the document “Trying to clarify
> clock vs duration time, version 0.0.1” (see Bug 127170, attachment 165304 [details]
> [details]) to read about the mathematical background.

Repeating my latest comment from that bug 127170:
Note that the [there] attached sample document already latest since LO 6.2.8 (probably 6.2.5 for which bug 125099 was fixed) when recalculated produces the expected results even with the HH:MM:SS format and in column D has 0.0 differences. The upcoming 24.2 version even further refines the calculation of (date+)time differences aka duration to eliminate a possible slight accuracy error due to IEEE 754 double floating point values.


(In reply to Regina Henschel from comment #30)
> @Eike: The attachment lists the results from several application. These
> results show, that a standardization is indeed needed. Since ODF 1.4 has the
> 'round' variant, LibreOffice needs to provide a function according this
> standard.
As mentioned elsewhere, I consider that rounding specification *WRONG* and advise *AGAINST* it. Some major implementations (Excel, Gnumeric, Google) doing it wrong to be compatible with Excel doesn't make it right.

> Eike, do you have an idea how we can do it?
No. You can have either or, not both. Or implement the bad specification and introduce another set of functions like CLOCK.SECOND() and so on to obtain the unrounded matching wall clock values.

> The problem I see is,
> that existing documents would have a different result in ODF 1.4 than in ODF
> 1.3, if we simply change the algorithms.
Yes, that may happen.
Comment 32 Albrecht Müller 2023-08-03 11:25:19 UTC
(In reply to Eike Rathke from comment #31)
> (In reply to Albrecht Müller from comment #29)
> > @Eike Rathke:
> > Rounding any day's 23:59:59.6 into the next day's 00:00:00 is related to
> > another thing: Adding exactly one minute to another exact minute should
> > always give two minutes – not sometimes one minute and sometimes two
> > minutes.
> I think you are mixing up things. Or at least the context and
> conclusion/example doesn't make sense to me.
> 

So let me explain this a little bit. I guess – please tell me if I am wrong – that one of the reasons why you don’t like Microsoft's approach to date and time arithmetic is that you think that a day change should occur exactly at midnight. If you want to have this property you have to use a round down strategy. Unfortunately there is no exact floating point representation for common time units shorter than a day such as hours, minutes, and seconds. For the sake of argument lets assume that the internal representation of 1 Minute is 1.1. Adding another minute to this gives 2.2. What the user sees if a round down strategy is used is 1 + 1 = 2. Fine. But lets assume that the internal representation is 0.9. So adding two minutes together gives 1.8. What the user sees now is 0 + 0 = 1. Not so good.

The same thing using a rounding to the nearest strategy: No matter if the actual calculation is 1.1 + 1.1 = 2.2 or 0.9 + 0.9 = 1.8, in both cases the equation that shows the rounded values is 1 + 1 = 2. The downside of this rounding strategy is that values before midnight get rounded to the next day. Thus the day change is not exactly at midnight.

I fear that you have invested a lot of time optimizing the use of the rounding to the nearest strategy built into the floating point arithmetic. You may have been able to eliminate the problem for some classes of calculations. But I assume that this work will not solve the underlying fundamental problem and therefore it will reappear if you use different examples, e.g. examples that contain multiplies of inexact time values.
Comment 33 Albrecht Müller 2023-08-03 13:57:53 UTC
(In reply to Regina Henschel from comment #30)
> Created attachment 188724 [details]
 
> @Albrecht Müller:  The definition for LibreOffice is in
> https://help.libreoffice.org/latest/en-US/text/shared/01/05020301.html.
> However, the formatting of the functions result is not relevant here. 

Thank you for the link - I already knew the German version of it. I was not able to find there any information about the distinction between wall clock time and durations. If I understand Eike Rathke's concept correctly a [HH]:MM:SS format should use a kind of rounding that is different from the rounding used for HH:MM:SS formats. I did not find any information about this.

I think formatting could play a key role in the specification. A user may probably expect that formatting and the second, minute functions etc. are consistent. This expectation could be something like:

If the format "YYYY-MM-DD HH:MM:SS" formats some time value T to a string of the form "<Year>-<Month>-<Day> <Hour>:<Minute>:<Second>"
then the integer value represented by the parts <Year>, <Month> etc.
should always be the values returned by the corresponding functions for the argument T. This way the specification of these function follows as a corollary from the specification of the formatting. Its just a special case.

You can put another example in your zoo: LibreOffice itself offers an additional and on my machine incompatible implementation of the Second, Minute etc. functions. The subroutine below is intended to produce the results for the values you use the attachment.

Sub [Second, Minute etc.]()
   On Local Error Goto Errorhandler
   Dim V1 As Double
   Dim V2 As Double
   Dim V3 As Double
   Dim S1
   Dim S2
   Dim S3
   V1 = 45138
   V2 = 45138.999996
   V3 = 45138.999994
   S1 = Array("2023-07-31", "00:00:00,000")
   S2 = Array("2023-07-31", "23:59:59,654")
   S3 = Array("2023-07-31", "23:59:59,482")
   Dim D1 As Double
   Dim D2 As Double
   Dim D3 As Double
   D1 = DateValue(S1(Lbound(S1))) + TimeValue(S1(Ubound(S1))) 
   D2 = DateValue(S2(Lbound(S2))) + TimeValue(S2(Ubound(S2))) 
   D3 = DateValue(S3(Lbound(S3))) + TimeValue(S3(Ubound(S3))) 
   MsgBox(_
      V1 & "; " & Format(V1, "YYYY-MM-DD HH:MM:SS.000") & _
         "; Y: " & year(v1) & "; M: " & month(v1) & "; D: " & day(v1) _
         & "; H: " & hour(v1) & "; M: " & minute(v1) & "; S: " _
         & second(v1) & Chr$(13) _
      & V2 & "; " & Format(V2, "YYYY-MM-DD HH:MM:SS.000") & _
         "; Y: " & year(V2) & "; M: " & month(V2) & "; D: " & day(V2) _
         & "; H: " & hour(V2) & "; M: " & minute(V2) & "; S: " _
         & second(V2) & Chr$(13) _
      & V3 & "; " & Format(V3, "YYYY-MM-DD HH:MM:SS.000") & _
         "; Y: " & year(V3) & "; M: " & month(V3) & "; D: " & day(V3) _
         & "; H: " & hour(V3) & "; M: " & minute(V3) & "; S: " _
         & second(V3) & Chr$(13) _
      & D1 & "; " & S1 (Lbound(S1)) & " " & S1 (Ubound(S1)) & _
         "; Y: " & year(D1) & "; M: " & month(D1) & "; D: " & day(D1) _
         & "; H: " & hour(D1) & "; M: " & minute(D1) & "; S: " _
         & second(D1) & Chr$(13) _
      & D2 & "; " & S2 (Lbound(S2)) & " " & S2 (Ubound(S2)) & _
         "; Y: " & year(D2) & "; M: " & month(D2) & "; D: " & day(D2) _
         & "; H: " & hour(D2) & "; M: " & minute(D2) & "; S: " _
         & second(D2) & Chr$(13) _
      & D3 & "; " & S3 (Lbound(S3)) & " " & S3 (Ubound(S3)) & _
         "; Y: " & year(D3) & "; M: " & month(D3) & "; D: " & day(D3) _
         & "; H: " & hour(D3) & "; M: " & minute(D3) & "; S: " _
         & second(D3) & Chr$(13) _
   )   
   Exit Sub
Errorhandler:
   MsgBox "Fehler " & Err & ": " & Error$ & chr$(13) _
   & "in Subroutine [Second, Minute etc.](). Zeile: " & Erl _
   & chr$(13) & now
End Sub
Comment 34 Eike Rathke 2023-08-03 15:33:46 UTC
(In reply to Albrecht Müller from comment #32)
> For the sake of argument lets assume that the internal representation of 1
> Minute is 1.1. Adding another minute to this gives 2.2. What the user sees
> if a round down strategy is used is 1 + 1 = 2. Fine. But lets assume that
> the internal representation is 0.9. So adding two minutes together gives
> 1.8. What the user sees now is 0 + 0 = 1. Not so good.
You may assume whatever, but that is just not happening.

> I fear that you have invested a lot of time optimizing the use of the
> rounding to the nearest strategy built into the floating point arithmetic.
> You may have been able to eliminate the problem for some classes of
> calculations. But I assume that this work will not solve the underlying
> fundamental problem and therefore it will reappear if you use different
> examples, e.g. examples that contain multiplies of inexact time values.
Yes, summing multiple inaccurate values with SUM() will produce greater inaccuracies, though the Kahan summation algorithm eliminates quite a portion already. Note also that operator+/operator- with two operands now (for upcoming 24.2) does things differently and detects if (date+)time is involved, and if so uses a duration class that eliminates an accuracy error in the nanoseconds range to land on an exact second if indicated, which when converted back to a floating point value delivers a better result.
Comment 35 Eike Rathke 2023-08-03 16:35:38 UTC
Anyway, the Excel design of using floating point values to express date+time is fundamentally wrong, and with that inaccuracies are inevitable. Rounding in one or the other direction will produce wrong values for some cases no matter what and is only cosmetic to calm the user. The real solution would be to get rid of that and use a proper DateTime type instead and convert to floating point only when needed, e.g. when exporting to Excel file formats or calculations require it or if a floating point display is requested.
Comment 36 Albrecht Müller 2023-09-13 14:27:17 UTC Comment hidden (off-topic)
Comment 37 Albrecht Müller 2023-09-19 14:39:45 UTC
(In reply to Eike Rathke from comment #31)
> 
> (In reply to Regina Henschel from comment #30)
> > @Eike: The attachment lists the results from several application. These
> > results show, that a standardization is indeed needed. Since ODF 1.4 has the
> > 'round' variant, LibreOffice needs to provide a function according this
> > standard.
> As mentioned elsewhere, I consider that rounding specification *WRONG* and
> advise *AGAINST* it. Some major implementations (Excel, Gnumeric, Google)
> doing it wrong to be compatible with Excel doesn't make it right.
> 
> > Eike, do you have an idea how we can do it?
> No. You can have either or, not both. Or implement the bad specification and
> introduce another set of functions like CLOCK.SECOND() and so on to obtain
> the unrounded matching wall clock values.
> 
> > The problem I see is,
> > that existing documents would have a different result in ODF 1.4 than in ODF
> > 1.3, if we simply change the algorithms.
> Yes, that may happen.

This may not be as bad as it seems. Implementers not related to LibreOffice may have noticed that the algorithm given in the specification of the SECOND function returns 60 if the argument of the function happens to fall in the last half second of a minute but the specification also requires that the return values are in the range of 0 to 59. Thus it is impossible to implement the SECOND function in the way it is specified. They may also have noticed that the round down method used in the specification of the MINUTE and other functions have the unpleasant properties which I tried to explain in comment #32 and which surfaced in Bug 127334 (which in my opinion is not fixed – using an empirically determined unsharpness to silence an alarm introduces just another incompatible change but does not restore compatibility). Thus they probably decided to ignore this part of the specification altogether and to implement these functions in the Excel way. As I understand it the new specification tries to describe how this Excel way works. Therefore nothing will change for the users of these implementations. The only change is that these implementations will become conformant to the new specification.

The situation is different for users of LibreOffice: Several versions of LibreOffice that produce different results already exist. Maybe the users will appreciate it that there will be a hopefully last change towards a de facto industry standard that seems to have been stable for more than a quarter of a century.

I think there is a deficiency that is common to both the old and the new specification: They both are like an uncommented piece of code as they specify what has to be done but not why. 

An important “why” is compatibility with Excel. Calc users should be able to easily exchange calculations with Excel users. This compatibility also preserves the value of the knowledge about the strengths and weaknesses of Excels method when another software is used. There are good reasons why Excels way of doing this kind of calculations is so popular. It is a simple and clever compromise that covers the requirements of a lot of use cases:

Time is often seen as a continuous physical quantity which has a wide dynamic range. The time the light needs to travel a distance of the size of an atom is quite short. The age of the universe is an example of a much longer timespan. Given a suitable base unit IEEE 754 double values can represent both time values. IEEE 754 double arithmetic also provides the mathematical operations needed for calculations with time in physical context.

On the other hand, date and time calculations use the idea of a discrete time where time is measured by counting equidistant clock ticks. This allows the use of exact integer arithmetic which avoids the round off errors of floating point arithmetic. The dynamic range is much smaller. Times shorter than a second or longer than a few 1000 years are rarely used in this context.

Excel seems to round to the nearest integer multiple of some base unit (usually a second, sometimes microseconds, maybe other units) when translating from continuous to discrete time. I think this rounding method has an interesting property: As long as the round off errors are somewhat less than half of the base unit it does not matter if you use inexact floating point calculations and round the result or if you round the arguments first and use exact integer arithmetic afterwards. In both cases the result will be the same. Due to the precision of double values round off errors that exceed half of a second almost never occur in everyday calculations. Thus users don’t have to care about round off errors and can use floating point calculations as if floating point arithmetic would provide exact calculations with hours, minutes and seconds.

Excel assumes that a day always has 24 hours, an hours always has 60 minutes and a minute 60 seconds. This assumption corresponds to a common perception, makes a lot of calculations with date and time quite simple and works for many use cases where leap seconds or switching between daylight saving and standard time does not matter. I think that using the same rounding to the nearest strategy it would be possible to build date and time calculations that supports daylight saving time and leap seconds. However, if you do so date and time calculations will become much more complicated as you have to treat days, hours and minutes as quantities of indeterminate length in a similar way as you have to treat months or years. The gain of this kind of date and time arithmetic is probably not worth the trouble.

The specification should also point out that it uses at least one incompatible other method to translate from continuous time to discrete time. Consider for example “Procedure A” for the calculation of the “Basis” that is used in financial calculations and which is specified in https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part4-formula/OpenDocument-v1.3-os-part4-formula.html#Basis-ProcedureA. Here date values are truncated, i.e. rounded down to the start of a day. The authors of this specification were probably not aware that date values may be a few microseconds less than the exact start of the intended day. This can happen when the values are the result of some floating point calculations that contain round off errors. In this case the truncation operation drops a complete day due to this tiny error. A naive user may wonder why financial functions can return different values for the (in the sense of YEAR, MONTH and DAY functions according to the new specification) same arguments.
Comment 38 Regina Henschel 2023-09-19 19:10:50 UTC
There is a new proposal in https://issues.oasis-open.org/browse/OFFICE-4094
It adresses the 60sec problem by removing "round" and it introduces a new parameter, that distinguishes betwenn "truncating the input value" and "rounding the input value in the way Excel does it".
Comment 39 Eike Rathke 2023-09-20 12:02:36 UTC
(In reply to Regina Henschel from comment #38)

That's horrible. It guarantees that if the parameter was used, interoperability will fail with *ALL* implementations that don't implement it, i.e. *ALL* current implementations.

Furthermore, for SECOND() with [ ; Logical Truncate = TRUE ] importing existing documents (Excel or not, any) without that argument change the behaviour, and exporting without will change behaviour in implementations not knowing the parameter, i.e. probably all currently existing implementations (because they implemented the sick Excel behaviour mandated by the current ODFF specification). But that (changing behaviour) is the same situation as it would be with the following simple redefinition:

I suggest to rather only redefine SECOND() to always truncate and keep the already existing (truncating) definition for all other functions touched in that proposal, so values are never unexpectedly rounded into the next higher unit, worst case the year after.
Comment 40 Regina Henschel 2023-09-20 18:24:56 UTC
(In reply to Eike Rathke from comment #39)
> (In reply to Regina Henschel from comment #38)
> 
> That's horrible. It guarantees that if the parameter was used,
> interoperability will fail with *ALL* implementations that don't implement
> it, i.e. *ALL* current implementations.

> Furthermore, for SECOND() with [ ; Logical Truncate = TRUE ] importing
> existing documents (Excel or not, any) without that argument change the
> behaviour, and exporting without will change behaviour in implementations
> not knowing the parameter, i.e. probably all currently existing
> implementations (because they implemented the sick Excel behaviour mandated
> by the current ODFF specification).

I do not see that. If the parameter is missing, the functions behave as specified in ODF 1.3. Only the contradiction in ODF 1.3 is removed, that ODF 1.3 on one hand specifies to round in the SECOND function resulting in range 0 to 60, and on the other hand specifies a range of 0 to 59 for the SECOND function. The other functions all truncate in ODF 1.3. So applications having implemented ODF 1.3 should produce the same results as in ODF 1.3, when they get an ODF 1.4 document without parameter. I think, that is a good backward compatibility.


 But that (changing behaviour) is the
> same situation as it would be with the following simple redefinition:
> 
> I suggest to rather only redefine SECOND() to always truncate and keep the
> already existing (truncating) definition for all other functions touched in
> that proposal, so values are never unexpectedly rounded into the next higher
> unit, worst case the year after.

How would you then solve the request to have functions, that behave like Excel?
Comment 41 Albrecht Müller 2023-09-23 11:17:03 UTC
(In reply to Regina Henschel from comment #40)
> ... 
> How would you then solve the request to have functions, that behave like
> Excel?

Due to the way the ODF specification is written this problem is not so easy to solve as it looks. It may help trying to analyse it thoroughly before taking action. Eike probably is against the rounding specification because he knows that this will make the YEAR, MONTH, DAY etc. functions inconsistent with the specifications of many other functions that somehow do date and time calculations. On the other hand this implies that all these functions may show pseudo random results caused by truncation. For an example see the last paragraph of my comment #37. As a consequence all date and time related functions need an overhaul… Just my 2 ¢.
Comment 42 Albrecht Müller 2023-09-23 14:05:45 UTC
(In reply to Eike Rathke from comment #39)
...
> I suggest to rather only redefine SECOND() to always truncate and keep the
> already existing (truncating) definition for all other functions touched in
> that proposal, so values are never unexpectedly rounded into the next higher
> unit, worst case the year after.

So if F is one of the YEAR, MONTH, DAY, HOUR, MINUTE, SECOND functions then in general the result of Microsoft’s version of F applied to some time value t would be the same as the result of ODF’s version of F applied to (t + ½ second)?
Comment 43 Eike Rathke 2023-09-25 12:40:35 UTC
I give up.
Specify *ALL* those functions to truncate to return *correct* values (i.e. change only the definition of SECOND() to Second = INT(MinuteFraction * 60)) and be incompatible with Excel, or specify *ALL* functions to return the sick Excel rounded values for interoperability.

But please do not introduce optional parameters to functions that are established for decades and would cause pain for all interoperability handling.

(In reply to Regina Henschel from comment #40)
> I do not see that. If the parameter is missing, the functions behave as
> specified in ODF 1.3. Only the contradiction in ODF 1.3 is removed, that ODF
> 1.3 on one hand specifies to round in the SECOND function resulting in range
> 0 to 60, and on the other hand specifies a range of 0 to 59 for the SECOND
> function. The other functions all truncate in ODF 1.3. So applications
> having implemented ODF 1.3 should produce the same results as in ODF 1.3,
> when they get an ODF 1.4 document without parameter. I think, that is a good
> backward compatibility.
While that is true in itself, at least two other major implementations, Excel and Gnumeric, don't implement the truncating ODF specification. And certainly Excel *never* would change its implementation just because ODF specified something different. So an interoperability problem already exists and for date+time 2023-12-31T23:59:59.6 their Excel behaviour is

YEAR: 2024
MONTH: 1
DAY: 1
HOUR: 0
MINUTE: 0
SECOND: 0

Obviously wrong but if they insist..


> How would you then solve the request to have functions, that behave like
> Excel?
Probably best would be to define the existing functions to rounding Excel behaviour for interoperability, and introduce new truncating functions YEAR.CLOCK(), MONTH.CLOCK(), ... (or whatever term might be appropriate instead of CLOCK).

Fwiw, Calligra Sheets does something very odd and somehow imports that fixed date+time value with the value of NOW() instead (and displays an empty cell, but in the input line displays the original value), but entered manually it truncates YEAR,MONTH,DAY (as specified by ODFF) and rounds HOUR,MINUTE,SECOND; so that's out of band anyway.
Comment 44 Eike Rathke 2023-09-25 12:54:02 UTC
(In reply to Albrecht Müller from comment #42)
> So if F is one of the YEAR, MONTH, DAY, HOUR, MINUTE, SECOND functions then
> in general the result of Microsoft’s version of F applied to some time value
> t would be the same as the result of ODF’s version of F applied to (t + ½
> second)?
Yes. (except SECOND() that per ODFF is already rounded).
Comment 45 Albrecht Müller 2023-09-25 15:33:22 UTC
(In reply to Eike Rathke from comment #44)
> (In reply to Albrecht Müller from comment #42)
> > So if F is one of the YEAR, MONTH, DAY, HOUR, MINUTE, SECOND functions then
> > in general the result of Microsoft’s version of F applied to some time value
> > t would be the same as the result of ODF’s version of F applied to (t + ½
> > second)?
> Yes. (except SECOND() that per ODFF is already rounded).

Just to be sure: If the SECOND function is implemented to always truncate as you proposed in comment #39 then this relation would also hold for the SECOND function. Right? I want to understand this clearly because I think that this half second is related to your comment #35.
Comment 46 Eike Rathke 2023-09-25 17:09:45 UTC
(In reply to Albrecht Müller from comment #45)
> Just to be sure: If the SECOND function is implemented to always truncate as
> you proposed in comment #39 then this relation would also hold for the
> SECOND function. Right?
Right.

> I want to understand this clearly because I think
> that this half second is related to your comment #35.
? I don't think so. My comment 35 is about floating point being an inadequate data type for date+time, as many if not most values (even in the seconds range), or rather all values that are not multiples of denominators that are a power of 2 (x/2, x/4, x/8, ...), can not be represented as distinct accurate binary floating point values, hence calculating with these values is even less accurate and accumulates round-off errors as with any floating-point calculation. Any final rounding is just an inaccurate workaround to calm the user and pretend that results would be accurate numbers, which they aren't.
Read material linked under https://erack.de/bookmarks/D.html#010203

The relation of the half second you mentioned is simply
truncate(x+0.5) == round(x)
Comment 47 Albrecht Müller 2023-09-26 09:33:53 UTC
(In reply to Eike Rathke from comment #46)
> ...
> > I want to understand this clearly because I think
> > that this half second is related to your comment #35.
> ? I don't think so. 
> ...

In your comment 35 I found:

> The real solution would be to get rid of that and use a proper DateTime
> type instead and convert to floating point only when needed, e.g. when
> exporting to Excel file formats or calculations require it or if a
> floating point display is requested.

I wondered what the purpose of using a “proper DateTime type” could be. I know that you think Microsoft’s way of doing date and time calculations is wrong and that you introduced a distinction between wall clock time and durations. Thus I thought that this type should fix some of the limitations of Microsoft’s date and time arithmetic, e.g. allow the handling of the transition between daylight saving time and standard time, support durations in the sense of ISO 8601 (see https://en.wikipedia.org/wiki/ISO_8601#Durations) etc. This was the motivation behind my comment 36 which you flagged as “off-topic”. My guess about your intentions was obviously wrong.

So I assume now that the concept behind your DateTime type is essentially the same as the idea of discrete time which I tried to explain in comment 37. The common point is the use of an arithmetic that allows to do exact calculations. The difference is that Microsoft always converts between floating point and the discrete time model. As a consequence you will never see Microsoft’s discrete time model – all you see are floating point values. The discrete time model is used “under the hood” only.

The critical point is how to convert between floating point values and your DateTime type or Microsoft’s discrete time model. It may be surprising that this conversion can be done quite reliably. The key to understand this is that the round off errors are usually pretty small and you probably will never see situations where these errors are equal to or greater than ½ second. If the round off errors stay below ½ second then rounding to the nearest second always finds the correct exact second.
Comment 48 Albrecht Müller 2023-09-30 12:12:54 UTC
(In reply to Regina Henschel from comment #40)
> ... 
> How would you then solve the request to have functions, that behave like
> Excel?

I fear that the answer to your question is: 

It is not possible to fix this problem just by adding a new parameter to some functions.

It is in the nature of the problem that you cannot fix it without changing the definition of some functions in a way that they have to return values that are different of those required by the current specification.

As I see it a solution requires to rewrite substantial parts of the specification that deal with date and time calculations. I don’t know if this is feasible. Here are some of my ideas:

The use of algorithms or datatypes for specification purposes should be avoided whenever possible. The reason is that algorithms and datatypes contain the wrong information. They describe one way how the output values can be calculated from the input values. This is an information that is important for implementations but not for the specification. The specification should state how the input values are related to the output values, what properties this relation has, design decisions etc.

The problems with the SECOND function illustrate what can go wrong when algorithms are used for specification purposes. Even if you fully understand how the algorithm works you don’t know for what purpose it rounds its input argument or why it should truncate it according Eike’s proposal. Algorithms may also specify unintended properties such as the return value of 60 in the case of the SECOND function, or the pseudo random results of the MINUTE function.

I think there is a need for a section that explains the general principles that are valid for (almost) _all_ date and time calculations within this specification including for example financial functions. The specifications of individual functions should refer to this section. I hope that this way all date and time related functions show a consistent behaviour and that the specifications get simpler. The specification of the SECOND function could simply state that this function returns the number of a second within a minute as defined is this section.

Here are some ideas about what should go into this section:

(1) Date and time values are represented by IEE 754 binary64 (i.e. 64 bit double) values.

(2) The number 1 represents one day. A day has 24 hours of equal length, an hour has 60 minutes and a minute 60 seconds. The units within a day are counted from 0 to 23 or from 0 to 59 respectively.

(3) Dates are represented by counting the days between some fixed start date (usually 1899-12-30, for compatibility reasons other dates may be used) and the represented date. Positive numbers represent dates after the start date. The Gregorian calender is used as in ISO 8601.

(4) Unless there are sound and documented reasons to do otherwise any calculation that deals with time quantities such as seconds, minutes, hours, days, weeks, month, years etc. uses an exact arithmetic that is based on integer multiples of seconds.

(5) Due to (1) and (4) there is a need to convert between the floating point representation and exact representations. This conversion is done by either rounding floating point values to the nearest exact second or by choosing the floating point value that has the least distance to the value of the exact second. Therefore date and time related functions round their arguments to the nearest second unless there are sound and documented reasons to do otherwise.

(6) A suitable range of date and time calculations should be specified. Maybe 1.1.1583 to 31.12.9999 would be ok.


The section should also give the reasons and limitations that are connected with these principles.

ad (1) Reason: You may get quite different results if you use a floating-point arithmetic of different precision. In order to get consistent results between different implementations it should be specified which arithmetic to use. 
Limitation: This excludes the full support of durations in the sense of ISO 8601 (https://en.wikipedia.org/wiki/ISO_8601#Durations) as you cannot represent time quantities of undefined length (e.g. month) using floating-point numbers alone.

ad (2) Reason: This corresponds to a common perception and allows easy calculations. 
Limitations: This design decision excludes date and time system that that have days, hours, minutes or seconds of varying length. Examples are systems that support switching between daylight saving time and standard time, contain leap seconds, or start days at sunset etc.

ad (4) Reason: As Eike pointed out many times floating point numbers are not suited for calculations with date and time. One of the reasons is that there is an inherent semantic ambiguity due to round off errors. Assume there is floating point value corresponding to 23:59:59.999. Does this value belong to the current day or does this represent 00:00:00.000 of the next day? This value could be the measurement of a clock with a millisecond resolution. Thus is value belongs to the current day. This value could also be the result of some calculation that due to round off errors is a millisecond less than the correct result of 00:00:00.000 of the next day. Therefore it belongs to the next day. For similar reasons value 00:00:00.000 may belong to the current or the previous day. I see no way how some algorithm could distinguish between these situations. That’s why an arithmetic should be used that is not disturbed by round off errors. This kind of arithmetic allows to truncate e.g. hours and minutes safely, i.e. without showing the pseudo random behaviour that results from applying truncation on raw floating point values.

ad (5) Reason: Provided that the round off errors stay somewhat below ½ second this method guarantees that round trip conversion from exact to float to exact yields the original exact value.
There may be certain functions that must work differently. If formatting some floating point value to a format that contains fractional seconds is required it makes no sense to round the input to the nearest second. In these cases the specification should give the reasons why the principles are violated and explain the consequences. In the formatting example one effect is that the year contained in the result of the formatting may differ from the result of the YEAR function. This corresponds to an effect seen in ordinary floating point representations. The most significant digit may depend on the rounding applied to the least significant digit: 9.99 becomes 10.0 if only one digit is used after the decimal separator.
Limitations: Rounding to the nearest second means that a new day will start at 23:59:59.5 which is somewhat counter-intuitive.

ad (6) Reason: Suitable limits are necessary as the time resolution of floating point values may vary by order of magnitudes. This depends e.g. from the distance of some date and time value to the start date according to (3). Some assumptions used in the design of the date and time calculations may not hold when the values become too large, e.g. you cannot round to the nearest second because the next floating point value is more than ½ second away. The lower limit is the first full year after the Gregorian reform to avoid complications caused by the transition between the Julian and Gregorian calendar. The upper limit is chosen as ISO 8601 uses 4 digit years. I think (but did not check this) that the precision according of the floating point arithmetic is enough to support this range.
Comment 49 Stéphane Guillou (stragu) 2024-01-29 04:51:07 UTC
*** Bug 159421 has been marked as a duplicate of this bug. ***