Bug 92503 - FILEOPEN: CSV - Certain date fields not imported correctly, pattern detected
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:6.4.0 target:6.3.0.1 target:7.3.0
Keywords:
Duplicates: 92892
Depends on:
Blocks: CSV-Import
Reported: 2015-07-02 14:46 UTC by Stephan van den Akker
Modified: 2021-07-12 13:50 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
CSV example file containing identically formatted dates, some correctly imported and some not. (171 bytes, text/csv)
2015-07-02 14:46 UTC, Stephan van den Akker
File sample with more dates to test (341 bytes, text/comma-separated-values)
2015-07-02 19:50 UTC, m_a_riosv

Description Stephan van den Akker 2015-07-02 14:46:16 UTC
Created attachment 117001
CSV example file containing identically formatted dates, some correctly imported and some not.

How to reproduce:
- Import the example CSV file (Insert -> Sheet from file)
- Set language to "English (UK)" or "English (US)"
- Import as single column (deselect all Separator options)
- Set the column type to "Date (DMY)"
- Press OK

Expected result: All rows are interpreted correctly as date values

Actual result: 3 rows are not imported as date values:
31 Mar 2013, 02:00
30 Mar 2014, 02:00
29 Mar 2015, 02:00

The offending pattern seems to be: (31-n) + " Mar " + (2013+n) + ", 02:00"

Other date fields with 02:00 seem unaffected.
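
The pattern above can be turned into test rows for a reproduction file. A minimal sketch in Python; the function name `offending_rows` is hypothetical and used only for illustration:

```python
def offending_rows(count: int = 3) -> list[str]:
    """Build rows following (31-n) + " Mar " + (2013+n) + ", 02:00"."""
    return [f"{31 - n} Mar {2013 + n}, 02:00" for n in range(count)]


# Print the rows so they can be pasted into a CSV file for testing.
for row in offending_rows():
    print(row)
```

With the default count this yields the three rows listed under "Actual result".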

Seen in:
- OpenOffice.org 3.3.0; OOO330m20 (Build:9567) on Windows 7 (64-bit)
- LibreOffice 4.4.3.2, Build ID: 40m0(Build:2), Locale: en_GB on openSuSE 13.1 (64-bit)
- LibreOffice 5.0.0.2.0+, Build ID: 2fa23d10c32f77da121ecf03f77ff3f10ca0d580, Locale: en-GB (en_GB.UTF-8) on openSuSE 13.1 (64-bit)
Comment 1 Buovjaga 2015-07-02 18:11:11 UTC
Confirmed. The problematic ones are imported as numbers.

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+ (x64)
Build ID: 3a6ec53eeeec71312f5ea890689f9c2ee79c2aac
TinderBox: Win-x86_64@62-TDF, Branch:MASTER, Time: 2015-07-01_02:24:40
Locale: fi-FI (fi_FI)
Comment 2 m_a_riosv 2015-07-02 19:50:50 UTC
Created attachment 117012
File sample with more dates to test

After adding more rows with a different month, the issue changes from the 02:00 hour to the year 2013.

31/03/13 01:00
31 Mar 2013, 02:00
31/03/13 03:00
30/03/14 01:00
30 Mar 2014, 02:00
30/03/14 03:00
29/03/15 01:00
29 Mar 2015, 02:00
29/03/15 03:00
31 Apr 2013, 01:00
31 Apr 2013, 02:00
31 Apr 2013, 03:00
30/04/14 01:00
30/04/14 02:00
30/04/14 03:00
29/04/15 01:00
29/04/15 02:00
29/04/15 03:00

Using comma as the separator, the problem remains with the year 2013.

31/03/13	 01:00
31/03/13	 02:00
31/03/13	 03:00
30/03/14	 01:00
30/03/14	 02:00
30/03/14	 03:00
29/03/15	 01:00
29/03/15	 02:00
29/03/15	 03:00
31 Apr 2013	 01:00
31 Apr 2013	 02:00
31 Apr 2013	 03:00
30/04/14	 01:00
30/04/14	 02:00
30/04/14	 03:00
29/04/15	 01:00
29/04/15	 02:00
29/04/15	 03:00
Comment 3 Eike Rathke 2015-07-08 21:03:35 UTC
This seems to be due to daylight saving time (DST) switches. In an en-US locale the DST transition is on the last Sunday of March, which fell on 31-Mar-2013, 30-Mar-2014 and 29-Mar-2015. Technically there are no times between 02:00 and 02:59:59.999 on these dates.
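
As a quick check of the dates involved, the last Sunday of March can be computed with the Python standard library. This is only an illustrative sketch; the helper name `last_sunday_of_march` is an assumption and has nothing to do with LibreOffice's own code:

```python
from datetime import date

SUNDAY = 6  # date.weekday() counts Monday as 0 and Sunday as 6


def last_sunday_of_march(year: int) -> date:
    """Return the date of the last Sunday in March of the given year."""
    last_day = date(year, 3, 31)
    # Number of days to walk back from 31 March to reach a Sunday.
    days_back = (last_day.weekday() - SUNDAY) % 7
    return date(year, 3, 31 - days_back)


for year in (2013, 2014, 2015):
    print(last_sunday_of_march(year).isoformat())
```

For 2013, 2014 and 2015 this gives 31, 30 and 29 March, the same days as the rows that fail to import.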

Need to investigate why these DST switches apparently affect date recognition also in other locales.

@m.a.riosv:
Of course 31 Apr 2013 remains string because there is no April 31 ;-)
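
The April case can be reproduced outside Calc. The sketch below uses Python's `datetime.strptime` with a format string assumed to match the sample rows; it is not Calc's import code, `parse_row` is a hypothetical helper, and `%b` assumes English month abbreviations (the default C locale):

```python
from datetime import datetime
from typing import Optional

ROW_FORMAT = "%d %b %Y, %H:%M"  # assumed layout of rows like "31 Mar 2013, 02:00"


def parse_row(text: str) -> Optional[datetime]:
    """Parse a row as a naive date-time, or return None if it is not a valid date."""
    try:
        return datetime.strptime(text, ROW_FORMAT)
    except ValueError:
        # Raised for impossible calendar dates such as 31 April.
        return None


print(parse_row("31 Mar 2013, 02:00"))  # a naive datetime, no time zone involved
print(parse_row("31 Apr 2013, 01:00"))  # None, April has only 30 days
```

A parse that ignores time zones accepts 31 Mar 2013, 02:00 without complaint, which is why only the April rows are expected to stay as text.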
Comment 4 m_a_riosv 2015-07-08 22:11:40 UTC
> @m.a.riosv:
> Of course 31 Apr 2013 remains string because there is no April 31 ;-)

Pity, one extra day per year would add up to a couple of months in a lifetime. :)
Comment 5 m_a_riosv 2015-07-24 14:22:51 UTC
*** Bug 92892 has been marked as a duplicate of this bug. ***
Comment 6 QA Administrators 2016-09-20 10:18:29 UTC Comment hidden (obsolete)
Comment 7 Stephan van den Akker 2016-09-20 13:31:25 UTC
Reproduced in LOdev:

Version: 5.3.0.0.alpha0+
Build ID: 075489b4b810692edc2ba9910eb3ca659a2b6745
CPU Threads: 4; OS Version: Linux 3.12; UI Render: default; 
Locale: en-GB (en_GB.UTF-8); Calc: group
Comment 8 QA Administrators 2018-07-21 02:40:28 UTC Comment hidden (obsolete)
Comment 9 Eike Rathke 2019-06-18 14:34:40 UTC
Took a more detailed look into this after a long time. Fun stuff... or not. The calendar is always created with the system time zone as default, so for most Europeans something with CET (standard time) or CEST (daylight saving time). 2013-03-31 and the other two dates were indeed DST transition dates in Europe; there simply is no time from 02:00 to 02:59:59.999 on those days.

Not sure how to solve this other than having all Gregorian-based calendars always use the UTC time zone, which is not affected by DST. That wouldn't fit the general assumption that times are entered (and thus displayed) in the local time zone, but then again there are no conversions to/from time zones or time-zone-aware calculations at all; all times are "as is".
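
The effect can be illustrated outside LibreOffice. The following is a minimal sketch in Python, not the calendar code itself: `exists_in_zone` is a hypothetical helper, and `Europe/Berlin` is assumed as a stand-in for a system zone with CET/CEST. It relies on the fact that a wall-clock time inside a spring-forward gap does not survive a round trip through UTC.

```python
from datetime import datetime, timezone, tzinfo
from zoneinfo import ZoneInfo  # Python 3.9+, needs the IANA time zone database


def exists_in_zone(naive: datetime, tz: tzinfo) -> bool:
    """Return True if the naive wall-clock time exists in the given zone."""
    local = naive.replace(tzinfo=tz)
    # A time inside a DST gap comes back shifted by the DST offset.
    round_trip = local.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) == naive


berlin = ZoneInfo("Europe/Berlin")  # assumption: example CET/CEST zone
stamp = datetime(2013, 3, 31, 2, 0)

print(exists_in_zone(stamp, berlin))        # False: 02:00 is skipped that day
print(exists_in_zone(stamp, timezone.utc))  # True: UTC has no DST gaps
```

The same wall-clock value is rejected in the DST-observing zone and accepted in UTC, which matches the idea of defaulting the calendar to UTC when times are treated "as is".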
Comment 10 Commit Notification 2019-06-19 23:57:54 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/942de6a01ba990e5f3bc55ce4ab3737a03f67f39%5E%21

Resolves: tdf#92503 introduce TimeZone to calendar loading and default to UTC

It will be available in 6.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 11 Commit Notification 2019-06-20 09:28:11 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-6-3":

https://git.libreoffice.org/core/+/a9c02d543987a0c05beda19905ccd6fb4263b592%5E%21

Resolves: tdf#92503 introduce TimeZone to calendar loading and default to UTC

It will be available in 6.3.0.1.

Comment 12 Commit Notification 2021-07-12 13:50:11 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/efda4f52ea59e2d235a55be863fbffa8ca7a7874

tdf#92503: sc: Add UItest

It will be available in 7.3.0.
