Bug Hunting Session
Bug 50575 - ODBC sqlite3 timestamps as primary column: show empty data / crashes / ...
Summary: ODBC sqlite3 timestamps as primary column: show empty data / crashes / ...
Status: RESOLVED NOTOURBUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base (show other bugs)
Version:
(earliest affected)
3.5.1 release
Hardware: All All
: medium major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on: 47520
Blocks:
  Show dependency treegraph
 
Reported: 2012-06-01 02:03 UTC by Lionel Elie Mamane
Modified: 2012-07-11 16:47 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
screenshot, different result but still wrong (41.38 KB, image/png)
2012-07-11 16:47 UTC, Terrence Enger
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Lionel Elie Mamane 2012-06-01 02:03:45 UTC
+++ This bug was initially created as a clone of Bug #47520 +++

<original text by Heinz Repp>

Using a version newer than 0.85 of the sqliteodbc driver results in some of my tables showing only 4 empty rows, and this happened already with OpenOffice 3 and later with Libreoffice 3.3 and 3.4. Those tables have a timedate field as primary key, and Christian Werner had changed the formatting of timedates between 0.85 and 0.86 (millionth of seconds I believe) and told me, that in this case the the handling of Base was in error, and I should avoid using timedates as primary key, but that was no option. Now new with Libreoffice 3.5.1: when I try to open one of those tables with a timedate primary key Libreoffice reproducibly crashes.

<Comments by Lionel Elie Mamane>

The problem is that Base handles timestamps only up to a precision of hundredths of seconds. So when it tries to (re)load data in that row, it gets no result and shows an empty row. The crash is "just" a lack of robustness in the face of unexpected result (namely no data when data is expected). Also a bug, albeit (again) a different one.

The real solution is to change Base to handle timestamps up to the full ODBC precision of nanoseconds (billionths of a second).

This will still have problems when inserting a row with a Timestamp primary key, and the data inserted has a higher precision (in the primary key column) than what sqlite (or more generally the database engine) supports: Base won't find the newly inserted row again when trying to (re)load it, leading to the same problems described here for this new row.

More generally, this will happen for any primary key that contains data that can get rounded/truncated by the database engine, and such primary keys will "always" be fragile I guess (unless we find a better way to refer to a row than by its primary key...).

Steps to reproduce:
See bug #47520


Notes for solving:

The limitation to hundredths of a second comes from :: com :: sun :: star :: util :: struct DateTime, which (as far as I understand) is part of the UNO "guaranteed stable" ABI. As such, we'd have to introduce a new ::com::sun::star::util::struct DateTime2 or some such with higher precision. Or maybe a more "base-private" ::com::sun::star::sdb(c)::DateTime? Nah, probably too confusing.

Then all places using ::com::sun::star::util::DateTime have to be changed... The ODBC scaling factor (now ODBC_FRACTION_UNITS_PER_HSECOND) should ideally be 1, etc. That is not only in database-related stuff, so this will have to be a concerted effort of all/several components.

Funnily enough, LibreOffice already has a (non-UNO exported) datatype for timestamp with nanosecond precision, namely oslDateTime (in sal/inc/osl/time.h).
Comment 1 Heinz Repp 2012-06-06 08:36:17 UTC
Thank you for picking this up! My thoughts about it:

There is nothing wrong with Base and ODBC having different timestamp definitions - after all, Base strives to be able to use many different sources and can not adapt its internal data types to every possible source. So, leaving Base's timestamp precision at hundredths of a second seems reasonable to me. What is needed is a suitable translation layer from driver data types to Base data types and vice versa.

The ODBC specification is pretty clear in this aspect. The C Data Types, those that are passed to Base, have ( http://msdn.microsoft.com/en-us/library/windows/desktop/ms714556%28v=vs.85%29.aspx ):

SQL_C_TYPE_TIMESTAMP[c]
	
SQL_TIMESTAMP_STRUCT	

struct tagTIMESTAMP_STRUCT {
   SQLSMALLINT year;
   SQLUSMALLINT month;
   SQLUSMALLINT day;
   SQLUSMALLINT hour;
   SQLUSMALLINT minute;
   SQLUSMALLINT second;
   SQLUINTEGER fraction;[b] 
} TIMESTAMP_STRUCT;[a]

with note [b]:  The value of the fraction field is the number of billionths of a second and ranges from 0 through 999,999,999 (1 less than 1 billion). For example, the value of the fraction field for a half-second is 500,000,000, for a thousandth of a second (one millisecond) is 1,000,000, for a millionth of a second (one microsecond) is 1,000, and for a billionth of a second (one nanosecond) is 1.

--- end of ODBC specs ---

So _every_ ODBC driver, not just sqlite[3]odbc, is expected to deliver and receive nanoseconds.

The wrapper between Base and ODBC has to:
- multiply hundredths of a second with 10^7 to be stored in struct tagTIMESTAMP_STRUCT.fraction
- divide by 10^7 when reading tagTIMESTAMP_STRUCT.fraction to receive hundredths of seconds. This way all values written by Base can be read and used without issue.

Problems arise only with third party data that are more precise than Base's internal data type. This can happen with any database driver, not just ODBC. When those data are used as primary key, Base has no way to address one specific row because its internal representation lacks essential data. This cannot be solved.

In this bug the situation is much simpler. The timestamps in my tables are have no fractions of seconds, so the fraction structure element is always 0. My guess is, that Base is failing in translating ODBC timestamps to Base timestamps and/or vice versa.
Comment 2 Lionel Elie Mamane 2012-06-07 06:57:47 UTC
(In reply to comment #1)

> The wrapper between Base and ODBC has to:
> - multiply hundredths of a second with 10^7 to be stored in struct
> tagTIMESTAMP_STRUCT.fraction
> - divide by 10^7 when reading tagTIMESTAMP_STRUCT.fraction to receive
> hundredths of seconds. This way all values written by Base can be read and used
> without issue.

AFAIK, it does that.

> Problems arise only with third party data that are more precise than Base's
> internal data type. This can happen with any database driver, not just ODBC.
> When those data are used as primary key, Base has no way to address one
> specific row because its internal representation lacks essential data. This
> cannot be solved.

Well, going to nanosecond precision in Base should cover nearly all cases. I think most databases don't have timestamp resolution finer than nanoseconds :)

> In this bug the situation is much simpler. The timestamps in my tables are have
> no fractions of seconds, so the fraction structure element is always 0. My
> guess is, that Base is failing in translating ODBC timestamps to Base
> timestamps and/or vice versa.

Oh, your timestamps have no fractional part? Then indeed, there is another  bug somewhere. I'll try to take a look.
Comment 3 Lionel Elie Mamane 2012-06-07 09:57:57 UTC
(In reply to comment #2)
> (In reply to comment #1)

>> In this bug the situation is much simpler. The timestamps in my tables are have
>> no fractions of seconds, so the fraction structure element is always 0. My
>> guess is, that Base is failing in translating ODBC timestamps to Base
>> timestamps and/or vice versa.

> Oh, your timestamps have no fractional part? Then indeed, there is another  bug
> somewhere. I'll try to take a look.

That's *definitely* a sqlite or sqliteodbc bug. LibreOffice calls SQLBindParameter with a buffer that contains the right structure with the right values. Then in gdb I traced the calls to the SQLite3 API. I see that sqliteodbc calls:


Breakpoint 12, sqlite3_bind_text (pStmt=0x169e088, i=2, zData=0x1a33a10 "2012-04-06 12:34:56.000", nData=-1, xDel=0) at sqlite3.c:59141

*but* compare the following in sqlite3 command-line interface:

sqlite> SELECT * FROM byTs WHERE ts='2012-04-05 13:45:57.000';
sqlite> SELECT * FROM byTs WHERE ts='2012-04-05 13:45:57';
nr    ts             word
----  -------------  ----
2     2012-04-05 13:45:57  Thursday


See? SQLite seems not to support the fractions (subsecond) syntax, but sqliteodbc uses it *anyway*.
Comment 4 Terrence Enger 2012-06-07 11:54:13 UTC
But sqlite 3 *does* support fractional seconds in at least some
contexts.  From a terminal session, first the command interpreter that
comes with sqlite3 ...

    $ sqlite3 ../bug_047520/bug_047520.db 
    SQLite version 3.7.4
    Enter ".help" for instructions
    Enter SQL statements terminated with a ";"
    sqlite> .mode columns
    sqlite> select * from byTs ;
    1           2012-04-06 12:34:56.654321  Friday    
    2           2012-04-05 13:45:57.123456  Thursday  
    sqlite> 

and then the command interpreter from unixodbc ...

    $ isql bug_047520
    +---------------------------------------+
    | Connected!                            |
    |                                       |
    | sql-statement                         |
    | help [tablename]                      |
    | quit                                  |
    |                                       |
    +---------------------------------------+
    SQL> select * from byTs
    +-----------+---------------------------------+-----------+
    | nr        | ts                              | word      |
    +-----------+---------------------------------+-----------+
    | 1         | 2012-04-06 12:34:56.654321      | Friday    |
    | 2         | 2012-04-05 13:45:57.123456      | Thursday  |
    +-----------+---------------------------------+-----------+
    SQLRowCount returns 2
    2 rows fetched
    SQL> 


"The timestamps in my tables are have no fractions of seconds" is
Heinz Repp talking about *his* data, not the data that I offered in
<https://bugs.freedesktop.org/attachment.cgi?id=59688>.


<asides>

  *Hoping* to be more helpful than a nuisance, ...

  (*) It is possible to retrieve the data successfully by disguising the
      key column:

          select nr, cast( "ts" as char(26) ), word from byTs

      Still, I suppose there must be *some* reason for the existence
      of class OKeySet.

  (*) One entry on my list of things to try (back before I got stuck in
      build difficulties) was a table keyed by a DECIMAL column with
      non-zero precision.  I may get back to that someday.

  (*) Bug 50849 "SIGSEGV: ODBC to PostgreSQL, renaming column in SELECT
      list" shows another case of LO plus ODBC giving results different
      from isql.

</asides>
Comment 5 Lionel Elie Mamane 2012-06-07 12:45:03 UTC
(In reply to comment #4)
> But sqlite 3 *does* support fractional seconds in at least some
> contexts.  From a terminal session, first the command interpreter that
> comes with sqlite3 ...
> 
>     $ sqlite3 ../bug_047520/bug_047520.db 
>     SQLite version 3.7.4
>     Enter ".help" for instructions
>     Enter SQL statements terminated with a ";"
>     sqlite> .mode columns
>     sqlite> select * from byTs ;
>     1           2012-04-06 12:34:56.654321  Friday    
>     2           2012-04-05 13:45:57.123456  Thursday  
>     sqlite> 

1) When putting a timestamp through a parameter of a parametric query (as LibreOffice does), sqliteodbc will use exactly three positions after the decimal dot, so will issue something like "SELECT * from byTs WHERE ts='2012-04-05 13:45:57.123'". So it won't find your data because of truncation issues. That's a bug in sqliteodbc. Even if this bug were corrected in sqliteodbc, it still wouldn't work, because LibreOffice truncates data to *two* positions after decimal dot. That's an imperfection in LibreOffice.

2) Now, try this in sqlite3 prompt:

   SELECT * FROM byTs WHERE ts='2012-04-05 13:45:57.1234560'

   ---> no row found, while it should find the second row.

   INSERT INTO byTs VALUES (3, '2012-04-04 15:20:22.980', 'Wednesday');  
   SELECT * FROM ByTs WHERE ts='2012-04-04 15:20:22.98';

   ---> no row found, while it should find the just inserted row, as 22.980 == 22.98 when understood as numbers!!!

   Similarly

   INSERT INTO byTs VALUES (3, '2012-04-04 15:20:22.98', 'Wednesday');  
   SELECT * FROM ByTs WHERE ts='2012-04-04 15:20:22.980';

   That's a bug in sqlite3.




>     $ isql bug_047520
>     +---------------------------------------+
>     | Connected!                            |
>     |                                       |
>     | sql-statement                         |
>     | help [tablename]                      |
>     | quit                                  |
>     |                                       |
>     +---------------------------------------+
>     SQL> select * from byTs
>     +-----------+---------------------------------+-----------+
>     | nr        | ts                              | word      |
>     +-----------+---------------------------------+-----------+
>     | 1         | 2012-04-06 12:34:56.654321      | Friday    |
>     | 2         | 2012-04-05 13:45:57.123456      | Thursday  |
>     +-----------+---------------------------------+-----------+
>     SQLRowCount returns 2
>     2 rows fetched
>     SQL> 
> 
> 
> "The timestamps in my tables are have no fractions of seconds" is
> Heinz Repp talking about *his* data, not the data that I offered in
> <https://bugs.freedesktop.org/attachment.cgi?id=59688>.
> 
> 
> <asides>
> 
>   *Hoping* to be more helpful than a nuisance, ...
> 
>   (*) It is possible to retrieve the data successfully by disguising the
>       key column:
> 
>           select nr, cast( "ts" as char(26) ), word from byTs
> 
>       Still, I suppose there must be *some* reason for the existence
>       of class OKeySet.
> 
>   (*) One entry on my list of things to try (back before I got stuck in
>       build difficulties) was a table keyed by a DECIMAL column with
>       non-zero precision.  I may get back to that someday.
> 
>   (*) Bug 50849 "SIGSEGV: ODBC to PostgreSQL, renaming column in SELECT
>       list" shows another case of LO plus ODBC giving results different
>       from isql.
> 
> </asides>
Comment 6 Lionel Elie Mamane 2012-06-07 12:58:19 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > But sqlite 3 *does* support fractional seconds in at least some
> > contexts.  From a terminal session, first the command interpreter that
> > comes with sqlite3 ...
> > 
> >     $ sqlite3 ../bug_047520/bug_047520.db 
> >     SQLite version 3.7.4
> >     Enter ".help" for instructions
> >     Enter SQL statements terminated with a ";"
> >     sqlite> .mode columns
> >     sqlite> select * from byTs ;
> >     1           2012-04-06 12:34:56.654321  Friday    
> >     2           2012-04-05 13:45:57.123456  Thursday  
> >     sqlite> 
> 
> 1) When putting a timestamp through a parameter of a parametric query (as
> LibreOffice does), sqliteodbc will use exactly three positions after the
> decimal dot, so will issue something like "SELECT * from byTs WHERE
> ts='2012-04-05 13:45:57.123'". So it won't find your data because of truncation
> issues. That's a bug in sqliteodbc. Even if this bug were corrected in
> sqliteodbc, it still wouldn't work, because LibreOffice truncates data to *two*
> positions after decimal dot. That's an imperfection in LibreOffice.
> 
> 2) Now, try this in sqlite3 prompt:
> 
>    SELECT * FROM byTs WHERE ts='2012-04-05 13:45:57.1234560'
> 
>    ---> no row found, while it should find the second row.
> 
>    INSERT INTO byTs VALUES (3, '2012-04-04 15:20:22.980', 'Wednesday');  
>    SELECT * FROM ByTs WHERE ts='2012-04-04 15:20:22.98';
> 
>    ---> no row found, while it should find the just inserted row, as 22.980 ==
> 22.98 when understood as numbers!!!
> 
>    Similarly
> 
>    INSERT INTO byTs VALUES (3, '2012-04-04 15:20:22.98', 'Wednesday');  
>    SELECT * FROM ByTs WHERE ts='2012-04-04 15:20:22.980';
> 
>    That's a bug in sqlite3.

Compare with:

  SELECT * FROM ByTs WHERE nr=2.0;

Which correctly finds the second row.

Also consider:

 INSERT INTO byTs VALUES (4, '2012-04-04 15:20:22.98', 'Wednesday');

This should fail, since there is already a row with ts value "4 april 2012, three pm, twenty minutes, twenty-two seconds and 98 hundredths of a second" which is the same as "4 april 2012, three pm, twenty minutes, twenty-two seconds and 980 thousandths of a second" .

> >   (*) It is possible to retrieve the data successfully by disguising the
> >       key column:
> > 
> >           select nr, cast( "ts" as char(26) ), word from byTs

Yes, because then you tell LibreOffice and sqliteodbc not to treat data from this column as a timestamp, but as a string, whithout normalising it. For *strings*, '2012-04-04 15:20:22.98' is not the same string as '2012-04-04 15:20:22.980', but for *timestamps* it is the *same* timestamp. sqlite3 seems to treat them more as strings than as timestamps. Ah yes, look:

sqlite> INSERT INTO byTs VALUES (5, 'not a timestamp', 'Wednesday');
sqlite> SELECT * FROM ByTs WHERE nr=5;
nr    ts             word
----  -------------  ----
5     not a timestamp  Wednesday

The insert should have failed, because the value given to ts column is not a valid timestamp!

So as far as I see sqlite DOES NOT CORRECTLY SUPPORT TIMESTAMPS, and sqliteodbc/sqlite ARE LYING to LibreOffice when saying this column is a timestamp. They should just declare it as (unbounded) string, because that's how sqlite treats it...
Comment 7 Terrence Enger 2012-06-07 13:56:24 UTC
From comment 6 ...
> So as far as I see sqlite DOES NOT CORRECTLY SUPPORT TIMESTAMPS, and


But more than that, sqlite does not enforce any data type (except an integer
primary key).  <http://www.sqlite.org/datatype3.html> "Datatypes In SQLite
Version 3" calls it a dynamic type system:

    In SQLite, the datatype of a value is associated with the value itself, not
    with its container.

The datatype of a column is merely advisory.  "Dynamic" is too pleasant a word;
I am tempted to call this "unreliable" and "broken".


This laxness makes me question the wisdom of bug 38811 "default to SQLite not
HSQLDB in Base".  Indeed, had I known earlier about the laxness, perhaps I would
never have bothered to comment on bug 47520.  Just call me a grumpy old man.
Comment 8 Heinz Repp 2012-06-07 14:27:14 UTC
It is true that Sqlite has no datatype for timestamps (or any date or time data). The page mentioned by Terrence says:

> SQLite does not have a storage class set aside for storing dates and/or times.
> Instead, the built-in Date And Time Functions of SQLite are capable of storing
> dates and times as TEXT, REAL, or INTEGER values:
> 
> - TEXT as ISO8601 strings ("YYYY-MM-DD HH:MM:SS.SSS").
> - REAL as Julian day numbers, the number of days since noon in Greenwich on
> November 24, 4714 B.C. according to the proleptic Gregorian calendar.
> - INTEGER as Unix Time, the number of seconds since 1970-01-01 00:00:00 UTC.

I could not find any hint which datatype Sqlite chooses when creating a table with timestamp columns, but it seems they all are strings:

> sqlite> select typeof(ts) from byTs;
> text

In my case, Sqlite lists my timestamps with one decimal, as in '2008-01-11 19:43:50.0', presumable the format that a previous version of sqliteodbc had written, and as you confirmed the new versions feed more decimal places to the sqlite library and get an empty result as this is a different string.

Seems what I need is not achievable with sqlite/sqliteodbc: in the input form I need valid timestamps (and correct conversion when e.g. pasting from Calc), but the database needs strings with exact match. I would have to write my own conversion function.
Comment 9 Terrence Enger 2012-06-08 05:09:41 UTC
@Lionel, with respect to comment 6

My observation about hiding "ts" inside a cast expression was just to express my
wonderment at all the work that OKeySet does, given that LO appears to work on a
table without any key at all.  A query like ...

    select * from ( select nr, cast( "ts" as char(26) ), word from byTs )

would hide the key even deeper; it shows the data successfully.


Still, this is merely an "aside" to the reported problem.
Comment 10 Terrence Enger 2012-07-11 16:47:47 UTC
Created attachment 64119 [details]
screenshot, different result but still wrong

( Yes, I know that this bug is RESOLVED NOTOURBUG, but I had already started to prepare this comment. )


In the environment described in the description of fdo#51987 (short version: Linux, master pulled yesterday), my results are changed but still wrong.

First, the results from isql, which I take to be right:

    $ isql bug_047520
    +---------------------------------------+
    | Connected!                            |
    |                                       |
    | sql-statement                         |
    | help [tablename]                      |
    | quit                                  |
    |                                       |
    +---------------------------------------+
    SQL> select * from byTs
    +-----------+---------------------------------+-----------+
    | nr        | ts                              | word      |
    +-----------+---------------------------------+-----------+
    | 1         | 2012-04-06 12:34:56.654321      | Friday    |
    | 2         | 2012-04-05 13:45:57.123456      | Thursday  |
    | 3         | 2012-04-04 15:20:22.980         | Wednesday |
    | 4         | 2012-04-04 15:20:22.98          | Wednesday |
    +-----------+---------------------------------+-----------+
    SQLRowCount returns 4
    4 rows fetched

But Base shows rows with nr 1, 2, and 2, and the bottom of the Table
Data View indicates that the table has three rows.  Attaching
screenshot of the Table Data View.


@LEM ...

I have built the driver, sqliteodbc-0.87, with debug symbols to make
it easier to look into this bug.  But the "dynamic typing" used by
sqlite makes me doubt the possibilty of a robust solution.  And now I
see that fdo#38811 "default to SQLite not HSQLDB in Base" has been
declined; I wish I had been following that bug report.  These two
"downers" just about kill my interest, unless you want to tell me
otherwise.