Bug Hunting Session
Bug 50849 - SIGSEGV: ODBC to PostgreSQL, renaming column in SELECT list
Summary: SIGSEGV: ODBC to PostgreSQL, renaming column in SELECT list
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base (show other bugs)
Version:
(earliest affected)
Master old -3.6
Hardware: Other All
: medium normal
Assignee: Lionel Elie Mamane
URL:
Whiteboard: target:3.7.0 target:3.6.0.1 target:3.5.6
Keywords:
Depends on:
Blocks:
 
Reported: 2012-06-07 10:12 UTC by Terrence Enger
Modified: 2012-08-22 15:29 UTC (History)
2 users (show)

See Also:
Crash report or crash signature:


Attachments
gdb output, including backtrace (8.47 KB, text/plain)
2012-06-07 10:12 UTC, Terrence Enger
Details
proposed patch (945 bytes, text/plain)
2012-06-07 13:52 UTC, Julien Nabet
Details
backtrace withs pgodbc debugging symbols (3.11 KB, text/plain)
2012-07-04 10:48 UTC, Lionel Elie Mamane
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Terrence Enger 2012-06-07 10:12:30 UTC
Created attachment 62740 [details]
gdb output, including backtrace

summary
-------

BASE SIGSEGV:  ODBC to PostgreSQL, renaming column in SELECT list

description
-----------

Having a ODBC connection to a PostgreSQL database with data created by
the SQL statements that Heinz Repp provided in
https://bugs.freedesktop.org/show_bug.cgi?id=47520#c9, I tried in
Queries > "Create Query in SQL View ..." to run the query

    select number x, name from table1

The result is a SIGSEGV.  Backtrace attached.

For comparison:  using an explicit "as" in the rename avoids the crash
and results in the the expected display of the table contents.


A SIGSEGV also comes using LibreOffice supplied with ubuntu-natty
(11.04), which identifies itself as

    LibreOffice 3.3.4 
    OOO330m19 (Build:401)
    tag libreoffice-3.3.3.1, Ubuntu package 1:3.3.4-0ubuntu1


I have left default priority on this report, but perhaps the priority
should be lower, considering the age of this crash and the fact that I
was deliberately "looking for trouble".


My LibreOffice is master commit id 0752710, pulled around 2012-06-06
12:50 GMT, configured with

    --disable-mozilla
    --enable-symbols
    --enable-dbgutil
    --enable-crashdump
    --disable-build-mozilla
    --without-system-postgresql
    --enable-debug

built and running on ubuntu-natty (11.04).  Other possibly implicated
software includes

    package          version
    ---------------  ------------------- 
    unixodbc         2.2.14p2-2ubuntu1
    postgresql       8.4.12-0ubuntu11.04
    postgresql-8.4   8.4.12-0ubuntu11.04
    odbc-postgresql  1:08.03.0200-1.2

The relevant stanza from ~/.odbc.ini reads

    [psql_terry_ansi]
    Description		= PostgreSQL db terry, ansi version
    Driver		= PostgreSQL_ansi
    Trace		= Yes
    TraceFile		= /tmp/sql.log
    Database		= terry
    Servername		= localhost
    Username		= [redacted]
    Password		= [redacted]
    Port		= 5432
    Protocol		= 6.4
    ReadOnly		= No
    RowVersioning		= No
    ShowSystemTables		= No
    ShowOidColumn		= No
    FakeOidIndex		= No
    ConnSettings		=
Comment 1 Terrence Enger 2012-06-07 10:48:41 UTC
But sqlite 3 *does* support fractional seconds in at least some
contexts.  From a terminal session, first the command interpreter that
comes with sqlite3 ...

    $ sqlite3 ../bug_047520/bug_047520.db 
    SQLite version 3.7.4
    Enter ".help" for instructions
    Enter SQL statements terminated with a ";"
    sqlite> .mode columns
    sqlite> select * from byTs ;
    1           2012-04-06 12:34:56.654321  Friday    
    2           2012-04-05 13:45:57.123456  Thursday  
    sqlite> 

and then the command interpreter from unixodbc ...

    $ isql bug_047520
    +---------------------------------------+
    | Connected!                            |
    |                                       |
    | sql-statement                         |
    | help [tablename]                      |
    | quit                                  |
    |                                       |
    +---------------------------------------+
    SQL> select * from byTs
    +-----------+---------------------------------+-----------+
    | nr        | ts                              | word      |
    +-----------+---------------------------------+-----------+
    | 1         | 2012-04-06 12:34:56.654321      | Friday    |
    | 2         | 2012-04-05 13:45:57.123456      | Thursday  |
    +-----------+---------------------------------+-----------+
    SQLRowCount returns 2
    2 rows fetched
    SQL> 


"The timestamps in my tables are have no fractions of seconds" is
Heinz Repp talking about *his* data, not the data that I offered in
<https://bugs.freedesktop.org/attachment.cgi?id=59688>.


<asides>

  *Hoping* to be more helpful than a nuisance, ...

  (*) It is possible to retrieve the data successfully by disguising the
      key column:

          select nr, cast( "ts" as char(26) ), word from byTs

      Still, I suppose there must be *some* reason for the existence
      of class OKeySet.

  (*) One entry on my list of things to try (back before I got stuck in
      build difficulties) was a table keyed by a DECIMAL column with
      non-zero precision.  I may get back to that someday.

  (*) Bug 50849 "SIGSEGV: ODBC to PostgreSQL, renaming column in SELECT
      list" shows another case of LO and ODBC giving results different
      from isql.

</asides>
Comment 2 Terrence Enger 2012-06-07 11:56:08 UTC
Whoops.  The comment 1 that I posted here belongs to bug 50575 "timestamps as primary column: show empty data / crashes / ...".   Please accept my apology for the noise.
Comment 3 Julien Nabet 2012-06-07 13:52:11 UTC
Created attachment 62764 [details]
proposed patch

I didn't test this proposed patch but took a look at the bt and to this odbc specs link :
http://msdn.microsoft.com/en-us/library/windows/desktop/ms714602%28v=vs.85%29.aspx

I wondered if the segfault could be caused by a NULL value for pPKQ (CatalogName) or pPKO (SchemaName)
It seems these vars can't be NULL but instead could be "".

Terrence: Could you test this simple (perhaps "naive") patch ?

(I use master sources on pc Debian x86-64 (last commit retrieved today 1cdfe7519a94d35f3fb47eed3331397cc11129b5)
Comment 4 Terrence Enger 2012-06-07 15:28:08 UTC
Thank you, Juiien, for the proposed patch.  I shall try to apply it when my
current build-in-process finishes.

The patch looks plausible because I saw schema and table both had len=0 upon
entry into ODatabaseMetaDataResultSet::openSpecialColumns.
Comment 5 Julien Nabet 2012-06-07 22:42:24 UTC
line 1149 and 1150 of the same file, I wonder too why length is equal to SQL_NTS.

In file unixODBC/inc/sql.h, line 45 gives :
#define SQL_NTS                   (-3)

The thing  is SQL_NTS is used in many functions so perhaps I missed something.
Comment 6 Lionel Elie Mamane 2012-06-08 00:34:19 UTC
(In reply to comment #5)
> line 1149 and 1150 of the same file, I wonder too why length is equal to
> SQL_NTS.

SQL_NTS means 'data is a NULL-terminated string; I'm not telling you the length, just consume it until you encounter a NULL.
Comment 7 Lionel Elie Mamane 2012-07-04 10:47:05 UTC
(In reply to comment #3)
> Created attachment 62764 [details]
> proposed patch

Nope, does not solve crash.

Frankly, I suspect an ODBC driver bug.
Comment 8 Lionel Elie Mamane 2012-07-04 10:48:18 UTC
Created attachment 63816 [details]
backtrace withs pgodbc debugging symbols
Comment 9 Lionel Elie Mamane 2012-07-04 10:58:34 UTC
(In reply to comment #7)
> Frankly, I suspect an ODBC driver bug.

The segfault is at line:

	if (SQL_SUCCESS == ret && 0 == QR_get_num_total_tuples(SC_get_Result(stmt))) 

in functtion

RETCODE		SQL_API
SQLSpecialColumns(HSTMT StatementHandle,
				  SQLUSMALLINT IdentifierType, SQLCHAR *CatalogName,
				  SQLSMALLINT NameLength1, SQLCHAR *SchemaName,
				  SQLSMALLINT NameLength2, SQLCHAR *TableName,
				  SQLSMALLINT NameLength3, SQLUSMALLINT Scope,
				  SQLUSMALLINT Nullable)

in file odbcapi.c (in pgodbc sources).

In statement.h:#define SC_get_Result(a)  (a->result)

But:

(gdb) print stmt->result
$4 = (QResultClass *) 0x0


Here's your NULL pointer!
Comment 10 Terrence Enger 2012-07-10 01:38:07 UTC
My references to the ODBC driver are to psqlodbc-09.01.0100.


It is the ODBC driver which dereferences the null pointer, and the ODBC
driver may be the underlying cause.  Still, I suggest that between the
cause and the dereference, LibreOffice has at least helped the ODBC
driver to fail by executing a "funny" call.

In the segfaulting case, ODatabaseMetDataResultSet::openSpecialColumns,
ODatabaseMetaDataResultSet.cxx:1139, calls SQLSpecialColumns with null
string and zero for parameters 7 and 8, TableName and TableNameSize.
For comparison, in the working case, LibreOffice passes a good table
name (and good catalog name and schema name, too).
<msdn.microsoft.com/en-us/library/windows/desktop/ms714602(v=vs.85).aspx>
does not mention the possibility of the table name being the null
string.  I cannot imagine how one could expect a good result from the
null string.

Does this suggest that it would be good to look into why LO is doing the
call with null table name?  Or into why LO is doing this extra call at
all?


A funny thing happened on the way to the crash: In PGAPI_SpecialColumns,
info.c line 2783 detects the null table name, so line 2785 does
SC_set_error and then line 2786 returns SQL_SUCCESS.  SQL_SUCCESS sounds
to me like an outright lie.  If in SQLSPecialColumns I fudge the return
value to SQL_ERROR, then LibreOffice ...

(*) Goes on to call SQLSpecialColumns with good catalog, schema, and
    table names.  This call looks very much like what I remember from
    the successful query.

(*) Goes on to display the table data.

The SQL_SUCCESS sounds to me like a candidate for reporting upstream, as
does the segfault itself.  LEM, do you agree?
Comment 11 Lionel Elie Mamane 2012-07-10 05:35:52 UTC
(In reply to comment #10)

> It is the ODBC driver which dereferences the null pointer, and the ODBC
> driver may be the underlying cause.  Still, I suggest that between the
> cause and the dereference, LibreOffice has at least helped the ODBC
> driver to fail by executing a "funny" call.

> In the segfaulting case, ODatabaseMetDataResultSet::openSpecialColumns,
> ODatabaseMetaDataResultSet.cxx:1139, calls SQLSpecialColumns with null
> string and zero for parameters 7 and 8, TableName and TableNameSize.

> Does this suggest that it would be good to look into why LO is doing the
> call with null table name?  Or into why LO is doing this extra call at
> all?

Ah yes, I had missed that; I was so focused on the CatalogName and SchemaName NULL vs empty string.

Fair enough; LibO should not call it with empty string for TableName, and this should be fixed.

> A funny thing happened on the way to the crash: In PGAPI_SpecialColumns,
> info.c line 2783 detects the null table name, so line 2785 does
> SC_set_error and then line 2786 returns SQL_SUCCESS.  SQL_SUCCESS sounds
> to me like an outright lie.  If in SQLSPecialColumns I fudge the return
> value to SQL_ERROR, then LibreOffice ...

> (*) Goes on to call SQLSpecialColumns with good catalog, schema, and
>     table names.  This call looks very much like what I remember from
>     the successful query.

> (*) Goes on to display the table data.

> The SQL_SUCCESS sounds to me like a candidate for reporting upstream, as
> does the segfault itself.  LEM, do you agree?

Yes, the segfault, with the explanation about SQL_SUCCESS/SQL_ERROR should be reported upstream. Could you do that (and CC me)? Thanks.
Comment 12 Terrence Enger 2012-07-10 13:06:47 UTC
(1) I have reported the segfault to the psql-odbc list.  The thread
    archive starts at
    <http://archives.postgresql.org/pgsql-odbc/2012-07/msg00000.php>.

(2) After I fudged the return value from PGAPI_SpecialColumns to avoid the
    segfault, the column heading in the data display correctly showed the
    rename that I specified in the SELECT list.  This is another reason to
    question whether we need the "funny" call to SQLSpecialColumns at all.
Comment 13 Lionel Elie Mamane 2012-07-10 14:13:13 UTC
(In reply to comment #11)
> (In reply to comment #10)

>> Does this suggest that it would be good to look into why LO is doing the
>> call with null table name?  Or into why LO is doing this extra call at
>> all?
 
> Ah yes, I had missed that; I was so focused on the CatalogName and SchemaName
> NULL vs empty string.

> Fair enough; LibO should not call it with empty string for TableName, and this
> should be fixed.

This *also* has its roots in an ODBC driver problem. The structure is as such:

For each column in the query, LibO:

 - asks the driver "which table does this column come from" by calling

   SQLColAttribute(m_aStatementHandle,
                   1,
ident,                                                         |
   |75                                          (SQLPOINTER)pName,                                                           |
   |76                                          nRealLen,                                                                    |
   |77                                          &nRealLen,                                                                   |
   |78                                          NULL                                                                         |
   |79                                          );                                                                           |

 - store the answer in sTableName
 - calls SQLSpecialColumns(..., sTablename, ...)

> 
> > A funny thing happened on the way to the crash: In PGAPI_SpecialColumns,
> > info.c line 2783 detects the null table name, so line 2785 does
> > SC_set_error and then line 2786 returns SQL_SUCCESS.  SQL_SUCCESS sounds
> > to me like an outright lie.  If in SQLSPecialColumns I fudge the return
> > value to SQL_ERROR, then LibreOffice ...
> 
> > (*) Goes on to call SQLSpecialColumns with good catalog, schema, and
> >     table names.  This call looks very much like what I remember from
> >     the successful query.
> 
> > (*) Goes on to display the table data.
> 
> > The SQL_SUCCESS sounds to me like a candidate for reporting upstream, as
> > does the segfault itself.  LEM, do you agree?
> 
> Yes, the segfault, with the explanation about SQL_SUCCESS/SQL_ERROR should be
> reported upstream. Could you do that (and CC me)? Thanks.
Comment 14 Lionel Elie Mamane 2012-07-10 14:15:20 UTC
(In reply to comment #11)
> (In reply to comment #10)

>> Does this suggest that it would be good to look into why LO is doing the
>> call with null table name?  Or into why LO is doing this extra call at
>> all?

> Ah yes, I had missed that; I was so focused on the CatalogName and SchemaName
> NULL vs empty string.

> Fair enough; LibO should not call it with empty string for TableName, and this
> should be fixed.

This *also* has its roots in an ODBC driver problem. The structure is as such:

For each column in the query, LibO:

 - asks the driver "which table does this column come from" by calling

   SQLColAttribute(m_aStatementHandle,
                   1,
                   15,
                   ...);

 - store the answer in sTableName

 - calls SQLSpecialColumns(..., sTablename, ...)

I'll add a sanity check for emptiness in the call to SQLSpecialColumns, but that *also* needs fixing in the ODBC driver. I'll email pgsql-odbc in the same thread you did.
Comment 15 Not Assigned 2012-07-10 15:27:29 UTC
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1c67fc8015d88a383f2138c4d03d692314d6f579

fdo#50849 work around psqlodbc segfault
Comment 16 Lionel Elie Mamane 2012-07-10 15:31:13 UTC
Asked for review to backport of work-around to LibO 3.5 branch. May take a few days.
Comment 17 Not Assigned 2012-07-10 15:34:07 UTC
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=00d453b9332d0f5f3ab64d40186b68b4b3bb6d5e&g=libreoffice-3-6

fdo#50849 work around psqlodbc segfault


It will be available in LibreOffice 3.6.
Comment 18 Not Assigned 2012-07-10 16:05:18 UTC
Lionel Elie Mamane committed a patch related to this issue.
It has been pushed to "libreoffice-3-5":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=56ef010aef4db3515f713ad9a2b587fb8402cf2b&g=libreoffice-3-5

fdo#50849 work around psqlodbc segfault


It will be available in LibreOffice 3.5.6.
Comment 19 Terrence Enger 2012-07-11 02:48:54 UTC
Thank you.  I have built master commit a3a8b81 (pulled around 2012-07-10 17:45
UTC), and the segfault is gone with driver versions 1:08.03.0200-1.2 (received
with ubuntu-natty) and 09.01.0100.

<whimsy>
  This whole thing got started because I was playing around, trying to think
  of a better way for LibreOffice to calculate the width of columns in the
  data pane.
</whimsy>
Comment 20 Terrence Enger 2012-08-22 15:29:25 UTC
Just by-the-way ...

The psql-odbc list announces the availability of psqlODBC 09.01.0200.  
The release notes say that this version fixes the bug in the ODBC 
driver which originally led to this bug report.

I am, alas, separated from my development machine, so I do not expect 
to be able to verify the driver fix anytime soon.

Terry.