Bug 124054 - FIREBIRD: incorrect ASC order with varchar data
Summary: FIREBIRD: incorrect ASC order with varchar data
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base (show other bugs)
Version:
(earliest affected)
6.2.1.2 release
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Database-Firebird-Default Base-Without-Java
  Show dependency treegraph
 
Reported: 2019-03-13 12:17 UTC by Tim
Modified: 2024-07-22 16:38 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:


Attachments
Sort test (4.37 KB, application/vnd.oasis.opendocument.database)
2019-05-21 09:00 UTC, Tim
Details
Data base wrong sort (46.64 KB, application/vnd.oasis.opendocument.database)
2019-05-27 10:45 UTC, Tim
Details
Different sorting in queries: wrong for default, right with Collation Unicode (3.30 KB, application/vnd.oasis.opendocument.database)
2024-07-21 06:18 UTC, Robert Großkopf
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tim 2019-03-13 12:17:06 UTC
Description:
After migration from hsqldb embedded to firebird embedded, As I have some names all in capitals and others only the first letter capital, the form opens with all capitals first (by order ASC ok) and after that all those with only first letter capital (aswell by ASC order ok).

Actual Results:
ASC order before migration would not make distinction if all letters are capital or only the first one.

Expected Results:
Normal ASC order without distinction between capitals


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Tim 2019-03-13 12:25:45 UTC
The order it follows is:

AAA -> Aaa -> BBB -> Bbb -> CCC -> Ccc
Comment 2 Alex Thurgood 2019-03-13 14:13:06 UTC
In a directly created embedded firebird ODB file, I see the following in a simple form created via the wizard :

AAA
AAa
Aaa
BBB
BBb
CCC
CCc


That seems correct to me. Reading the documentation, it appears that the default sort behaviour for varchar in Firebird is dependent on the collation parameter set when the table is created.
Comment 3 Alex Thurgood 2019-03-13 14:15:21 UTC
Perhaps the migration code sets a specific collation that doesn't match the one that the user set under hsqldb ?

Without a sample hsqldb with which to try a migration, I'm just clutching at straws here.

@Tim : can you please provide a sammple hsqldb embedded ODB file that shows the erroneous behaviour after migration so that we can test.

Setting NEEDINFO
Comment 4 Tim 2019-03-14 10:13:55 UTC
(In reply to Alex Thurgood from comment #2)
> In a directly created embedded firebird ODB file, I see the following in a
> simple form created via the wizard :
> 
> AAA
> AAa
> Aaa
> BBB
> BBb
> CCC
> CCc
> 
> 
> That seems correct to me. Reading the documentation, it appears that the
> default sort behaviour for varchar in Firebird is dependent on the collation
> parameter set when the table is created.

Surely I didn't explain well. This would be the order:

ACUÑA GARRIDO
ALVAREZ DOMÍNGUEZ
AREA GIL
Acosta Martínez
Alonso Fernández
ÁLVAREZ Gómez
Águeda Rodríguez
Comment 5 Tim 2019-03-14 10:20:55 UTC
(In reply to Alex Thurgood from comment #3)
> Perhaps the migration code sets a specific collation that doesn't match the
> one that the user set under hsqldb ?
> 
> Without a sample hsqldb with which to try a migration, I'm just clutching at
> straws here.
> 
> @Tim : can you please provide a sammple hsqldb embedded ODB file that shows
> the erroneous behaviour after migration so that we can test.
> 
> Setting NEEDINFO

I tried to delete records in the tables and leave a few, because I have more than 600, and couldn't do it. This is the Error Message:
SQL Status: 23000
Error code: -8
Integrity constraint violation SYS_FK_198 table: Pers_Minist in statement [DELETE FROM "Personas" WHERE "IdPers" = ?]

I could delete some, one by one, but would take to much time, and it's all personal data I can't share.
Comment 6 Tim 2019-03-14 10:28:07 UTC
(In reply to Tim from comment #4)
> (In reply to Alex Thurgood from comment #2)
> > In a directly created embedded firebird ODB file, I see the following in a
> > simple form created via the wizard :
> > 
> > AAA
> > AAa
> > Aaa
> > BBB
> > BBb
> > CCC
> > CCc
> > 
> > 
> > That seems correct to me. Reading the documentation, it appears that the
> > default sort behaviour for varchar in Firebird is dependent on the collation
> > parameter set when the table is created.
> 
> Surely I didn't explain well. This would be the order:
> 
> ACUÑA GARRIDO
> ALVAREZ DOMÍNGUEZ
> AREA GIL
> Acosta Martínez
> Alonso Fernández
> ÁLVAREZ Gómez
> Águeda Rodríguez

I changed all to lower case except first letter, and except other cases, and I get this out of order, because of accents and lower case:

Alonso Salgado
Alonso Álvarez -> This one should go first
Zincke Pan
de Colombia -> This one should go before "Zincke..."
Álvarez Rodríguez -> This one should go after "Alonso..."
Comment 7 Xisco Faulí 2019-03-21 10:13:35 UTC
Thank you for reporting the bug. Please attach a sample document, as this makes it easier for us to verify the bug. 
(Please note that the attachment will be public, remove any sensitive information before attaching it. 
See https://wiki.documentfoundation.org/QA/FAQ#How_can_I_eliminate_confidential_data_from_a_sample_document.3F for help on how to do so.)
Comment 8 Lionel Elie Mamane 2019-03-21 10:21:27 UTC
This definitely looks like the collation set on the database/table/field is not the one that the user would find useful. Looks like a NONE/OCTETS/ASCII/UNICODE collation, while user would like some UNICODE_CI/UNICODE_CI_AI/ES_ES_CI_AI
Comment 9 Tim 2019-05-18 22:51:28 UTC
Is there anything I can do to solve this?
Comment 10 Tim 2019-05-18 22:58:10 UTC Comment hidden (obsolete)
Comment 11 Tim 2019-05-18 23:02:34 UTC Comment hidden (obsolete)
Comment 12 QA Administrators 2019-05-19 02:51:47 UTC Comment hidden (obsolete)
Comment 13 Stang 2019-05-19 19:38:43 UTC
@Tim see the answers in the following to set 'collate':

https://ask.libreoffice.org/en/question/191187/sorting-in-base-with-german-umlaute/
Comment 14 Tim 2019-05-21 09:00:08 UTC
Created attachment 151549 [details]
Sort test

I've found a sample I attach (Sort_Test) and both Query1 and Query2 do it well, and only Query2 has the command COLLATE UNICODE asc to work under SQL command directly!

But if I copy the Table and both Queries into one of my Database only Query2 will give the right order ASC.

I also tried the other way round, copy my tables and queries to the Sort-Test, and the order only will be right using collate unicode under sql command directly.
Comment 15 Stang 2019-05-24 21:11:54 UTC
@Tim

You did not comprehend the information presented in the post and links in all answers.

The information in the post does explain what you observe.  If continued lack of understanding, please ask under that post.
Comment 16 Alex Thurgood 2019-05-27 07:59:43 UTC
The real question to be answered, which may or may not be a bug, is that either:

-  the Firebird migration assistant ignores any previously set collation with the existing hsqldb table and sets a default collation which doesn't allow sorting in the way people expect with hsqldb ;

or

- if no default collation was set with the existing hsqldb table, then the Firebird default collation sort behaviour is applied.

As the default behaviour under Firebird is different to hsqldb, the way to workaround this is to:

- either set the collation at the Query level escaping the built-in SQL parser (run in direct SQL mode);

- or ALTER the table definition to add the required collation.


Currently, with the limited information Tim has provided, there is no way of telling where the problem lies.

@Tim: as has been requested many times, we need a representative hsqldb ODB file that displays the behaviour you describe so that we can examine whether it is indeed the migration assistant that is not working correctly, or whether it is just down to a difference in the way Firebird behaves compared to hsqldb in the absence of any specified collation.
Comment 17 Alex Thurgood 2019-05-27 08:16:03 UTC
See also:

http://www.firebirdtest.com/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-ddl-collation.html#fblangref25-ddl-tbl-collatcrt

for CREATE COLLATION statements.

If I have understood this page correctly, for the DDL to succeed, this requires the presence of fbintl.conf, as it is inspected when the statement is run for the presence of the corresponding collation. Unfortunately, I'm not certain that we provide the .conf file in our embedded ODB when it is created, so the DDL might fail silently anyway.
Comment 18 Alex Thurgood 2019-05-27 08:25:43 UTC
Reading bug 116935, I see that there is already a request for COLLATION options to be offered at the UI level...

this implies that a DDL statement to create a collation at table/field level works.

So, if it can't be shown that the migration assistant is faulty, e.g. for ignoring a default pre-set collation in hsqldb, or for setting an incongruous collation on migration, then the workaround is indeed to use the COLLATE statement when creating the table (or subsequently issue an ALTER TABLE statement to add it).

If it can be shown that the migration assistant incorrectly assigns a new collation or ignores an existing collation, then we can confirm this bug report (with a modified title).
Comment 19 Tim 2019-05-27 10:45:59 UTC
Created attachment 151701 [details]
Data base wrong sort

Data base sample requested to find out the reason for not sorting in proper order after migration to firebird.
Comment 20 Alex Thurgood 2019-05-27 14:44:42 UTC
@Tim : thanks !

The SCRIPT file for the embedded hsqldb reads as follows:

SET DATABASE COLLATION "Spanish"
CREATE SCHEMA PUBLIC AUTHORIZATION DBA
CREATE CACHED TABLE "Library"("ID" INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 0) NOT NULL PRIMARY KEY,"BookID" INTEGER,"Title" VARCHAR(100),"Author" VARCHAR(50),"Publisher" VARCHAR(50),"CoverType" VARCHAR(50),"Pages" INTEGER,"PurchasedAt" VARCHAR(50),"DatePurchased" DATE,"Rating" VARCHAR(50),"Notes" LONGVARCHAR,"YearRead" NUMERIC(4),"Genre" INTEGER)
CREATE CACHED TABLE "G\u00e9nero"("ID" INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 0) NOT NULL PRIMARY KEY,"G\u00e9nero" VARCHAR(20))
ALTER TABLE "Library" ADD CONSTRAINT SYS_FK_87 FOREIGN KEY("Genre") REFERENCES "G\u00e9nero"("ID")
SET TABLE "Library" INDEX'6432 7368 78'
SET TABLE "G\u00e9nero" INDEX'19184 13'
ALTER TABLE "Library" ALTER COLUMN "ID" RESTART WITH 78
ALTER TABLE "G\u00e9nero" ALTER COLUMN "ID" RESTART WITH 13
CREATE USER SA PASSWORD ""
GRANT DBA TO SA
SET WRITE_DELAY 60

so at least we can confirm that collation is set in the file.
Comment 21 Tim 2019-05-28 09:02:09 UTC
Thanks! And I've just found out that another bug I've posted seems to have something to do with this one!

The other bug is that after I time opening, writing, editing Writer files, if I open one of my data base I get this error when I try to see the tables o ejecute a Query o Form...

firebird_sdbc error:
*connection shutdown
caused by
'isc_service_attach'
I close Libreoffie, resort and it will go.

Now, what I've found out is that when this happens, if I open a Data base that sorts correctly, because it wasn't imported from hsqldb and was made new with firebird embedded, I don't get the error of connection...!!!
Comment 22 Tim 2019-05-28 09:14:00 UTC
Another observation:

New Firebird embedded Base I can go to Edit -> Database -> Advanced Settings...

In migrated from hsqldb I go to Edit -> Database and then all options are in gray and can't access Advanced Settings...
Comment 23 Tim 2019-05-28 09:15:24 UTC
(In reply to Tim from comment #21)
> Thanks! And I've just found out that another bug I've posted seems to have
> something to do with this one!
> 
> The other bug is that after I time opening, writing, editing Writer files,
> if I open one of my data base I get this error when I try to see the tables
> o ejecute a Query o Form...
> 
> firebird_sdbc error:
> *connection shutdown
> caused by
> 'isc_service_attach'
> I close Libreoffie, resort and it will go.
> 
> Now, what I've found out is that when this happens, if I open a Data base
> that sorts correctly, because it wasn't imported from hsqldb and was made
> new with firebird embedded, I don't get the error of connection...!!!

* "resort" I mean RESART Libreoffice!
Comment 24 QA Administrators 2019-05-29 02:54:03 UTC Comment hidden (obsolete)
Comment 25 Robert Großkopf 2019-06-17 14:46:01 UTC
The order of characters is Firebird-specific. I have changed the title for this.

Note: Has nothing to do with migration, it is the default behaviour of a new created database in LO 6.3.0.0.beta1:

Create a new database
Create a table with ID - INTEGER (primrary key), "Name" - VARCHAR
Input some data:
1  Aaa
2  Ba
3  Za
4  ab
5  ca

This data will be sorted by a query:
SELECT "Name" ORDER BY "Name" ASC
It will be shown the same way as sorted above: Capital letters first.
SELECT "Name" ORDER BY "Name" COLLATE UNICODE ASC
It will only work in direct SQL, so unusable for input data. But it will order "Name" as the order will be in HSQLDB:
Aaa
ab
Ba
ca
Za

I could confirm this buggy behaviour with LO 6.3.0.0.beta1 on OpenSUSE 15, 64bit rpm Linux with a new created Firebird database.
Comment 26 Robert Großkopf 2019-06-17 14:50:34 UTC
(In reply to Robert Großkopf from comment #25)
> This data will be sorted by a query:
> SELECT "Name" FROM "Table" ORDER BY "Name" ASC
> It will be shown the same way as sorted above: Capital letters first.
> SELECT "Name" FROM "Table" ORDER BY "Name" COLLATE UNICODE ASC

Wrong queries - corrected.
Comment 27 Tim 2020-03-27 16:09:51 UTC
(In reply to Robert Großkopf from comment #25)
> The order of characters is Firebird-specific. I have changed the title for
> this.
> 
> Note: Has nothing to do with migration, it is the default behaviour of a new
> created database in LO 6.3.0.0.beta1:
> 
> Create a new database
> Create a table with ID - INTEGER (primrary key), "Name" - VARCHAR
> Input some data:
> 1  Aaa
> 2  Ba
> 3  Za
> 4  ab
> 5  ca
> 
> This data will be sorted by a query:
> SELECT "Name" ORDER BY "Name" ASC
> It will be shown the same way as sorted above: Capital letters first.
> SELECT "Name" ORDER BY "Name" COLLATE UNICODE ASC
> It will only work in direct SQL, so unusable for input data. But it will
> order "Name" as the order will be in HSQLDB:
> Aaa
> ab
> Ba
> ca
> Za
> 
> I could confirm this buggy behaviour with LO 6.3.0.0.beta1 on OpenSUSE 15,
> 64bit rpm Linux with a new created Firebird database.

Still the same behavior with LO 6.4.1.2 under MACOS with a new created Base File using embeded Firebird.
Comment 28 malt25 2022-01-16 10:33:54 UTC
(In reply to Alex Thurgood from comment #16)
> As the default behaviour under Firebird is different to hsqldb, the way to
> workaround this is to:
> 
> - either set the collation at the Query level escaping the built-in SQL
> parser (run in direct SQL mode);
> 
> - or ALTER the table definition to add the required collation.
@Alex Thurgood comment 16-18

Please tell me the exact SQL command to alter the table definition for the whole odb file or one complete table in it to UNICODE ASC collation via Tools/SQL.
Comment 29 malt25 2022-01-17 18:32:25 UTC
Tools/SQL

ALTER CHARACTER SET UTF8
       SET DEFAULT COLLATION UNICODE

changes the collation for new tables to UNICODE, but not for existing tables.
Comment 30 Robert Großkopf 2023-06-28 06:33:04 UTC
Isn't this one a feature request?

There should be executed
ALTER CHARACTER SET UTF8 SET DEFAULT COLLATION UNICODE
directly after a new Firebird database has been created.

Have forgotten this in last project and have to execute the SQL-code, copy and paste the tables and reconnect the relationships. Nothing for a "normal user".
Comment 31 Mike Kaganski 2024-07-20 07:16:44 UTC
So, we still don't have a single HSQLDB file with a wanted collation, which we could try converting to Firebird to see how the conversion changes the collation in the result?

This one needs a sample - let me stress it again: a sample HSQLDB embedded file.
Comment 32 Robert Großkopf 2024-07-20 15:16:11 UTC
(In reply to Mike Kaganski from comment #31)
> So, we still don't have a single HSQLDB file with a wanted collation, which
> we could try converting to Firebird to see how the conversion changes the
> collation in the result?
> 
> This one needs a sample - let me stress it again: a sample HSQLDB embedded
> file.

This has nothing to do with converting from HSQLDB to Firebird. HSQLDB will order the entries right way. It will always set the collation to local settings of your LO system, here for example to 
SET DATABASE COLLATION "German"

It is the default of Firebird, which doesn't work well. For internal Firebird it would work to set the collation when creating a new database as described.
It has to be collation UNICODE:
ALTER CHARACTER SET UTF8 SET DEFAULT COLLATION UNICODE
will give the right behavior. But we could also try to set collation by reading the local settings.
But another bug will appear at this moment when you try to set the collation and export internal database to an external database: https://bugs.documentfoundation.org/show_bug.cgi?id=159155
Comment 33 Mike Kaganski 2024-07-20 18:48:41 UTC
Could you please give me very specific detailed steps to see the problem? I am lost, what steps would allow me to see and understand the problem discussed here (and only here, without another bug please :-))
Comment 34 Robert Großkopf 2024-07-21 06:18:19 UTC
Created attachment 195415 [details]
Different sorting in queries: wrong for default, right with Collation Unicode

Open the attached database.
Execute the 2 queries and see the different sorting behavior.
Both have been sorted by "Name", but the default sorting is totally wrong. It will show all content with a capital character at the top.

The query, which sorts with Collation Unicode could only be executed. It is direct SQL. So no values could been added to this query.
Comment 35 Mike Kaganski 2024-07-21 17:04:58 UTC
(In reply to Robert Großkopf from comment #34)
> Both have been sorted by "Name", but the default sorting is totally wrong.
> It will show all content with a capital character at the top.

Why? The default sort, IIUC, just uses the character codes numeric values; and indeed, uppercase characters (usually; always for ASCII) are located before lowercase characters in the encodings.
Comment 36 Robert Großkopf 2024-07-22 05:37:07 UTC
(In reply to Mike Kaganski from comment #35)
> (In reply to Robert Großkopf from comment #34)
> > Both have been sorted by "Name", but the default sorting is totally wrong.
> > It will show all content with a capital character at the top.
> 
> Why? The default sort, IIUC, just uses the character codes numeric values;
> and indeed, uppercase characters (usually; always for ASCII) are located
> before lowercase characters in the encodings.

This is only the reason why it is sorted this way. But it isn't the expected sorting for normal use. Take a dictionary and have a look. Never is sorted 'ca' after 'Za'.
Comment 37 Mike Kaganski 2024-07-22 06:06:10 UTC
(In reply to Robert Großkopf from comment #36)

But is the "dictionary" sorting the expected sorting here? Who told that? (Note that I do not claim it's OK - just need some rationale, beyond "some unrelated thing like dictionary has something else").
This is the essence of my comment 31: I do not see the actual problem here. If it is "the default collation of a new database / table / field should be X", it's one thing (and needs a discussion); if it's "it doesn't copy existing collation when converting tables", it's a different thing (and it would be a direct bug).
Comment 38 Robert Großkopf 2024-07-22 06:48:30 UTC
(In reply to Mike Kaganski from comment #37)
> (In reply to Robert Großkopf from comment #36)
> 
> But is the "dictionary" sorting the expected sorting here? Who told that?
> (Note that I do not claim it's OK - just need some rationale, beyond "some
> unrelated thing like dictionary has something else").
> This is the essence of my comment 31: I do not see the actual problem here.
> If it is "the default collation of a new database / table / field should be
> X", it's one thing (and needs a discussion);

This is the discussion here. The default collation is a collation no other database with direct driver will use. It sorts in a way it is never expected by a user.

Examples: HSQLDB sorts
a
A
b
B 

Same way of sorting for PostgreSQL and direct connection, same with MySQL/MariaDB and direct connection.

SQLite, connected by JDBC, will sort the same as internal Firebird.
Collation UNICODE won't show the same behavior as sorting in all other databases with direct connection. It will set upper cases first and then lower cases. But there is no other collation I tested for Firebird, which would work with UTF8 ans sort the same way as all other databases will sort with direct connection.
Comment 39 jcsanz 2024-07-22 15:57:16 UTC
(In reply to Robert Großkopf from comment #38)
> (In reply to Mike Kaganski from comment #37)
> > (In reply to Robert Großkopf from comment #36)
> > 
> > But is the "dictionary" sorting the expected sorting here? Who told that?
> > (Note that I do not claim it's OK - just need some rationale, beyond "some
> > unrelated thing like dictionary has something else").
> > This is the essence of my comment 31: I do not see the actual problem here.
> > If it is "the default collation of a new database / table / field should be
> > X", it's one thing (and needs a discussion);
> 
> This is the discussion here. The default collation is a collation no other
> database with direct driver will use. It sorts in a way it is never expected
> by a user.
> 
Well, may be not other users, but me always expect capital letters before small letters, as was ordered in the ASCII table, which was the typical order used in computer science 
> Examples: HSQLDB sorts
> a
> A
> b
> B 
> 
> Same way of sorting for PostgreSQL and direct connection, same with
> MySQL/MariaDB and direct connection.
> 
> SQLite, connected by JDBC, will sort the same as internal Firebird.
> Collation UNICODE won't show the same behavior as sorting in all other
> databases with direct connection. It will set upper cases first and then
> lower cases. But there is no other collation I tested for Firebird, which
> would work with UTF8 ans sort the same way as all other databases will sort
> with direct connection.

Also, as ASCII, in the Unicode order, capital letters are placed before small letters, and both of them before accented letters, if you order them by their number code.
Comment 40 Robert Großkopf 2024-07-22 16:38:44 UTC
Please: This all has nothing to do with the reason for the order in Firebird. It has something to do what a user expect when he starts a form with many rows of data and will sort this data to get data he want. And the user expects it the way HSQLDB, MariaDB and PostgreSQL, all with drivers from LO, will sort. If this is sorted 'a' in front of 'A' or after 'A' isn't the problem. But 'a' after 'Z' is a horror.

Example: I have a table for invoice codes - international published, only this table should be used. I try to order this table by the name of the code.
Find "A Wheeled pallet with raised rim (81 x 72 x 135)" at row 4.
Find "Wooden pallet 40 cm x 80 cm" at row 490.
Find "access line" at row 491.
The whole table has 2161 rows. Lower cased letter start at row 491.

And now argue with ASCII and computer science. When trying this I didn't get anybody, who will use this kind of sorting in his database and all people will ask me: "Why can't you create the database in HSQLDB? It isn't usable for me this way."