Bug 104905 - Firebird: Function LOWER doesn't lower special characters
Summary: Firebird: Function LOWER doesn't lower special characters
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base (show other bugs)
Version:
(earliest affected)
5.4.0.0.alpha0+
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on: 105142
Blocks:
  Show dependency treegraph
 
Reported: 2016-12-24 10:45 UTC by Robert Großkopf
Modified: 2019-09-25 13:10 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
Open the query "Lower" and have a look at first row - isn't set lower. (3.97 KB, application/vnd.oasis.opendocument.database)
2016-12-24 10:45 UTC, Robert Großkopf
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Robert Großkopf 2016-12-24 10:45:19 UTC
Created attachment 129917 [details]
Open the query "Lower" and have a look at first row - isn't set lower.

Open the attached database.
Open the query "Lower".
Special character in the first row hasn't set lower at all.

Query is:
SELECT "Name", LOWER ( "Name" ) FROM "Table2"
Gives:
Ärger | Ärger
Gloria | gloria
Größe | größe

Works right with internal HSQLDB, fails with internal Firebird

Tested with
Version: 5.4.0.0.alpha0+
Build ID: 2a4cd80abcf9e515d1ce3b3a944b573bdc42bff2
CPU Threads: 4; OS Version: Linux 4.1; UI Render: default; VCL: kde4; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2016-12-22_00:18:04
Locale: de-DE (de_DE.UTF-8); Calc: group
Comment 1 Robert Großkopf 2016-12-24 11:40:08 UTC
Same behavior with function "Upper". Will set all upper, but doesn't set special characters.

Seems the internal Firebird has a big UTF8-problem. Nearly no function for text will work as expected with special characters. All functions, which will set characters to lower and upper or which will need to count characters won't work right.
Comment 2 Julien Nabet 2016-12-28 13:45:27 UTC
I didn't find some Firebird 3.0 doc about Lower function, only this:
http://firebirdsql.org/refdocs/langrefupd25-intfunc-lower.html

quotation:
"With ASCII or NONE for instance, only ASCII characters are lowercased; with OCTETS, the entire string is returned unchanged"

Lionel/Tamas: thought you might be interested in this one.
Comment 3 Lionel Elie Mamane 2016-12-28 13:52:04 UTC
What character set is assigned? See e.g. http://www.destructor.de/firebird/charsets.htm
Comment 4 Lionel Elie Mamane 2016-12-28 13:56:33 UTC
And collation, also. Try:
 SELECT "Name", LOWER ( "Name" COLLATE UNICODE ) FROM "Table2"
 SELECT "Name", LOWER ( "Name" COLLATE DE_DE ) FROM "Table2"
 SELECT "Name", LOWER ( "Name" COLLATE UTF-8 ) FROM "Table2"
Comment 5 Robert Großkopf 2016-12-28 16:39:48 UTC
(In reply to Lionel Elie Mamane from comment #3)
> What character set is assigned? See e.g.
> http://www.destructor.de/firebird/charsets.htm

Have tried to create a table with direct SQL as defined above. The following error appeared.

COLLATION DE_DE for CHARACTER SET NONE is not defined

I'm only using the internal Firebird. I haven't defined any collation before, because I have never had a database, where no collation hasn't be defined. Don't know if there is any character-set for Firebird installed at all.
Comment 6 Robert Großkopf 2016-12-28 16:44:24 UTC
(In reply to Lionel Elie Mamane from comment #4)
> And collation, also. Try:
>  SELECT "Name", LOWER ( "Name" COLLATE UNICODE ) FROM "Table2"
>  SELECT "Name", LOWER ( "Name" COLLATE DE_DE ) FROM "Table2"
>  SELECT "Name", LOWER ( "Name" COLLATE UTF-8 ) FROM "Table2"

Changing a query like this gives:
COLLATION UTF for CHARACTER SET NONE is not defined
... for every collation. And a query like this won't work in GUI, only direct SQL, so the query couldn't be used for input or changing data any more.
Comment 7 Lionel Elie Mamane 2016-12-28 17:11:16 UTC
(In reply to robert from comment #5)

> Have tried to create a table with direct SQL as defined above. The following
> error appeared.

> COLLATION DE_DE for CHARACTER SET NONE is not defined

Looks like you need to set a character set _and_ a character set. You have "NONE" as character set, need "UTF8" or "ISO8859_1". E.g.:

CREATE TABLE users (
  "NAME" VARCHAR(50) CHARACTER SET UTF8 COLLATE UNICODE,
  ...
);


Possibly we should just set:
 * default character set UTF8 at the database level when it is created
 * UNICODE collation by default when the table is created through GUI (unless we can set that at database level, too?)

Let's first finish this through (see if it works...) and see about setting the default then.

(In reply to robert from comment #6)
> (In reply to Lionel Elie Mamane from comment #4)
> > And collation, also. Try:
> >  SELECT "Name", LOWER ( "Name" COLLATE UNICODE ) FROM "Table2"
> >  SELECT "Name", LOWER ( "Name" COLLATE DE_DE ) FROM "Table2"
> >  SELECT "Name", LOWER ( "Name" COLLATE UTF-8 ) FROM "Table2"
> 
> Changing a query like this gives:
> COLLATION UTF for CHARACTER SET NONE is not defined
> ... for every collation. And a query like this won't work in GUI, only
> direct SQL, so the query couldn't be used for input or changing data any
> more.

I think if the character set / collation are set at the table (or database level), the query will work without an explicit collation. I'm just trying to make a test here.

_After_ setting the character set, maybe
SELECT "Name", LOWER ( "Name" COLLATE 'UTF-8' ) FROM "Table2"
will work better (quotes around UTF-8).
Comment 8 Robert Großkopf 2016-12-28 18:09:39 UTC
(In reply to Lionel Elie Mamane from comment #7)
> (In reply to robert from comment #5)
> 
> Looks like you need to set a character set _and_ a character set. You have
> "NONE" as character set, need "UTF8" or "ISO8859_1". E.g.:
> 
> CREATE TABLE users (
>   "NAME" VARCHAR(50) CHARACTER SET UTF8 COLLATE UNICODE,
>   ...
> );

This one works. It seems it will solve all problems with special characters (Lower, Left, length of characters ...)
Couldn't we set this to database-default? I don't know a possibility to set this for the database after the *.odb-file has been created by LibreOffice.
Comment 9 Alex Thurgood 2017-01-06 11:06:19 UTC
Setting to new, per comment 8.

Perhaps we should open a new report with regard to having the UI at least specify a minimum collation / character set when the table is created like we do already for embedded hsqldb.

The script file in our embedded hsqldb appears to take collation from the locale information, e.g. :

SET DATABASE COLLATION "French"
Comment 10 Alex Thurgood 2017-01-06 11:12:21 UTC
I opened bug 105422 for the automatic generation of a collation statement on new FB3 db creation.
Comment 11 Alex Thurgood 2017-01-06 11:12:45 UTC
Sorry, that should be bug 105142
Comment 12 Tamas Bunth 2017-01-30 14:55:35 UTC
I pushed a patch to master:

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1db423338899c71ba70e361af339d7b7e4aff61f

Note, that to test it you have to create a brand new database (the attachment is not good), because the default character set is set while creating a database.
Comment 13 Robert Großkopf 2017-02-01 18:45:08 UTC
Have tested it, works now.

Thanks to Tamas.

Could we get fixed it also in LO 5.3.1?
Comment 14 dnahrblock 2018-01-22 10:01:39 UTC Comment hidden (spam)
Comment 15 wegmansconnect 2018-11-12 10:19:17 UTC Comment hidden (spam)
Comment 16 ArtShools 2019-08-07 08:15:09 UTC Comment hidden (spam)
Comment 17 likecareer 2019-08-07 08:41:32 UTC Comment hidden (spam)
Comment 18 educationcareeradvisors 2019-08-07 08:51:50 UTC Comment hidden (spam)
Comment 19 elephant-sky 2019-09-25 13:10:21 UTC Comment hidden (spam)