Bug 104956 - Firebird: Tabledesign - VARCHAR counts wrong with special characters
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base
Version (earliest affected): 5.4.0.0.alpha0+
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-12-28 10:37 UTC by Robert Großkopf
Modified: 2022-01-02 16:09 UTC

See Also:
Crash report or crash signature:


Attachments
Open the table and try to put text with special characters in the 5-characters-field - is counted wrong. (2.81 KB, application/vnd.oasis.opendocument.database)
2016-12-28 10:37 UTC, Robert Großkopf
test (2.88 KB, application/vnd.oasis.opendocument.database)
2022-01-02 15:25 UTC, Julien Nabet

Description Robert Großkopf 2016-12-28 10:37:06 UTC
Created attachment 129981
Open the table and try to put text with special characters in the 5-characters-field - is counted wrong.

This bug has the same cause as bug 104904 and bug 104905: Firebird does not count special characters correctly.

Have a look at the table in the attached database. There is a field called "Text_5_char", defined with a length of 5 characters. The first row shows 5 characters, but in the second and third rows you cannot add another character: the text contains a special character, which has been counted as two characters.
Now look at the fourth row. There seem to be 5 characters, but the last character is '�'. I tried to enter 'zuständig' to show that the special character 'ä' is split at the end of the field and can no longer be recognized.
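
The '�' in the fourth row is consistent with the limit being applied to UTF-8 bytes rather than to characters. The following is only a sketch of that assumption, not the actual LibreOffice or Firebird code; the helper name truncate_utf8_bytes is made up for illustration.

```python
# Sketch only: ASSUMES the 5-character field is cut off after 5 UTF-8 bytes.
# truncate_utf8_bytes is a hypothetical helper, not part of any driver.

def truncate_utf8_bytes(text: str, max_bytes: int) -> str:
    """Cut the UTF-8 encoding of text after max_bytes and decode what is left."""
    raw = text.encode("utf-8")[:max_bytes]
    # An incomplete multi-byte sequence at the end becomes U+FFFD.
    return raw.decode("utf-8", errors="replace")

word = "zuständig"
print(len(word))                     # 9 characters
print(len(word.encode("utf-8")))     # 10 bytes, because 'ä' needs 2 bytes
print(truncate_utf8_bytes(word, 5))  # 'zust' followed by U+FFFD
```

The first four letters use one byte each, so the fifth byte is only the first half of 'ä', which cannot be decoded on its own.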

All tested with
Version: 5.4.0.0.alpha0+
Build ID: 2a4cd80abcf9e515d1ce3b3a944b573bdc42bff2
CPU Threads: 4; OS Version: Linux 4.1; UI Render: default; VCL: kde4; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2016-12-22_00:18:04
Locale: de-DE (de_DE.UTF-8); Calc: group
Comment 1 Buovjaga 2017-01-03 16:56:20 UTC
Confirmed.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.4.0.0.alpha0+
Build ID: fc0d4e6bc43d5f982452df07930f5ecf5927ad22
CPU Threads: 8; OS Version: Linux 4.8; UI Render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on December 31st 2016
Comment 2 Robert Großkopf 2017-02-02 18:00:01 UTC
This one has been fixed by the fix for bug 105142. However, another bug has appeared: the field length cannot be set to less than 20.

I will set this bug to fixed and open a new bug for the field length.
Comment 3 Robert Großkopf 2019-12-29 20:07:16 UTC
I have tested it again. The bug has not gone away, but its behaviour has changed:
With a newly created Firebird database you can put 20 ASCII characters into a field created for 5 characters, but only 10 special characters into the same field.
You cannot enter 9 special characters, one ASCII character and another special character. That throws the error
firebird_sdbc error:
*Malformed string
caused by
'isc_dsql_execute'
... instead of showing the '�' seen in the example database.

All tested with LO 6.3.4.2 on OpenSUSE 15.1 64bit rpm Linux
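
These numbers would fit a limit of 20 bytes (5 characters times 4 bytes, the ratio between field length and character length reported later in this bug) instead of a limit of 5 characters. This is a hypothesis, not something confirmed in this report; the sketch below only checks the arithmetic. The names BYTES_PER_CHAR and classify are made up for illustration.

```python
# Sketch only: ASSUMES the field accepts declared_chars * 4 bytes of UTF-8
# and rejects input whose cut-off point falls inside a multi-byte character.

BYTES_PER_CHAR = 4  # assumption: 4 bytes reserved per declared character

def classify(text: str, declared_chars: int) -> str:
    """Return 'fits', 'truncated' or 'malformed' for the assumed byte limit."""
    budget = declared_chars * BYTES_PER_CHAR
    raw = text.encode("utf-8")
    if len(raw) <= budget:
        return "fits"
    try:
        raw[:budget].decode("utf-8")
    except UnicodeDecodeError:
        return "malformed"  # the cut splits a multi-byte character
    return "truncated"

print(classify("a" * 20, 5))             # fits: 20 bytes
print(classify("ä" * 10, 5))             # fits: 10 * 2 = 20 bytes
print(classify("ä" * 9 + "a" + "ä", 5))  # malformed: 21 bytes, cut inside 'ä'
```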
Comment 4 Buovjaga 2020-01-04 19:10:26 UTC
Let's put this back to NEW then, as no one has attempted to tackle this specific report.
Comment 5 Julien Nabet 2022-01-02 15:25:15 UTC
Created attachment 177244
test

On a Debian x86-64 PC with master sources updated today, I cannot reproduce this.

I mean, I created a brand new odb file with embedded Firebird (so I had to enable experimental features).
1)
select * from RDB$DATABASE

shows that the database uses UTF8

2)
SELECT 
        relfields.RDB$RELATION_NAME, 
        relfields.RDB$FIELD_NAME,
        relfields.RDB$DESCRIPTION,
        fields.RDB$FIELD_TYPE,      
        fields.RDB$FIELD_SUB_TYPE,  
        fields.RDB$FIELD_LENGTH,    
        fields.RDB$CHARACTER_LENGTH,  
        charset.RDB$CHARACTER_SET_NAME
        FROM RDB$RELATION_FIELDS relfields 
        JOIN RDB$FIELDS fields 
        on (fields.RDB$FIELD_NAME = relfields.RDB$FIELD_SOURCE) 
        LEFT JOIN RDB$CHARACTER_SETS charset 
        on (fields.RDB$CHARACTER_SET_ID = charset.RDB$CHARACTER_SET_ID) 
        WHERE (1 = 1) 
        AND relfields.RDB$RELATION_NAME = 'Table1'

shows that the character set is UTF8 too, that the character length (the number of characters) is 6, and that the field length (which corresponds to the number of bytes for a string) is 24.
I could enter up to 6 characters (with or without special characters like é) and no more, even if none of them are special characters.
Indeed, trying to enter "abcdefg" gives:
Error inserting the new record /home/julien/lo/libreoffice/connectivity/source/commontools/dbtools.cxx:745
Error code: 1

firebird_sdbc error:
*arithmetic exception, numeric overflow, or string truncation
*string right truncation
*expected length 6, actual 7
caused by
'isc_dsql_execute'
 /home/julien/lo/libreoffice/connectivity/source/drivers/firebird/Util.cxx:68
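
To make the difference between the two lengths concrete: a check against the character length (6) rejects "abcdefg" but accepts six characters that include é, while the field length (24) is simply 6 times 4 bytes. The sketch below only mimics that behaviour in Python; check_char_limit is a made-up helper, not the driver code, and it assumes é is entered as a single code point.

```python
# Sketch only: mimics a limit counted in characters, as observed above.
# check_char_limit is a hypothetical helper, not Firebird or LibreOffice code.

CHARACTER_LENGTH = 6  # RDB$CHARACTER_LENGTH reported by the query
FIELD_LENGTH = 24     # RDB$FIELD_LENGTH reported by the query (bytes)

def check_char_limit(value: str, character_length: int) -> str:
    """Accept value only if it has at most character_length characters."""
    actual = len(value)  # counts code points, not bytes
    if actual > character_length:
        raise ValueError(
            f"string right truncation: expected length {character_length}, "
            f"actual {actual}"
        )
    return value

print(FIELD_LENGTH // CHARACTER_LENGTH)             # 4 bytes per character
print(check_char_limit("abcdéf", CHARACTER_LENGTH)) # accepted: 6 characters
try:
    check_char_limit("abcdefg", CHARACTER_LENGTH)
except ValueError as err:
    print(err)  # expected length 6, actual 7
```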

Did I miss something?
Comment 6 Robert Großkopf 2022-01-02 16:04:45 UTC
Whatever has been changed there:
I can confirm it works with a newly created internal Firebird database. It does not work with the attached Firebird database from 2016-12-28.

Characters are counted correctly without problems. Special characters are handled the same as ASCII characters. The length reported by CHAR_LENGTH() is also correct now.

Tested with a newly created Firebird database, LO 7.3.0.1 under OpenSUSE 15.3.

I will set this one to WORKSFORME.
Comment 7 Julien Nabet 2022-01-02 16:09:45 UTC
Thank you Robert for your feedback! :-)