Bug 130434 - Cyrillic characters are crippled when the copied Base table is pasted in Calc
Summary: Cyrillic characters are crippled when the copied Base table is pasted in Calc
Status: RESOLVED DUPLICATE of bug 126940
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Base
Version (earliest affected): 6.2.8.2 release
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-02-04 17:56 UTC by Sergey Nemna
Modified: 2020-02-10 17:42 UTC
CC List: 1 user

See Also:
Crash report or crash signature:


Attachments
The troubled file (3.37 KB, application/vnd.sun.xml.base)
2020-02-04 17:58 UTC, Sergey Nemna
Another example with Chinese text in records (3.63 KB, application/vnd.sun.xml.base)
2020-02-05 13:12 UTC, Ming Hua
Copy-paste results for Ming Hua's file (64.25 KB, image/png)
2020-02-05 13:30 UTC, Sergey Nemna
Example database file with different scripts (3.64 KB, application/vnd.sun.xml.base)
2020-02-07 07:02 UTC, Ming Hua

Description Sergey Nemna 2020-02-04 17:56:04 UTC
Description:
Hello,

I guess this bug may have been discovered and described elsewhere, but I failed to find it.

I have a Base table (please see the attachment) containing a mixture of Latin and Cyrillic characters. If I copy the table to Calc via the «Copy» command in the Base context menu, all Cyrillic characters in Calc are replaced with their Latin diacritic counterparts (code page Windows-1251 -> Windows-1252).
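
The substitution described above matches what happens when text encoded in Windows-1251 is decoded as Windows-1252. A minimal Python sketch, assuming a hypothetical sample string (the contents of the attached file are not reproduced here) and making no claim about where in LibreOffice the misinterpretation occurs:

```python
# Hypothetical sample text; not taken from the attached database.
original = "Тест"

# Encode with the Cyrillic code page, then (wrongly) decode the same
# bytes with the Western European code page.
raw = original.encode("cp1251")
garbled = raw.decode("cp1252")
print(garbled)  # Òåñò

# The damage is reversible as long as no bytes were lost on the way.
restored = garbled.encode("cp1252").decode("cp1251")
print(restored == original)  # True
```

Each Cyrillic letter maps to one accented Latin letter because both code pages are single-byte and use the same byte range for their letters.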

However, if I drag the table onto the Calc worksheet, it works without a problem. Cyrillic characters are also preserved when copying in the opposite direction, from Calc to Base (either by dragging the worksheet to Base or by copying Calc cells and pasting them on the «Tables» pane of Base, which creates a new table).

This bug does not make life impossible; after all, dragging from Base to Calc works. Nevertheless, it would be nice to know whether other people are affected and whether a fix can be expected any time soon.

P. S. I use LO 6.3.4.2 on Windows 7 x64.

P. P. S. It does not make any difference whether it is HSQLDB or Firebird DB, the bug is still there.

Steps to Reproduce:
1. Open the attached odb.
2. Right-click the only table, select «Copy».
3. Switch to Calc; click «Paste».

Actual Results:
Cyrillic characters are replaced with their Latin diacritical counterparts.

Expected Results:
Cyrillic characters should be preserved.


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Sergey Nemna 2020-02-04 17:58:34 UTC
Created attachment 157648
The troubled file
Comment 2 Alex Thurgood 2020-02-05 06:59:19 UTC
Testing with 

Version: 6.3.4.2
Build ID: 60da17e045e08f1793c57c00ba83cdfce946d0aa
CPU threads: 4; OS: Mac OS X 10.15.3; UI render: default; VCL: osx; 
Locale: fr-FR (fr_FR.UTF-8); UI-Language: fr-FR
Calc: threaded

I am unable to reproduce the problem. As far as I can tell, the Cyrillic characters are copied over to the Calc sheet correctly, irrespective of the method used. Possibly a Windows-only bug?
Comment 3 Sergey Nemna 2020-02-05 08:42:43 UTC
Yes, Alex is right. I can't reproduce it on Lubuntu (the same version of LO) either. Will someone please try to verify the bug on Windows?
Comment 4 Ming Hua 2020-02-05 11:43:23 UTC
I can reproduce this with 6.2.8 on Windows 10 using the attached example: the context menu "Copy" gives wrong characters, while drag and drop gives correct characters:
Version: 6.2.8.2 (x64)
Build ID: f82ddfca21ebc1e222a662a32b25c0c9d20169ee
CPU threads: 2; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: zh-CN (zh_CN); UI-Language: en-US
Calc: threaded

Although in my case, the result of pasting in Calc is scrambled characters (the first record shows "§´§Ö§ã§ä" and "§³§ß§à§Ó§Ñ §ä§Ö§ã§ä", for example), not Latin counterparts of Cyrillic characters as Sergey described.  Not very surprising though as this bug is apparently locale dependent.
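
The scrambled strings quoted above are consistent with Cyrillic text encoded in GBK (code page 936, presumably the ANSI code page on a zh-CN Windows system; this is an assumption, not something stated in the report) and then decoded as Latin-1. A minimal Python sketch that reverses that damage, with `undo_mojibake` being a hypothetical helper name:

```python
def undo_mojibake(garbled, wrong="latin-1", actual="gbk"):
    """Re-encode with the wrongly applied codec, then decode with the
    codec assumed to have produced the bytes in the first place."""
    return garbled.encode(wrong).decode(actual)


# The two strings quoted in the comment above.
samples = ["§´§Ö§ã§ä", "§³§ß§à§Ó§Ñ §ä§Ö§ã§ä"]
recovered = [undo_mojibake(s) for s in samples]
for text in recovered:
    print(text)
```

Under these assumptions the two strings come back as "Тест" and "Снова тест", which supports the view that the result depends on the system code page.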

I think I've seen similar things happen for Chinese text, so this may affect other languages as well.

Also, I don't think this bug is Windows only, but no one is going to reproduce it on Linux with a UTF-8 locale.
Comment 5 Sergey Nemna 2020-02-05 12:24:43 UTC
(In reply to Ming Hua from comment #4)

Thank you for the testing you've done (there was a chance something was wrong with my profile, but I was not thrilled with the idea of resetting it).

I agree the characters should be expected to be scrambled in a different way with another locale.

I am not sure I understand what exactly «Linux with a UTF-8 locale» means, but I tested it with a more or less standard Lubuntu (default English interface, though with a Russian keyboard layout installed) and found it working perfectly fine in all cases described in the original post.
Comment 6 Ming Hua 2020-02-05 12:41:10 UTC
(In reply to Sergey Nemna from comment #5)
>
> I am not sure if I understand what exactly is «Linux with an UTF-8 locale»
Linux distributions all use UTF-8 encoding by default these days.  I think Windows is the only platform reluctant to switch.  Since you seem familiar with the "codepage 1251" notation: UTF-8 is just like another codepage, but one that accommodates almost all languages.

The equivalent of codepage 1251 for Russian should be ISO-8859-5, so to reproduce this bug on Linux one needs to run LibO in the ru_RU.ISO-8859-5 locale, instead of the default ru_RU.UTF-8 locale (sometimes just abbreviated as ru_RU) for Russian.  This is all speculation on my part, of course, as I don't have Linux here to test myself.
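
The difference between a legacy code page and UTF-8 can be sketched in Python by checking which encodings are able to represent which scripts. The sample strings below are hypothetical; the codec names are Python's standard ones:

```python
# Hypothetical sample records, one per script.
samples = {
    "ASCII": "Test",
    "Latin-1": "Café",
    "Cyrillic": "Тест",
    "Chinese": "测试",
}
encodings = ["cp1251", "cp1252", "iso-8859-5", "gbk", "utf-8"]


def can_encode(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# For each script, list the encodings that can hold it without loss.
coverage = {
    name: [enc for enc in encodings if can_encode(text, enc)]
    for name, text in samples.items()
}
for name, encs in coverage.items():
    print(f"{name}: {', '.join(encs)}")
```

Only UTF-8 appears in every row, which is why a system running in a UTF-8 locale would not be expected to show this kind of corruption.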
Comment 7 Ming Hua 2020-02-05 13:12:57 UTC
Created attachment 157670
Another example with Chinese text in records

Attached is another example with Chinese text, showing the same behavior:
- Context menu > Copy > Paste in Calc => wrong result
- Drag and drop to Calc => correct result
Comment 8 Sergey Nemna 2020-02-05 13:29:26 UTC
(In reply to Ming Hua from comment #7)

It looks like there are more puzzles here than we thought. Actually, copying and pasting the table from your file leaves the characters unchanged, just as dragging does. There is no difference at all. Check the attached screenshot (it shows the copy-paste result).
Comment 9 Sergey Nemna 2020-02-05 13:30:43 UTC
Created attachment 157671
Copy-paste results for Ming Hua's file
Comment 10 Ming Hua 2020-02-07 07:02:37 UTC
Created attachment 157716
Example database file with different scripts

Hmm, such an elusive bug.

I've attached a new example ODB file with ASCII, Latin1, Cyrillic, and Chinese text as separate records.  Hope it helps people with different locales to reproduce this bug.
Comment 11 Ming Hua 2020-02-09 08:17:13 UTC
I've found bug 126940, which looks like exactly the same problem.  According to that bug, it should have been fixed in the 6.4.x series.

And indeed, when tested in 6.4.1 RC1:
Version: 6.4.1.1 (x64)
Build ID: 56f3c78975db08733f771c53643b5d1aa7c57567
CPU threads: 2; OS: Windows 10.0 Build 18363; UI render: GL; VCL: win; 
Locale: zh-CN (zh_CN); UI-Language: zh-CN
Calc: threaded
Copy-and-paste from Base to Calc works for me.

Sergey, would you please try the newly released 6.4.0 (maybe with parallel installation - https://wiki.documentfoundation.org/Installing_in_parallel) and test if it's fixed for you as well?
Comment 12 Sergey Nemna 2020-02-10 16:39:17 UTC
@Ming Hua

Yes, your guess is totally right. I can confirm it does work in LO 6.4.1 RC1 on Windows for Cyrillic characters. I checked your file with Chinese characters and found them to be transferred to Calc without any distortion as well (although that already worked for me in LO 6.3.4). Thank you for the investigation you conducted. I really appreciate your help.

I suppose the issue can now be marked as «Resolved».
Comment 13 Ming Hua 2020-02-10 17:42:23 UTC

*** This bug has been marked as a duplicate of bug 126940 ***