Bug 122716 - IMPORT Writer can lose encoding for some multibyte symbols when copy-paste from XLSX/Calc
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: Inherited From OOo (earliest affected)
Hardware: All Windows (All)
Importance: medium minor
Assignee: Mike Kaganski
URL:
Whiteboard: target:25.2.0 target:24.8.4
Keywords: filter:xlsx
Duplicates: 123010
Depends on:
Blocks:
 
Reported: 2019-01-14 20:56 UTC by Maxim Britov
Modified: 2024-11-18 11:43 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
Really minimal example of issue (4.08 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2019-01-14 20:56 UTC, Maxim Britov

Description Maxim Britov 2019-01-14 20:56:34 UTC
Created attachment 148317 [details]
Really minimal example of issue

We found files that were created in MS Office and then opened, edited, and saved in some LO versions.
The files are XLSX with Russian/Latvian translations. Some words contain the character "Š".
When we copy those cells from Calc and paste them into Writer, we get the wrong character.

"Šampūnas" --->> "Љampūnas"

I tracked the issue down to a very minimal file. The file is invalid from Excel's point of view, but it shows the issue without XML noise.

Reproduce:
1. Open the attached file in LO Calc.
2. Copy A1.
3. Create a new Writer document.
4. Paste. On old, pre-6.0 versions of the office suite, use "Insert as.." and select RTF.
5. You should see "Љampūnas" instead of "Šampūnas".

Tested on 3.6.2.2, 5.4.0.7, 6.1.4.3, 6.2.0.2, AOo 4.1.6
Comment 1 Maxim Britov 2019-01-15 09:35:32 UTC
The issue looks even stranger to me :(
On my Linux desktop it works fine, but on the Windows machines I tested....

1. Copy Šampūnas from this bug report and paste it into Calc
2. Copy in Calc
3. Create new Writer document
4. Paste
5. I always get the wrong Љampūnas on Windows and the correct Šampūnas on my Linux machine.

Two Windows machines: Win7 and Win10, both with LO 6.1.4.3
Comment 2 Maxim Britov 2019-01-15 09:51:21 UTC
RTF from the Windows 7 clipboard, captured with the clipview.exe tool:

{\rtf\ansi
{\fonttbl{\f0\froman\fprq2\fcharset204 Liberation Serif;}{\f1\fprq2\fcharset204 Segoe UI;}{\f2\fprq2\fcharset204 Tahoma;}{\f3\fnil\fprq0\fcharset204 Arial;}{\f4\fprq2\fcharset204 Microsoft YaHei;}}
{\colortbl;}
{\*\EditEnginePoolDefaults\ltrpar\fi0\li0\ri0\fi0\li0\ri0\sb0\sa0\sl0\slmult0\ql\cf0\f0\fs13\b0\ulnone\strike0\i0\outl0\shad0\kerning0\expndtw0\f1\f2\fs13\fs13\b0\b0\i0\i0\accnone\olnone}
\deftab408
{
\ltrpar\f3\fs22\b0\ulnone\strike0\i0\outl0\shad0\f4\f3\fs22\fs22\b0\b0\i0\i0\accnone\olnone {\f3\fs22\b0\i0 \'8aamp\u363\'3fnas}\par\pard\plain
}}
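
For illustration: every font in the dump above declares \fcharset204 (the Russian charset, i.e. Windows-1251), yet the first character is written as the byte escape \'8a. The minimal Python sketch below, which is not LibreOffice code and uses hypothetical variable names, decodes that one byte under both code pages. It assumes the pasting side interprets \'hh escapes using the font's charset; the dump alone does not prove which side is at fault.

```python
# The byte that the RTF escape \'8a stands for.
raw = bytes([0x8A])

# Interpreted as Windows-1252 (Western European), 0x8A is the intended letter.
as_western = raw.decode("cp1252")

# Interpreted as Windows-1251 (Cyrillic, RTF \fcharset204), it is a different letter.
as_cyrillic = raw.decode("cp1251")

# The second non-ASCII letter is written as \u363 (decimal code point),
# followed by the ANSI fallback \'3f ("?"), so it does not depend on a code page.
from_unicode_escape = chr(363)

print(as_western, as_cyrillic, from_unicode_escape)
```

This matches the symptom in the report: the character written as a code-page byte changes, while the one written as a \uN escape survives.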
Comment 3 Durgapriyanka 2019-01-15 16:23:00 UTC
Thank you for reporting the bug. I can confirm this bug in 

Version: 6.3.0.0.alpha0+
Build ID: 3c964980da07892a02d5ac721d80558c459532d0
CPU threads: 2; OS: Windows 6.1; UI render: default; VCL: win; 
TinderBox: Win-x86@42, Branch:master, Time: 2018-12-12_02:07:45
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded

&

Version: 6.1.3.2 
Build ID: 86daf60bf00efa86ad547e59e09d6bb77c699acb
CPU threads: 2; OS: Windows 6.1; UI render: default; 
Locale: en-US (en_US); Calc: group threaded
Comment 4 Timur 2019-01-28 14:01:23 UTC
*** Bug 123010 has been marked as a duplicate of this bug. ***
Comment 5 Timur 2019-01-28 14:04:15 UTC
Steps - Windows only (so far):
1. create new LibreOffice Calc document
2. Copy this: Šampūnas
3. Paste into Calc
4. Copy this cell
5. Open LibreOffice Writer Document and Paste 
6. You see Љampūnas instead of Šampūnas.

Workaround: paste unformatted text
Comment 6 Maxim Britov 2019-01-28 14:43:17 UTC
(In reply to Timur from comment #5)

> Workaround: paste unformatted text

Yes. The workaround is Ctrl + Alt + Shift + V if you work with such symbols.
Maybe always use the workaround, because there is no guarantee from LibreOffice anymore... :(
Comment 7 Timur 2019-01-29 15:55:00 UTC
*** Bug 123010 has been marked as a duplicate of this bug. ***
Comment 8 Andreas Heinisch 2022-03-05 12:37:39 UTC
Repro in:
Version: 7.4.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 2f95f252312b0de0ab1098561c62bd7ae4527b9c
CPU threads: 6; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-US
Calc: CL
Comment 9 QA Administrators 2024-03-05 03:13:30 UTC Comment hidden (obsolete)
Comment 10 Mike Kaganski 2024-11-05 11:16:48 UTC
https://gerrit.libreoffice.org/c/core/+/176048
Comment 11 Commit Notification 2024-11-06 07:32:46 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/0c1ae785e3fb3a800f6b7743a03245dca6c01f14

tdf#122716: take encoding defined for font into account

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Commit Notification 2024-11-18 11:43:43 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-24-8":

https://git.libreoffice.org/core/commit/05272dc1972b3e1e4f85a7738728315a41922933

tdf#122716: take encoding defined for font into account

It will be available in 24.8.4.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
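
One possible reading of the commit title "take encoding defined for font into account", sketched in Python under stated assumptions: when text is written as RTF, each non-ASCII character is emitted as a \'hh byte only if the font's declared code page can represent it, and as a \uN escape otherwise. The function `rtf_escape` and the table `CHARSET_TO_CODEC` are hypothetical names for this sketch; it is not the actual patch, and it handles only characters in the Basic Multilingual Plane.

```python
# Subset of RTF \fcharset values and the Windows code pages they denote.
CHARSET_TO_CODEC = {0: "cp1252", 186: "cp1257", 204: "cp1251", 238: "cp1250"}


def rtf_escape(text: str, fcharset: int) -> str:
    """Escape text for an RTF run whose font declares the given \\fcharset.

    Assumption: characters are in the BMP (no surrogate pairs are produced).
    """
    codec = CHARSET_TO_CODEC[fcharset]
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)          # RTF syntax characters
        elif ord(ch) < 0x80:
            out.append(ch)                 # plain ASCII passes through
        else:
            try:
                encoded = ch.encode(codec)
            except UnicodeEncodeError:
                encoded = b""
            if len(encoded) == 1:
                # Representable in the font's code page: byte escape.
                out.append("\\'%02x" % encoded[0])
            else:
                # Not representable: Unicode escape (signed 16-bit value)
                # followed by "?" as the ANSI fallback character.
                code = ord(ch)
                if code >= 0x8000:
                    code -= 0x10000
                out.append("\\u%d\\'3f" % code)
    return "".join(out)


# Encoding against the Western charset reproduces the run seen in comment 2.
western = rtf_escape("Šampūnas", 0)
# Encoding against the Russian charset (204) avoids the ambiguous byte.
russian = rtf_escape("Šampūnas", 204)
print(western)
print(russian)
```

With charset 0 the sketch produces the same run as the clipboard dump in comment 2, which suggests that text was encoded as Windows-1252 while the font table declared charset 204. With charset 204 both non-ASCII letters become \uN escapes, so a reader that honors the font charset would no longer misread them.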