Created attachment 47456 [details] Windows platform. A user complained that copying data from an Odb file (in Traditional Chinese) to Calc produces wrongly encoded characters. You can get the file here: https://docs.google.com/leaf?id=0B9JKiYcC-SFQMDIwNjA5ODItMzFkOC00M2M3LTgxYTUtMDc4NmEwYzY2YWMy&hl=en_US&authkey=CMqpnt0H I tested LibO 3.4 RC2, and the problem does exist. It does not occur with LibO 3.3.2 on the Linux platform. See the attachments for the output on Windows (with the bug) and on Linux (without the bug).
Created attachment 47457 [details] Linux platform
In short, copying from Base produces characters that are not encoded correctly (http://en.wikipedia.org/wiki/Mojibake) in Calc on Windows, but not in Calc on Linux.
I could not reproduce the error on Windows with LibreOffice 3.4.0. Can you please give the steps you followed? It works well via Data Sources (F4) in Calc, and copying & pasting strings from Base also works.
I use the method from this blog post: http://openoffice.blogs.com/openoffice/2007/04/farrrrrr_simple.html Go to "Table" first, then right-click the data I would like to copy, choose "Copy", and paste into Calc. This produces data in the wrong encoding. In addition, if I drag the data icon directly and drop it onto Calc, the problem is avoided.
RC2 is bit by bit identical with release version, so separate items in the version picker are useless. Changes have been discussed with Michael Meeks.
It is not fixed yet in 3.4.2 RC3. Andras, could you verify this bug again? Thanks.
*** Bug 40766 has been marked as a duplicate of this bug. ***
Is the bug still there on 3.4.4? What font and encoding should be used to test?
It is still there. I don't know what the encoding actually is, but Chinese (Traditional) Windows usually uses "big5" as the default. The problem only exists when you use the "Copy" item of the context menu and then paste into Calc. However, if you open Calc first and drag the table in the odb file directly into Calc, everything is fine, and that is the expected output.
Reproduced on Windows XP 32-bit with LibO 3.4, Russian language. The problem is more interesting than expected. When I copied a table from the database and pasted it into Calc, it was inserted without problems. Then I changed one field in the database to Russian (Cyrillic) text, copied the table and pasted it into Calc. Everything except the Russian text was inserted correctly, but the Russian text looks wrong. To reproduce this appearance manually, I saved a document containing Russian text as plain text in the Windows encoding, then opened it in a web browser and changed the encoding to ISO-8859-1. The characters then look the same as in the problem described above. Therefore it is a locale-specific problem. This bug resembles Bug 39890
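The manual reproduction described above can be imitated outside LibreOffice. The following is only a minimal Python sketch, not LibreOffice code: it assumes that "Windows encoding" on a Russian system means cp1251, and the helper name view_with_wrong_encoding and the sample word are made up for illustration.

```python
def view_with_wrong_encoding(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one encoding, then decode the same bytes with another."""
    return text.encode(written_as).decode(read_as)


# Sample Cyrillic word (illustrative only, not taken from the test database).
original = "Привет"

# Bytes are written in the Russian ANSI codepage but read as ISO-8859-1.
garbled = view_with_wrong_encoding(original, "cp1251", "iso-8859-1")  # "Ïðèâåò"

# Because both encodings are single-byte, the damage is reversible.
restored = view_with_wrong_encoding(garbled, "iso-8859-1", "cp1251")
```

No bytes are lost in this scenario; only the label attached to them is wrong, which is consistent with the pasted text being garbled but of the right length.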
[This is an automated message.] This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it started right out as NEW without ever being explicitly confirmed. The bug is changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases. Details on how to test the 3.5.0 beta1 can be found at: http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1 more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html
The needinfo keyword is made redundant by the NEEDINFO status.
The bug is still there. Using the method from this blog post: http://openoffice.blogs.com/openoffice/2007/04/farrrrrr_simple.html Go to "Table" first, then right-click the data I would like to copy, choose "Copy", and paste into Calc. This produces data in the wrong encoding. In addition, if I drag the data icon directly and drop it onto Calc, the problem is avoided.
The data formats for both RTF and HTML contain certain text in the system default encoding, but that text is marked as either charset 0 or windows-1252.
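To illustrate the mismatch for the HTML case, here is a small Python sketch. It is not the Base exporter: build_html and read_html are hypothetical helpers, the HTML skeleton is simplified, and big5 is assumed as the system encoding of a Traditional Chinese Windows.

```python
import re


def build_html(text: str, byte_encoding: str, declared_charset: str) -> bytes:
    """Write body bytes in byte_encoding while the header declares declared_charset."""
    head = (
        '<html><head><meta http-equiv="content-type" '
        f'content="text/html; charset={declared_charset}"></head><body>'
    )
    return head.encode("ascii") + text.encode(byte_encoding) + b"</body></html>"


def read_html(payload: bytes) -> str:
    """Decode the payload the way a reader trusting the declared charset would."""
    charset = re.search(rb"charset=([A-Za-z0-9_-]+)", payload).group(1).decode("ascii")
    html = payload.decode(charset, errors="replace")
    return re.search(r"<body>(.*)</body>", html, re.S).group(1)


text = "資料"
mislabelled = read_html(build_html(text, "big5", "windows-1252"))  # mojibake
consistent = read_html(build_html(text, "big5", "big5"))           # intact
```

Either labelling the bytes with the encoding actually used, or writing them in the encoding that is declared, makes the two sides agree.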
Possibly related to Bug 36144
The culprit is probably /core/dbaccess/source/ui/misc/TokenWriter.cxx:421 Also, the logic here seems strange to me: /core/svtools/source/svrtf/rtfout.cxx:118
Any update with the latest stable LO version, 4.2.5?
The bug still exists; reproducible on 4.3.0.
Cheng-Chia: Thank you for your feedback, put it back to NEW.
I can confirm the issue, exactly as described elsewhere in this thread, except this time for Greek fonts. Libreoffice Version: 4.3.0.4 Build ID: 62ad5818884a2fc2e5780dd45466868d41009ec0 on Windows 7 Pro
Please see this Bug 79631 where it seems that Dominik has tackled the issue in version 4.4.0.0.alpha0+ . And apologies for having forgotten my own submission...
On a Debian x86-64 PC with the 4.3.2 Debian package, I can't reproduce the very similar fdo#79631 (put in See Also). Cheng-Chia: since I don't have a Google account, could you give it a new try with version 4.3.2?
As reported, this bug only exists on the Windows platform. Linux is not affected. You can get the file at https://www.dropbox.com/s/tbb7bgffees5igj/%E5%B7%A5%E7%A8%8B%E7%AE%A1%E7%90%86%E8%B3%87%E6%96%99%E5%BA%AB2.odb?dl=0 Tested with version 4.3.2 on Windows 7 64-bit; this long-lived bug still exists.
If one needs another file for testing, the attachment in bug 79631 does the trick. It is a small odb file containing a single table with fonts in various encodings exhibiting the issue. Here is the link: https://bugs.freedesktop.org/attachment.cgi?id=100392
The text in the 'system' encoding on Windows is copied to RTF as ANSI. It is marked by a font with \charset0. That is what causes the issue.
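A rough Python sketch of why the \charset0 label matters on the reading side. The mapping below uses the standard Windows charset identifiers, but the function decode_hex_run is hypothetical and handles only \'hh escapes, so it is in no way a real RTF parser. It also assumes that a reader treats charset 0 as Windows-1252.

```python
import re

# Windows charset identifiers as used by \fcharsetN, mapped to Python codecs.
FCHARSET_TO_CODEC = {
    0: "cp1252",    # ANSI
    128: "cp932",   # Shift-JIS
    136: "cp950",   # Traditional Chinese (Big5)
    161: "cp1253",  # Greek
    177: "cp1255",  # Hebrew
    204: "cp1251",  # Russian
}


def decode_hex_run(rtf_fragment: str, fcharset: int) -> str:
    """Decode a run of \\'hh escapes using the codepage implied by the font charset."""
    hex_pairs = re.findall(r"\\'([0-9a-fA-F]{2})", rtf_fragment)
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    return raw.decode(FCHARSET_TO_CODEC[fcharset], errors="replace")


fragment = r"\'ee\'e5"                      # two bytes written on a Greek system
as_labelled = decode_hex_run(fragment, 0)   # "îå"
as_intended = decode_hex_run(fragment, 161) # "ξε"
```

The bytes are identical in both calls; only the charset attached to the font decides what the reader displays.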
*** Bug 79631 has been marked as a duplicate of this bug. ***
If you want to reproduce it, you will need a Windows OS set to any legacy codepage other than 1252. P.S. Bugs do not fix themselves by magic after two years.
Urmas: a bug may indeed be "magically" fixed sometimes when:
- a similar bug has been fixed
- some part of the code has been redesigned
- a problem indicated by code analyzers (like Coverity Scan, cppcheck and others), which was the root cause of the bug, has been fixed
etc. Of course, I wouldn't be able to give you any probabilities, but it does happen sometimes! :-)
Adding self to CC if not already on
** Please read this message in its entirety before responding **
To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year. There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.
If you have time, please do the following:
- Test to see if the bug is still present on a currently supported version of LibreOffice (5.0.4 or later) https://www.libreoffice.org/download/
- If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior
- If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System
Please DO NOT:
- Update the version field
- Reply via email (please reply directly on the bug tracker)
- Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)
If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) http://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to "inherited from OOo";
4b. If the bug was not present in 3.3 - add "regression" to keyword
Feel free to come ask questions or to say hello in our QA chat: http://webchat.freenode.net/?channels=libreoffice-qa
Thank you for your help! -- The LibreOffice QA Team This NEW Message was generated on: 2016-01-17
*** Bug 97346 has been marked as a duplicate of this bug. ***
Hello, the bug remains with LibreOffice Version: 5.0.4.2 on Windows 7. When I copy a whole row or table using right-click (or the menu), the pasted text (at least for Greek and Hebrew) is wrong. By contrast, when I drag and drop a table from Base to Calc, or copy a single cell, it is OK. I again used the file mentioned in Comment 24. Maybe this piece of info helps: if one chooses "Paste Special", the options offered in the case where the bug shows are just RTF and HTML, while in the case without the bug the only option is Unformatted text.
** Please read this message in its entirety before responding **
To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year. There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.
If you have time, please do the following:
- Test to see if the bug is still present on a currently supported version of LibreOffice (5.2.5 or 5.3.0) https://www.libreoffice.org/download/
- If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior
- If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System
Please DO NOT:
- Update the version field
- Reply via email (please reply directly on the bug tracker)
- Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)
If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) http://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to "inherited from OOo";
4b. If the bug was not present in 3.3 - add "regression" to keyword
Feel free to come ask questions or to say hello in our QA chat: http://webchat.freenode.net/?channels=libreoffice-qa
Thank you for helping us make LibreOffice even better for everyone! Warm Regards, QA Team MassPing-UntouchedBug-20170306
The bug remains with LibreOffice Version: 5.2.2.2 on Windows 7. Using the file mentioned in comment 24, the wrongly encoded Greek text is again pasted as îåóêåðÜæù ôçí øõ÷ïöèüñá âäåëõãìßá instead of ξεσκεπάζω την ψυχοφθόρα βδελυγμία So, no change during 2016...
I'm not an expert, but I wonder what would happen if we explicitly specified an appropriate encoding as the fifth parameter to SfxFrameHTMLWriter::Out_DocInfo in OHTMLImportExport::WriteHeader rather than relying on its default parameter? https://github.com/LibreOffice/core/blob/39adbb9593c764429e9ed2176dde755809b3af0f/dbaccess/source/ui/misc/TokenWriter.cxx#L677
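The garbled string quoted above can be checked against a few legacy codepages to see which one it came from. This is only a diagnostic sketch in Python: the candidate list is an arbitrary selection, repair_candidates is a hypothetical helper, and it assumes the wrong interpretation was Windows-1252.

```python
CANDIDATE_CODEPAGES = ("cp1251", "cp1253", "cp1255", "big5")


def repair_candidates(garbled: str, misread_as: str = "cp1252") -> dict:
    """Recover the original bytes and decode them with each candidate codepage."""
    raw = garbled.encode(misread_as)
    results = {}
    for codec in CANDIDATE_CODEPAGES:
        try:
            results[codec] = raw.decode(codec)
        except UnicodeDecodeError:
            # The bytes are not valid in this codepage, so it is ruled out.
            continue
    return results


pasted = "îåóêåðÜæù ôçí øõ÷ïöèüñá âäåëõãìßá"
candidates = repair_candidates(pasted)
greek = candidates["cp1253"]  # "ξεσκεπάζω την ψυχοφθόρα βδελυγμία"
```

Decoding the same bytes as cp1253 yields the expected Greek sentence, which fits the explanation that system-codepage bytes are being labelled as Windows-1252.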
Thank you Urmas and himajin100000, let's give it a try with https://gerrit.libreoffice.org/#/c/38253/ Urmas: I know that it's a cold case, but if you have some time, could you be more explicit about the svl part of https://bugs.documentfoundation.org/show_bug.cgi?id=37859#c16 ? I suppose it concerns the Out_Char function, and most particularly this part (lines 130-147 of rtfout.cxx):
    //If we can't convert to the dest encoding, or if
    //it's an uncommon multibyte sequence which most
    //readers won't be able to handle correctly, then
    //export as unicode
    OUString sBuf(&c, 1);
    OString sConverted;
    sal_uInt32 nFlags =
        RTL_UNICODETOTEXT_FLAGS_UNDEFINED_ERROR |
        RTL_UNICODETOTEXT_FLAGS_INVALID_ERROR;
    bool bWriteAsUnicode = !(sBuf.convertToString(&sConverted,
                         eDestEnc, nFlags))
                            || (RTL_TEXTENCODING_UTF8==eDestEnc); // #i43933# do not export UTF-8 chars in RTF;
    if (bWriteAsUnicode)
    {
        (void)sBuf.convertToString(&sConverted,
            eDestEnc, OUSTRING_TO_OSTRING_CVTFLAGS);
    }
    const sal_Int32 nLen = sConverted.getLength();
See http://opengrok.libreoffice.org/xref/core/svtools/source/svrtf/rtfout.cxx#130 If you confirm, I think it could be interesting to have a separate bug report about this specific part with a failing case.
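For readers not familiar with the rtl string API, the decision in the quoted fragment can be paraphrased roughly as follows. This is a simplified Python analogy under my own assumptions, not a translation of Out_Char: the function out_char is hypothetical, it handles only single BMP characters, and it always uses "?" as the fallback after \uN instead of a lossy conversion.

```python
import codecs


def out_char(ch: str, dest_enc: str) -> str:
    """Return RTF text for one BMP character, preferring the destination encoding."""
    if len(ch) != 1 or ord(ch) > 0xFFFF:
        raise ValueError("sketch handles single BMP characters only")
    if ch in "\\{}":
        return "\\" + ch          # RTF control characters must be escaped
    if ord(ch) < 0x80:
        return ch                 # plain ASCII is written as-is

    is_utf8 = codecs.lookup(dest_enc).name == "utf-8"
    try:
        converted = ch.encode(dest_enc)   # strict: fails if not representable
        write_as_unicode = is_utf8        # never emit raw UTF-8 bytes in RTF
    except UnicodeEncodeError:
        converted = b""
        write_as_unicode = True

    if write_as_unicode:
        value = ord(ch)
        if value >= 0x8000:
            value -= 0x10000              # \uN takes a signed 16-bit number
        return f"\\u{value}?"
    return "".join(f"\\'{byte:02x}" for byte in converted)


greek_ok = out_char("ξ", "cp1253")        # "\\'ee": representable, written as a byte
greek_fallback = out_char("ξ", "cp1252")  # "\\u958?": not representable, written as Unicode
```

In this reading, a character is only written as a codepage byte when the destination encoding can represent it, so the byte form stays correct only as long as the reader decodes it with that same encoding.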
Julien Nabet committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=39487b14956d883899311b6294f6f09ca2371366 tdf#37859: Odb data copied to Calc showed wrong encoding in Windows It will be available in 5.5.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Julien Nabet committed a patch related to this issue. It has been pushed to "libreoffice-5-4": http://cgit.freedesktop.org/libreoffice/core/commit/?id=a485908af200fadd561af0a5011276613849e356&h=libreoffice-5-4 tdf#37859: Odb data copied to Calc showed wrong encoding in Windows It will be available in 5.4.0.1. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Julien Nabet committed a patch related to this issue. It has been pushed to "libreoffice-5-3": http://cgit.freedesktop.org/libreoffice/core/commit/?id=3951d44110df55589ff80f5eab752817c2475c0d&h=libreoffice-5-3 tdf#37859: Odb data copied to Calc showed wrong encoding in Windows It will be available in 5.3.5. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Let's put this one to FIXED. Don't hesitate to reopen this tracker if it still fails with a build which includes the patch.
Tested with the dev build. Previously, neither paste as RTF nor paste as HTML encoded the text correctly. Now paste as HTML encodes it correctly, but paste as RTF still does not. Version: 5.5.0.0.alpha0+ (x64) Build ID: 076ed447f694239d5c67adee528ea6e471d909ff CPU threads: 8; OS: Windows 6.19; UI render: GL; TinderBox: Win-x86_64@42, Branch:master, Time: 2017-06-10_01:17:34 Locale: ja-JP (ja_JP); Calc: CL
TODO for me: Check whether "Options"-"Load/Save"-"HTML Compatibility"-"Export"-"Character set" affects this behavior.
*** Bug 97364 has been marked as a duplicate of this bug. ***
*** Bug 97365 has been marked as a duplicate of this bug. ***