Bug Hunting Session
Bug 57536 - Wrong Chinese conversion: 著 & 着
Summary: Wrong Chinese conversion: 著 & 着
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.6.2.2 release (earliest affected)
Hardware: All Windows (All)
Importance: high normal
Assignee: Andras Timar
URL:
Whiteboard: target:4.1.0 target:4.0.2
Keywords:
Depends on:
Blocks:
 
Reported: 2012-11-26 04:05 UTC by Kumāra
Modified: 2013-05-14 19:06 UTC

See Also:
Crash report or crash signature:


Description Kumāra 2012-11-26 04:05:16 UTC
Chinese conversion from traditional to simplified should not convert 著 to 着. Neither should it happen in the opposite direction. These are not the same words.
Comment 1 webofht-libreofficebugs002 2012-12-23 15:16:57 UTC
I agree with Kumāra. Please separate 著 from 着. They mean different things. For instance, 見微知著 cannot be written as 見微知着; 見微知着 is wrong. Likewise, 著名 cannot be written as 着名; 着名 is wrong.

This issue also exists in LibreOffice Version 3.6.3.2 (Build ID: 58f22d5) on Debian Linux.

Regards,
C. H. D.
Comment 2 leighman 2012-12-23 22:42:07 UTC
Confirmed by C. H. D.
Setting NEW.
Could this be something that's happening in an external library though?
Comment 3 Kumāra 2013-01-23 06:40:14 UTC
Hope this can be fixed soon. It's a significant bug, as 著 is widely used together with the author's name in books. So, at the moment, using the conversion feature creates a typo right on the cover page!
Comment 4 Cheng-Chia Tseng 2013-02-20 08:12:32 UTC
One of my friends guesses that this issue is related to stc_char.dic [1]. You can see "着:著" listed as item 1182.

In this case, 着 and 著 are regarded as two characters with different meanings in mainland China (Simplified Chinese) and Hong Kong (Traditional Chinese), while in Taiwan (Traditional Chinese) 着 is treated as a variant form of 著 with the same meaning [2].

The existing conversion rule therefore holds only for Traditional Chinese users in Taiwan, not for the rest (China and Hong Kong), as listed below:

Simplified Chinese to Traditional Chinese
着 => 著 True for Traditional Chinese, Taiwan
着 => 著 False for Traditional Chinese, Hong Kong

Traditional Chinese to Simplified Chinese
著 => 着 Conditionally true for Simplified Chinese, China

The easiest way to fix this issue is to remove 着:著 in the list because those two characters are expected as different in Hong Kong and China.

1. http://cgit.freedesktop.org/libreoffice/core/tree/i18npool/source/textconversion/data/stc_char.dic

2. http://dict.variants.moe.edu.tw/yitia/fra/fra03506.htm
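
To illustrate the proposed fix, here is a minimal sketch in Python. It is not LibreOffice's actual implementation (the real conversion lives in the i18npool C++ code). It assumes the dictionary is a list of "simplified:traditional" lines, as the quoted "着:著" entry suggests, and the function names and the two-entry sample data are hypothetical.

```python
# Hypothetical sketch only: the "simplified:traditional" line format is an
# assumption based on the "着:著" entry quoted in this comment.

def parse_pairs(text):
    """Parse lines of the assumed form 'simplified:traditional'."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        simplified, traditional = line.split(":", 1)
        pairs.append((simplified, traditional))
    return pairs


def build_tables(pairs):
    """Build simplified->traditional and traditional->simplified maps."""
    s2t = {s: t for s, t in pairs}
    t2s = {t: s for s, t in pairs}
    return s2t, t2s


def convert(text, table):
    """Replace each character found in the table; leave the rest as-is."""
    return "".join(table.get(ch, ch) for ch in text)


# Tiny made-up samples standing in for the dictionary before and after the fix.
SAMPLE_BEFORE = "见:見\n着:著\n"
SAMPLE_AFTER = "见:見\n"

s2t_before, t2s_before = build_tables(parse_pairs(SAMPLE_BEFORE))
s2t_after, t2s_after = build_tables(parse_pairs(SAMPLE_AFTER))

# Traditional -> Simplified: with the pair present, 著 is wrongly turned into 着.
print(convert("見微知著", t2s_before))
# With the pair removed, 著 is left untouched.
print(convert("見微知著", t2s_after))
```

With the pair in place, any character-by-character lookup of this kind rewrites 著 in both directions. Dropping the entry leaves both characters alone, which is the behaviour requested in this report.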
Comment 5 Cheng-Chia Tseng 2013-02-20 08:22:41 UTC
However, I am not a programmer. I am not sure whether this guess is right. It should be investigated.
Comment 6 Kumāra 2013-02-20 10:03:57 UTC
(In reply to comment #4)
> The easiest way to fix this issue is to remove 着:著 in the list because those
> two characters are expected as different in Hong Kong and China.

... and in Malaysia, Singapore, and everywhere else not Taiwan.

Yes, please remove from list. Seems like a dirt easy hack.

I tried looking for stc_char.dic on my HD with Everything. Zip. I guess you're referring to a source file. If there's a file I could edit, I'd do it.
Comment 7 Not Assigned 2013-02-28 07:49:35 UTC
Andras Timar committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=d9fb2a6add269955d168d6d31c0257314ea4e020

fdo#57536 remove 着:著, because it is wrong is some cases



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 8 Kumāra 2013-02-28 10:37:52 UTC
(In reply to comment #7)
> Andras Timar committed a patch related to this issue.

Oh, thank you.... :-)
Should the status be set to ASSIGNED?
Comment 9 Not Assigned 2013-02-28 12:26:21 UTC
Andras Timar committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1a562d8f7acd2d0e90eef4b6fdee2c57724203cb&h=libreoffice-4-0

fdo#57536 remove 着:著, because it is wrong is some cases


It will be available in LibreOffice 4.0.2.

Comment 10 Kumāra 2013-03-01 04:54:57 UTC
(In reply to comment #9)
> Andras Timar committed a patch related to this issue.
> It has been pushed to "libreoffice-4-0":

That's even better.

I wonder if saying "Thank you" made a difference... :-)