Bug 103528 - 涅槃 should not be converted to Simplified Chinese as 涅盘
Status: RESOLVED WONTFIX
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: CJK
Reported: 2016-10-27 02:42 UTC by Kumāra
Modified: 2017-11-13 04:14 UTC (History)

See Also:
Crash report or crash signature:


Attachments
OOo era design document for the Chinese language conversion tool (71.25 KB, application/vnd.sun.xml.writer)
2016-10-31 14:29 UTC, V Stuart Foote

Description Kumāra 2016-10-27 02:42:58 UTC
涅槃 is valid as Simplified Chinese. Although 槃 may be simplified to 盘 in some cases, it should remain unchanged in 涅槃.
Comment 1 Aron Budea 2016-10-27 03:10:25 UTC
Where does this happen? Spell checker? Autocorrect?
Comment 2 Kumāra 2016-10-27 03:52:55 UTC
Sorry, I was assuming it's obvious. It's about Chinese conversion from Traditional to Simplified Chinese.
Comment 3 Mark Hung 2016-10-28 13:44:09 UTC
It's under Tools > Language > Chinese Conversion.
Comment 4 V Stuart Foote 2016-10-28 14:06:47 UTC
Also, what OS and Desktop are you using--and which build of LibreOffice?
Comment 5 V Stuart Foote 2016-10-28 15:07:13 UTC
涅槃 (Nièpán) "Nirvana" is a concept and proper name from 佛教 (Fójiào) Buddhism, and my simplified dictionary shows it retains the traditional form of the name.

I imagine there is a whole class of literary and liturgical terms that should not be simplified--but I have no idea how that is organized, or even which external module (e.g. Hunspell or LanguageTool or ???) provides the Chinese lexical support/conversion.
Comment 6 Kumāra 2016-10-31 05:56:51 UTC
(In reply to V Stuart Foote from comment #4)
> Also, what OS and Desktop are you using--and which build of LibreOffice?

I believe the issue is unrelated to these. I thought "unspecified" was the right thing to specify, but come to think of it "Inherited from OOo" is more accurate. Sorry about that.

I'm not a coder, but I wonder if having a look at Bug 46182 might help to find a solution to this.

Can we mark this as NEW?
Comment 7 Adolfo Jayme Barrientos 2016-10-31 12:58:47 UTC
→ NEW it goes
Comment 8 V Stuart Foote 2016-10-31 14:29:11 UTC
Created attachment 128382
OOo era design document for the Chinese language conversion tool

(In reply to Kumāra from comment #6)
> (In reply to V Stuart Foote from comment #4)
> > Also, what OS and Desktop are you using--and which build of LibreOffice?
> 
> I believe the issue is unrelated to these. I thought "unspecified" was the
> right thing to specify, but come to think of it "Inherited from OOo" is more
> accurate. Sorry about that.
>

Because the project's code is adapted to function on each OS, knowing the OS and desktop environment (where applicable) helps us confirm and reproduce issues and, importantly, identify where and when things changed in the source.
 
> 
> I'm not a coder, but I wonder if having a look at Bug 46182 might help to
> find a solution to this.
>

Yes, it looks like "Inherited from OOo" is correct. The word lists provided for Chinese in stc_char.dic and stc_word.dic have received little adjustment since the OOo era. One of the original design documents from 2004, describing how the tool functions, is attached.

Line 1176 of stc_char.dic holds the pán entries for 盘 (U+76d8) and 槃 (U+69c3).

What is unclear in the source is the relation between the single character conversion, and the bound form/word listing. Also, the syntax of the character and word list is a little unclear.

niè 涅 (U+6d85), or more correctly nièpán 涅槃, has no entry in either the character or the word list--so IIUC the single-character replacement occurs. But it looks like it may be as simple as adding the literary and liturgical terms to stc_word.dic with matching simplified and traditional values--as a "placeholder"--to ensure that single-character substitution does not occur. Of course, there would be an upper limit to what the word table could hold.
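
A minimal sketch of the placeholder idea, for illustration only. It assumes the converter tries the word list first (longest match) and falls back to single-character substitution; as noted above, that relation is not confirmed from the source. The names CHAR_TABLE, WORD_TABLE and to_simplified are hypothetical, and the two dictionaries are stand-ins that do not reproduce the real syntax of stc_char.dic or stc_word.dic.

```python
# Hypothetical stand-ins for the real tables. The actual file formats of
# stc_char.dic and stc_word.dic are NOT reproduced here.
CHAR_TABLE = {"槃": "盘"}      # traditional character -> simplified character
WORD_TABLE = {"涅槃": "涅槃"}   # placeholder entry: identical on both sides


def to_simplified(text, word_table=WORD_TABLE, char_table=CHAR_TABLE):
    """Convert text, preferring the longest word-table match at each position.

    Assumption: word entries take priority over character entries.
    """
    max_len = max((len(word) for word in word_table), default=1)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate word first, then shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in word_table:
                out.append(word_table[chunk])
                i += n
                break
        else:
            # No word matched here: fall back to single-character substitution.
            ch = text[i]
            out.append(char_table.get(ch, ch))
            i += 1
    return "".join(out)
```

With the placeholder entry present, to_simplified("涅槃") leaves the term unchanged. Calling to_simplified("涅槃", word_table={}) yields 涅盘, which is the behaviour reported in this bug.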

> Can we mark this as NEW?

Certainly.

=-ref-=
http://opengrok.libreoffice.org/xref/core/i18npool/source/textconversion/genconv_dict.cxx
http://opengrok.libreoffice.org/xref/core/i18npool/inc/textconversion.hxx
http://opengrok.libreoffice.org/xref/core/i18npool/source/textconversion/data/stc_char.dic
http://opengrok.libreoffice.org/xref/core/i18npool/source/textconversion/data/stc_word.dic
Comment 9 Kumāra 2016-11-01 02:01:45 UTC
Thanks. Anything else I can do to help?
Comment 10 Kumāra 2016-11-02 07:58:54 UTC
I'm changing the importance to minor.

Reason: Although 涅槃 is officially regarded as correct, I suspect 涅盘, being just a transliteration of "Nirvana" (Sanskrit) or "Nibbana" (Pali), may eventually be accepted through common usage, mainly because of precisely this kind of machine conversion.
Comment 11 QA Administrators 2017-11-03 08:05:25 UTC Comment hidden (obsolete)
Comment 12 Kumāra 2017-11-13 04:14:54 UTC
I think we can forget this.