Bug 96343 - LO Writer cannot switch the lettercase for Cyrillic Extended-B block
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Linguistic
Version: Inherited From OOo (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: ⁨خالد حسني⁩
URL:
Whiteboard: target:24.2.0
Keywords:
Depends on:
Blocks: Character CaseFolding
Reported: 2015-12-08 19:31 UTC by Volga
Modified: 2023-07-25 08:36 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
My testcase (243.45 KB, image/gif)
2015-12-25 20:00 UTC, Volga
My testcase (243.45 KB, image/gif)
2015-12-25 20:03 UTC, Volga
Switch the letter case for Cyrillic, copied from Wikipedia. (339.02 KB, image/gif)
2016-03-16 06:10 UTC, Volga
Test file (9.42 KB, application/vnd.oasis.opendocument.text)
2016-11-10 03:30 UTC, Volga
This is what I have done on LODev Writer (92.83 KB, image/gif)
2016-11-10 03:32 UTC, Volga

Description Volga 2015-12-08 19:31:08 UTC
In LibreOffice Writer, when I try to switch the letter case of Cyrillic Extended-B letters, many of them do not respond. Could you please fix this bug?
Comment 1 Robinson Tryon (qubit) 2015-12-10 07:26:38 UTC Comment hidden (obsolete)
Comment 2 Volga 2015-12-25 20:00:29 UTC
Created attachment 121551 [details]
My testcase

This is my test case for an Old Cyrillic alphabet, copied from the Wikipedia article “Cyrillic script”.
Comment 3 Volga 2015-12-25 20:03:19 UTC
Created attachment 121552 [details]
My testcase

This is my test case for an Old Cyrillic alphabet, copied from the Wikipedia article “Cyrillic script”.
Comment 4 Volga 2016-03-16 06:10:25 UTC
Created attachment 123616 [details]
Switch the letter case for Cyrillic, copied from Wikipedia.

On LibreOffice Dev 5.1.3, Cyrillic Extended-B letters still do not respond when I change their letter case.

Version: 5.1.3.0.0+ (x64)
Build ID: 99ab65e9139d60542344041a96ab8d3a4dc39bdb
CPU Threads: 4; OS Version: Windows 6.19; UI Render: default; 
TinderBox: Win-x86_64@62-TDF, Branch:libreoffice-5-1, Time: 2016-03-16_01:45:52
Locale: zh-CN (zh_CN)
Comment 5 ⁨خالد حسني⁩ 2016-11-09 13:32:05 UTC
Please attach an actual document.
Comment 6 Volga 2016-11-10 03:30:45 UTC
Created attachment 128629 [details]
Test file
Comment 7 Volga 2016-11-10 03:32:02 UTC
Created attachment 128630 [details]
This is what I have done on LODev Writer

Version: 5.3.0.0.alpha1+
Build ID: 05d2a66955f8a6552a79696474386ca9f45f9ef2
CPU Threads: 4; OS Version: Windows 6.2; UI Render: default; Layout Engine: new; 
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2016-11-07_23:34:48
Locale: zh-CN (zh_CN); Calc: group
Comment 8 Volga 2016-11-10 03:34:12 UTC
You can get the font from here if you need:
http://www.ponomar.net/files/fonts-churchslavonic.zip
Comment 9 ⁨خالد حسني⁩ 2016-11-10 17:33:29 UTC
So it seems we use our own case folding tables in https://gerrit.libreoffice.org/gitweb?p=core.git;a=blob;f=i18nutil/source/utility/casefolding_data.h. We need to either update the tables (which first requires understanding how they work) or switch the case mapping to use ICU instead.
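To get a feel for what updating the tables would involve, the case pairs in the block can be listed from any up-to-date Unicode database. The following is only a rough sketch in Python, using Python's built-in Unicode data as a stand-in; it is not LibreOffice code, and the helper name `case_pairs` is made up for illustration. The block range U+A640..U+A69F is taken from the Unicode standard.

```python
import unicodedata

# Cyrillic Extended-B occupies U+A640..U+A69F in the Unicode standard.
BLOCK_START, BLOCK_END = 0xA640, 0xA69F


def case_pairs(start, end):
    """Return (lower, upper) pairs for lowercase letters in the range
    whose uppercase form differs from the letter itself.

    Hypothetical helper: it relies on Python's own Unicode data,
    not on LibreOffice's tables or on ICU.
    """
    pairs = []
    for code_point in range(start, end + 1):
        ch = chr(code_point)
        upper = ch.upper()
        if unicodedata.category(ch) == "Ll" and upper != ch:
            pairs.append((ch, upper))
    return pairs


pairs = case_pairs(BLOCK_START, BLOCK_END)

# Print the first few pairs with the name of the uppercase letter.
for lower, upper in pairs[:3]:
    print(f"U+{ord(lower):04X} -> U+{ord(upper):04X}  {unicodedata.name(upper)}")
```

Every pair listed this way is one that a hand-maintained table would have to contain for the letter case switch to work on that letter.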
Comment 10 Volga 2016-11-22 12:22:10 UTC
Since ICU is already integrated in LibreOffice, this can be fixed by replacing LibreOffice's own case folding tables with ICU.
Comment 11 Aleksandr Andreev 2016-11-26 12:18:17 UTC
I think switching to using ICU would be the better idea in the long run. However, I cannot work on this, since I do not know how ICU is integrated into LO, and I cannot find any documentation of this.
Comment 12 Volga 2016-12-16 13:39:08 UTC Comment hidden (obsolete)
Comment 13 Volga 2017-09-07 13:45:33 UTC
I just found documentation on the ICU website. Does it help?
http://userguide.icu-project.org/transforms/general
Comment 14 Aleksandr Andreev 2018-01-10 16:27:45 UTC
Also related to this problem is the fact that characters in Cyrillic Extended-C (and perhaps some characters in Cyrillic Extended-B as well?) are not recognized as word characters. Hence LibreOffice does not correctly determine word boundaries for Church Slavic text.

This bug is really annoying because it makes my new Church Slavic spell-checker go berserk.
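The word-character problem can be illustrated with a small sketch. It classifies characters by Unicode general category and splits text wherever a non-word character appears. This is an assumption-laden simplification in Python: it is neither LibreOffice's break iterator nor the full Unicode text segmentation algorithm, the names `is_word_char` and `split_words` are hypothetical, and it assumes a Unicode database recent enough to know Cyrillic Extended-C (Unicode 9.0 or later).

```python
import unicodedata


def is_word_char(ch):
    """Treat letters (L*), combining marks (M*) and decimal digits (Nd)
    as word characters. Simplified rule, for illustration only."""
    category = unicodedata.category(ch)
    return category[0] in ("L", "M") or category == "Nd"


def split_words(text):
    """Split text into runs of word characters."""
    words, current = [], []
    for ch in text:
        if is_word_char(ch):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


# First word starts with U+1C80 (Cyrillic Extended-C),
# second word starts with U+A641 (Cyrillic Extended-B).
sample = "\u1c80\u043e\u0434\u0430 \ua641\u0435\u043c\u043b\u044f"
words = split_words(sample)
print(len(words), [unicodedata.category(word[0]) for word in words])
```

If a classifier like this did not know U+1C80 as a letter, the first word would be cut after or before that character, which is the kind of wrong boundary that would confuse a spell-checker.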
Comment 15 Xisco Faulí 2018-04-16 09:55:51 UTC
Dear Aleksandr Andreev,
This bug has been in ASSIGNED status for more than 3 months without any
activity. Resetting it to NEW.
Please assign it back to yourself if you're still working on this.
Comment 16 QA Administrators 2019-04-17 02:58:12 UTC Comment hidden (noise)
Comment 17 Volga 2021-06-10 07:28:13 UTC Comment hidden (no-value)
Comment 18 Volga 2022-12-05 18:25:22 UTC
Martin, what do you think of this?
Comment 19 Commit Notification 2023-07-24 18:20:48 UTC
Khaled Hosny committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/9eb88d78c8bc9e942814eb6fc4fe06a4e5736256

tdf#96343, tdf#134766, tdf#97152: Fallback to ICU for case mapping

It will be available in 24.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
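The idea named in the commit title, falling back to a full Unicode implementation when the local data has no answer, can be sketched roughly as follows. This is not the code from the commit: the table `LOCAL_UPPER` is a made-up, deliberately incomplete stand-in for a hand-maintained mapping, the function names are hypothetical, and Python's built-in `str.upper()` plays the role that ICU plays in LibreOffice.

```python
# Hypothetical, deliberately incomplete table standing in for a
# hand-maintained case mapping. It has no Cyrillic Extended-B entries.
LOCAL_UPPER = {
    "a": "A",
    "\u0431": "\u0411",  # CYRILLIC SMALL LETTER BE -> CAPITAL LETTER BE
}


def to_upper(ch):
    """Map one character to uppercase: try the local table first,
    then fall back to the full Unicode data (here Python's built-in
    mapping, used as a stand-in for ICU)."""
    mapped = LOCAL_UPPER.get(ch)
    if mapped is not None:
        return mapped
    return ch.upper()


def to_upper_text(text):
    """Apply to_upper to every character of the text."""
    return "".join(to_upper(ch) for ch in text)


# U+A641 is missing from LOCAL_UPPER, so it goes through the fallback.
print(to_upper_text("\u0431\ua641"))
```

With this shape, existing table entries keep their behavior, and only characters the table does not cover are handed to the fallback.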
Comment 20 Volga 2023-07-25 03:21:39 UTC
Should this be backported to the 7.6 release channel?
Comment 21 ⁨خالد حسني⁩ 2023-07-25 08:36:24 UTC
(In reply to Volga from comment #20)
> Should this be backported to the 7.6 release channel?

The change is a bit too fundamental to backport to 7.6 this late in the cycle.