Bug 112594 - Mongolian letters failed to join with NNBSP when it is preceded by different script group
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
(earliest affected) release
Hardware: All All
Importance: medium normal
Assignee: Khaled Hosny
Whiteboard: target:24.2.0 target:7.6.1
Depends on:
Blocks: Formatting-Mark China-Minority-Scripts RTL Language-Grouping
Reported: 2017-09-23 17:47 UTC by Volga
Modified: 2024-08-03 19:02 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:

Test file (24.26 KB, application/vnd.oasis.opendocument.text)
2017-09-23 17:47 UTC, Volga
Rendering on LibreOffice (185.72 KB, image/png)
2017-09-23 17:48 UTC, Volga
Original text rendering on Firefox (306.20 KB, image/png)
2017-09-23 17:51 UTC, Volga
Original text rendering on Chrome (293.46 KB, image/png)
2017-09-23 17:59 UTC, Volga
Rendering on LibreOffice 7.0.3 (253.87 KB, image/png)
2020-11-27 15:00 UTC, Volga

Description Volga 2017-09-23 17:47:02 UTC
Mongolian letters sometimes do not join with NNBSP, especially when the space does not follow a Mongolian letter.

Steps to Reproduce:
1. Open Writer
2. Insert -> Frame -> Frame Interactively
3. Set text direction as Left-to-Right (vertical)
4. Copy the text from https://incubator.wikimedia.org/wiki/Wp/mvf/%E1%A0%AD%E1%A0%A6%E1%A0%B6%E1%A0%A6%E1%A0%AD_%E1%A0%AC%E1%A0%A0%E1%A0%AD%E1%A0%A0%E1%A0%A8_%E1%A0%A4_%E1%A0%AA%E1%A0%A2%E1%A0%B4%E1%A0%A2%E1%A0%AD%E1%A0%8C
5. Select all the text, set the Western font to Liberation Serif and the complex font to Mongolian Baiti.
6. Copy & paste the frame
7. Select all the text in the second frame and set the font to Mongolian Baiti.

Actual Results:  
Mongolian letters fail to join with NNBSP when the NNBSP is preceded by digits. See my screenshot.

Expected Results:
Mongolian letters should always join with NNBSP.

Reproducible: Always

User Profile Reset: No

Additional Info:
Version: (x64)
Build ID:dfa67a98bede79c671438308dc9036d50465d2cb
CPU threads: 4; OS: Windows 6.19; UI render: default; 
Locale: zh-CN (zh_CN); Calc: group

User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0
Comment 1 Volga 2017-09-23 17:47:42 UTC
Created attachment 136491 [details]
Test file
Comment 2 Volga 2017-09-23 17:48:39 UTC
Created attachment 136492 [details]
Rendering on LibreOffice
Comment 3 Volga 2017-09-23 17:51:09 UTC
Created attachment 136493 [details]
Original text rendering on Firefox
Comment 4 Volga 2017-09-23 17:59:52 UTC
Created attachment 136494 [details]
Original text rendering on Chrome
Comment 5 Buovjaga 2017-10-28 15:59:01 UTC
So the problem is before the string [ 1246 ?
Comment 6 Volga 2017-10-31 03:17:52 UTC
(In reply to Buovjaga from comment #5)
> So the problem is before the string [ 1246 ?

No, it’s after the string 1246.
Comment 7 Buovjaga 2017-10-31 18:46:05 UTC
Ok, confirmed.

Arch Linux 64-bit, KDE Plasma 5
Build ID: 7a2e7c32d38db02aaa5d78d5e8aaf86cabfde586
CPU threads: 8; OS: Linux 4.13; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on October 28th 2017
Comment 8 Volga 2017-12-01 11:02:54 UTC Comment hidden (no-value)
Comment 9 Volga 2017-12-01 14:11:41 UTC
Is there any way to make NNBSP work the same as ZWJ?
Comment 10 Volga 2017-12-23 16:51:23 UTC
Still reproducible in 

Version: (x64)
Build ID:d2bec56d7865f05a1003dc88449f2b0fdd85309a
CPU threads: 4; OS: Windows 10.0; UI render: default; 
Locale: zh-CN (zh_CN); Calc: group

As a workaround, inserting a ZWJ between the space and the Mongolian letter works. This can be done with the following steps:
1. Type “200D” between them
2. Select the code
3. Press Alt + X

This method makes Mongolian suffixes work as expected in this case, but it could also make them breakable. Furthermore, many Mongolian fonts have contextual forms after NNBSP, so LibO needs to be improved to render Hudum Mongolian text well.
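
For a longer text, the manual steps above could be automated. The following is only a rough sketch in Python, not anything LibreOffice provides: the helper names `is_mongolian_letter` and `insert_zwj_after_nnbsp` are made up for illustration, and the rule that a ZWJ is only added when the NNBSP does not already follow a Mongolian letter is an assumption based on where the problem shows up.

```python
import unicodedata

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE
ZWJ = "\u200d"    # ZERO WIDTH JOINER


def is_mongolian_letter(ch: str) -> bool:
    """True if ch is a single character whose Unicode name is a Mongolian letter."""
    return len(ch) == 1 and unicodedata.name(ch, "").startswith("MONGOLIAN LETTER")


def insert_zwj_after_nnbsp(text: str) -> str:
    """Insert ZWJ between an NNBSP and a following Mongolian letter.

    Assumption for this sketch: the ZWJ is only added when the character
    before the NNBSP is NOT a Mongolian letter (e.g. a digit), since that
    is the case reported as broken.
    """
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        if (
            ch == NNBSP
            and is_mongolian_letter(next_ch)
            and not is_mongolian_letter(prev_ch)
        ):
            out.append(ZWJ)
    return "".join(out)


if __name__ == "__main__":
    sample = "11" + NNBSP + "\u1824"  # digits, NNBSP, MONGOLIAN LETTER U
    fixed = insert_zwj_after_nnbsp(sample)
    print([f"U+{ord(c):04X}" for c in fixed])
```

The drawbacks mentioned above still apply to text processed this way, so this is a stopgap at best.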
Comment 11 QA Administrators 2018-12-24 03:44:21 UTC Comment hidden (obsolete)
Comment 12 Kevin Suo 2020-11-23 16:04:54 UTC Comment hidden (obsolete)
Comment 13 Volga 2020-11-27 15:00:27 UTC Comment hidden (obsolete)
Comment 14 QA Administrators 2022-11-28 03:36:14 UTC Comment hidden (obsolete)
Comment 15 Khaled Hosny 2023-07-27 11:24:37 UTC
Here is a simpler test string:

[ 1246 ᠣᠨ ᠤ 11ᠰᠠᠷᠠᠠ ᠢᠨ 3 ᠤ ᠡᠳᠤᠷ ᠡᠴᠡ11 ᠤ ᠡᠳᠤᠷ ]

It does not have to be set vertically either. The character after the 3 and the second 11 should look like the one before the first 11.

It seems to be a text segmentation issue; we classify the numbers as Western text, and the NNBSP after them seems to be classified with them as well.
Comment 16 Khaled Hosny 2023-07-27 14:36:00 UTC
(In reply to Volga from comment #9)
> Is there any way to make NNBSP work the same as ZWJ?

I got fooled by this visible difference for the better part of the day, but we are actually treating NNBSP and ZWJ the same here: they are always grouped with the previous character when we itemize scripts, so they are part of the Western text here.

The difference is that HarfBuzz handles ZWJ itself, and it is clever enough to still handle it even when asked to shape only the part of the text string after ZWJ. The NNBSP, on the other hand, seems to be handled by the font itself (using glyph substitutions), so when we ask for shaping only the part of the text string after NNBSP, font substitutions don’t get applied.
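
To illustrate the grouping described in this comment, here is a toy Python sketch of script itemization. It is not the LibreOffice code: `script_of` and `itemize_with_previous` are hypothetical names, and the script classification is deliberately crude (anything that is not Mongolian, NNBSP or ZWJ counts as "Western").

```python
import unicodedata

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE
ZWJ = "\u200d"    # ZERO WIDTH JOINER


def script_of(ch):
    """Crude script lookup for this sketch; None means 'no script of its own'."""
    if ch in (NNBSP, ZWJ):
        return None
    if unicodedata.name(ch, "").startswith("MONGOLIAN"):
        return "Mongolian"
    return "Western"


def itemize_with_previous(text):
    """Split text into (script, substring) runs.

    Characters without their own script are grouped with the previous
    character, as described above.
    """
    runs = []
    for ch in text:
        script = script_of(ch)
        if script is None:
            # Inherit from the run before; default to Western at text start.
            script = runs[-1][0] if runs else "Western"
        if runs and runs[-1][0] == script:
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(script, chunk) for script, chunk in runs]


if __name__ == "__main__":
    # Digit, NNBSP, MONGOLIAN LETTER U: the NNBSP lands in the Western run,
    # so the Mongolian run handed to the shaper starts after the NNBSP.
    for script, chunk in itemize_with_previous("3" + NNBSP + "\u1824"):
        print(script, [f"U+{ord(c):04X}" for c in chunk])
```

With this grouping the font only ever sees the Mongolian letter without the NNBSP in front of it, which matches the missing substitutions.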
Comment 17 Commit Notification 2023-07-27 21:50:04 UTC
Khaled Hosny committed a patch related to this issue.
It has been pushed to "master":


tdf#112594: Group NNBSP with the Mongolian characters after it

It will be available in 24.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:

Affected users are encouraged to test the fix and report feedback.
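
The idea named in the commit title can be sketched roughly as follows. This is an illustration in Python only, not the actual patch: `script_of` and `itemize_nnbsp_forward` are hypothetical names, and the script classification is a crude stand-in for real script itemization.

```python
import unicodedata

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE
ZWJ = "\u200d"    # ZERO WIDTH JOINER


def script_of(ch):
    """Crude script lookup for this sketch; None means 'no script of its own'."""
    if ch in (NNBSP, ZWJ):
        return None
    if unicodedata.name(ch, "").startswith("MONGOLIAN"):
        return "Mongolian"
    return "Western"


def itemize_nnbsp_forward(text):
    """Split text into (script, substring) runs.

    An NNBSP directly followed by a Mongolian character joins the
    Mongolian run after it; other script-less characters still follow
    the previous character.
    """
    runs = []
    for i, ch in enumerate(text):
        script = script_of(ch)
        if script is None:
            next_ch = text[i + 1] if i + 1 < len(text) else ""
            if ch == NNBSP and next_ch and script_of(next_ch) == "Mongolian":
                script = "Mongolian"  # look ahead instead of behind
            else:
                script = runs[-1][0] if runs else "Western"
        if runs and runs[-1][0] == script:
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(script, chunk) for script, chunk in runs]


if __name__ == "__main__":
    # Digit, NNBSP, MONGOLIAN LETTER U: the NNBSP now opens the Mongolian run.
    for script, chunk in itemize_nnbsp_forward("3" + NNBSP + "\u1824"):
        print(script, [f"U+{ord(c):04X}" for c in chunk])
```

Because the NNBSP is now part of the run that is shaped with the Mongolian font, the font's own substitutions after NNBSP have a chance to apply.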
Comment 18 Volga 2023-07-28 02:07:57 UTC
Is it possible to backport to 7.6 release channel?
Comment 19 Commit Notification 2023-07-31 10:29:23 UTC
Khaled Hosny committed a patch related to this issue.
It has been pushed to "libreoffice-7-6":


tdf#112594: Group NNBSP with the Mongolian characters after it

It will be available in 7.6.1.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:

Affected users are encouraged to test the fix and report feedback.