Bug 127902 - Indic: Visarga characters should combine correctly after Vedic tone markers
Status: RESOLVED NOTOURBUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 6.3.2.2 release
Hardware: All Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Font-Rendering
Reported: 2019-10-01 11:23 UTC by Shriramana Sharma
Modified: 2019-10-15 03:50 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Screenshot illustrating bug (6.73 KB, image/png)
2019-10-01 11:23 UTC, Shriramana Sharma
Font supporting relevant characters (74.76 KB, application/octet-stream)
2019-10-02 11:31 UTC, Shriramana Sharma
Script to produce random test samples (1.91 KB, text/x-python)
2019-10-02 11:33 UTC, Shriramana Sharma

Description Shriramana Sharma 2019-10-01 11:23:03 UTC
Created attachment 154676 [details]
Screenshot illustrating bug

Please see the attachment showing the sequence दे॒वेभ्य॑ः. It was produced with the latest LibreOffice 6.3.2 on Kubuntu Bionic LTS. (I do not use Windows, so I did not test it there.)

A dotted circle appears before the visarga, but there should not be one.

The same is observed with other common Devanagari fonts like Noto Sans Devanagari and Lohit Devanagari. This needs to be fixed.

Reason:

Indic visarga characters are always spacing and are placed to the right of the syllable. Vedic tone markers, whether spacing or non-spacing, are always input before the visarga because they apply to the vowel that precedes the visarga.

So the expected sequence is:

syllable + zero or more tone markers + visarga

and so sequences like the example above where the visarga is placed after such a tone marker should not cause a dotted circle due to cluster breakup.

The file http://www.unicode.org/Public/UNIDATA/IndicSyllabicCategory.txt does an excellent job of listing the visarga characters and the tone markers under the sections:

Indic_Syllabic_Category=Visarga
Indic_Syllabic_Category=Cantillation_Mark
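
As an illustration of the expected sequence, here is a minimal Python sketch that lists the code points of the example word and checks whether a visarga directly follows a tone marker. The sets VISARGA and TONE_MARKS are a small hand-picked subset assumed for this example (U+0903 from the Visarga section, U+0951 and U+0952 from the Cantillation_Mark section); the real data file lists many more characters. The helper names are hypothetical and are not part of LibreOffice or HarfBuzz.

```python
import unicodedata

# Assumption: a tiny subset of IndicSyllabicCategory.txt, for illustration only.
VISARGA = {"\u0903"}                # DEVANAGARI SIGN VISARGA
TONE_MARKS = {"\u0951", "\u0952"}   # DEVANAGARI STRESS SIGN UDATTA / ANUDATTA


def describe(text):
    """Return (code point, character name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


def visarga_follows_tone_mark(text):
    """Return True if any visarga is immediately preceded by a tone marker."""
    return any(a in TONE_MARKS and b in VISARGA for a, b in zip(text, text[1:]))


# The example word from the report, written as escapes so the order is explicit:
# da, vowel sign e, anudatta, va, vowel sign e, bha, virama, ya, udatta, visarga
sample = "\u0926\u0947\u0952\u0935\u0947\u092D\u094D\u092F\u0951\u0903"

for code_point, name in describe(sample):
    print(code_point, name)

print("visarga directly after a tone marker:", visarga_follows_tone_mark(sample))
```

The last two code points are the udatta followed by the visarga, which is the position where the dotted circle shows up in the screenshot.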
Comment 1 V Stuart Foote 2019-10-01 14:22:47 UTC
Please provide a sample document (simple text runs in several of the fonts that support the combining glyphs).
Comment 2 Shriramana Sharma 2019-10-02 11:31:23 UTC
Created attachment 154697 [details]
Font supporting relevant characters
Comment 3 Shriramana Sharma 2019-10-02 11:33:23 UTC
Created attachment 154698 [details]
Script to produce random test samples
Comment 4 Shriramana Sharma 2019-10-02 11:36:00 UTC
Hello. There was some problem in auto-detecting and manually setting the content type of the font to font/ttf. Don't know why. So have set to application/octet-stream.

Anyway, I am not aware of any one publicly available font that supports all relevant characters since Vedic is a rare use case. I have uploaded the OFL Lohit Devanagari locally modified by myself adding the extra required characters.

I have also uploaded a Python script to produce test case sequences. Hope this would be sufficient.

Note that the script currently produces test cases only for Devanagari, as I only have a Vedic-supporting font for that script, but you can see that it can easily be switched to print random samples for other scripts as well.
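
For readers without access to the attachment, the following is a rough sketch of how such test sequences could be generated. It is not the attached script: the character lists, the function name random_sample, and the max_tones parameter are all assumptions made for illustration, following the pattern syllable + zero or more tone markers + visarga.

```python
import random

# Assumption: illustrative subsets only, not the data used by the attached script.
CONSONANTS = "\u0915\u0926\u092D\u092F\u0935"   # ka, da, bha, ya, va
VOWEL_SIGNS = ["", "\u093E", "\u0947"]          # inherent vowel, sign aa, sign e
TONE_MARKS = ["\u0951", "\u0952"]               # udatta, anudatta
VISARGA = "\u0903"


def random_sample(rng, max_tones=2):
    """Build one sequence: syllable + zero or more tone markers + visarga."""
    syllable = rng.choice(CONSONANTS) + rng.choice(VOWEL_SIGNS)
    tone_count = rng.randint(0, max_tones)
    tones = "".join(rng.choice(TONE_MARKS) for _ in range(tone_count))
    return syllable + tones + VISARGA


rng = random.Random(0)  # fixed seed so repeated runs give the same samples
samples = [random_sample(rng) for _ in range(5)]
print(" ".join(samples))
```

Pasting the printed line into a document set in a Vedic-capable font should show whether a dotted circle appears before any visarga.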
Comment 5 V Stuart Foote 2019-10-02 14:44:41 UTC
Confirmed on Windows 10 Home 64-bit en-US with
Version: 6.3.2.2 (x64)
Build ID: 98b30e735bda24bc04ab42594c85f7fd8be07b9c
CPU threads: 4; OS: Windows 10.0; UI render: GL; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded

with Devanagari (0x0900-0x097f) coverage in the Microsoft-provided Nirmala UI, and in the installed Code 2000 and Adobe Devanagari.

It only picks up the first in a sequence of combining glyphs.

Happens with both default GDI rendering, and with OpenGL rendering. 

Are multiple combining glyphs for CTL handled in HarfBuzz?
Comment 6 Khaled Hosny 2019-10-07 00:22:31 UTC
I can reproduce this with the hb-view utility from HarfBuzz, so it is a HarfBuzz issue. Please report it at https://github.com/harfbuzz/harfbuzz/issues
Comment 7 Shriramana Sharma 2019-10-15 03:50:30 UTC
Reported upstream as https://github.com/harfbuzz/harfbuzz/issues/2017.