Bug 48191 - Improper choice of features based on text language.
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.5.2 release (earliest affected)
Hardware: All All
Importance: medium major
Assignee: Caolán McNamara
URL:
Whiteboard: target:3.7.0
Keywords:
Depends on:
Blocks: 53584
Reported: 2012-04-02 05:32 UTC by Steve White
Modified: 2012-08-16 10:48 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Shows failure to apply features for default language in Hindi etc. (18.51 KB, application/octetstream)
2012-04-02 05:32 UTC, Steve White
the source LO document showing the problem (80.00 KB, application/vnd.msword)
2012-08-11 14:53 UTC, Steve White
HTML file with same text (should work right in FireFox) (1.58 KB, text/html)
2012-08-11 14:54 UTC, Steve White
Hindi text comparison (159.94 KB, image/png)
2012-08-11 14:55 UTC, Steve White
Sinhala text comparison (40.53 KB, image/png)
2012-08-11 14:56 UTC, Steve White
how the test doc appears on my system (275.54 KB, image/png)
2012-08-11 15:46 UTC, Steve White
screenshot from master (180.65 KB, image/png)
2012-08-14 09:09 UTC, Caolán McNamara
Some remaining problems in Hindi text (180.71 KB, image/png)
2012-08-14 13:25 UTC, Steve White

Description Steve White 2012-04-02 05:32:13 UTC
Created attachment 59375 [details]
Shows failure to apply features for default language in Hindi etc.

Hi,

I'm using a build from a Git pull, modified with Caolán's patch from
https://bugs.freedesktop.org/show_bug.cgi?id=31821
With this, a lot of OpenType features are working that were not working before.

However, it may have uncovered a different problem.
See the attached file, which compares text in the Indic languages Sinhala and Hindi, from FreeSerif and Lohit Hindi.

The Sinhala text in FreeSerif is largely right.  This means a lot of font features are being applied (it would be garbled otherwise, with overlapping parts of letters etc).

The Hindi text from FreeSerif is wrong.  It looks as if no features are being applied.  Notice the diagonal strokes below the letters, and the higher strokes that appear beside the letters but disconnected from them.  Compare this text to the same text in FreeSans and Lohit Hindi -- instead of strokes below, the corresponding letters are transformed, and the higher strokes are placed over the letters.

This is my guess as to what is happening.

The Sinhala range in FreeSerif, and the Hindi range in Lohit Hindi, are activated by script{language} 
    sinh{dflt} 
and 
    deva{dflt} 
respectively.

However, in FreeFont a distinction is made between the default language (Hindi) and Sanskrit (both using the Devanagari script).  Most features are activated by
   dev2{SAN ,dflt} deva{SAN ,dflt}
However, in FreeSerif alone, some features are activated only for Sanskrit by
   dev2{SAN } deva{SAN }

In this text, no language was specified (or rather, it's set to the default of English).
So swriter should apply the Devanagari features marked 'dflt'.

Instead, it's applying *no* Devanagari features.  
This is evidently because some features for that script are specified to activate for a specific language.

Pango and Firefox get this right.
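
As an illustration of the guess above, here is a minimal Python sketch of the two selection behaviours. It is not LibreOffice code: the table layout, the function names, and the feature tags listed for each language system are assumptions made for the example. The fallback rule itself (use the script's default language system when the requested language has no entry of its own) is what the OpenType specification prescribes.

```python
# Hypothetical stand-in for a font's script list:
# script tag -> language system tag -> feature tags.
# The feature tags are illustrative, not FreeSerif's real contents.
FONT = {
    "sinh": {"dflt": ["akhn", "blwf", "abvs"]},
    "deva": {
        "dflt": ["akhn", "rphf", "half", "abvs"],
        "SAN ": ["akhn", "rphf", "half", "abvs", "pres"],  # extra Sanskrit-only feature
    },
}


def select_with_fallback(font, script, language):
    """Expected behaviour: use the language's own entry if present,
    otherwise fall back to the script's 'dflt' language system."""
    langsys = font.get(script, {})
    if language in langsys:
        return langsys[language]
    return langsys.get("dflt", [])


def select_suspected(font, script, language):
    """Model of the behaviour guessed at in this report (an assumption):
    'dflt' is used only when the script has no language-specific entries."""
    langsys = font.get(script, {})
    if language in langsys:
        return langsys[language]
    if set(langsys) == {"dflt"}:
        return langsys["dflt"]
    return []  # script has language-specific entries, none match: nothing applied


if __name__ == "__main__":
    for script in ("sinh", "deva"):
        print(script, "expected :", select_with_fallback(FONT, script, "ENG "))
        print(script, "suspected:", select_suspected(FONT, script, "ENG "))
```

With the text language left at English ('ENG '), the suspected behaviour still returns the 'dflt' features for sinh but returns nothing for deva, which matches the symptoms described: Sinhala largely right, Hindi with no features applied.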
Comment 1 Steve White 2012-04-02 06:00:22 UTC
Forgot to mention: this is using the development version of FreeFont.
Comment 2 leighman 2012-08-06 19:00:42 UTC
Hi Steve,

Could you please provide an example of what a correct sample looks like in Firefox, as well as the version of FreeFont that you are using?

Subscribing caolanm for font-fu
Comment 3 Steve White 2012-08-11 14:53:32 UTC
Created attachment 65429 [details]
the source LO document showing the problem
Comment 4 Steve White 2012-08-11 14:54:41 UTC
Created attachment 65430 [details]
HTML file with same text (should work right in FireFox)
Comment 5 Steve White 2012-08-11 14:55:30 UTC
Created attachment 65431 [details]
Hindi text comparison
Comment 6 Steve White 2012-08-11 14:56:13 UTC
Created attachment 65432 [details]
Sinhala text comparison
Comment 7 Steve White 2012-08-11 14:57:03 UTC
Hi,

I now see the PDF was not a good idea.  I thought the fonts would be embedded, and some were, but others weren't.

I'll replace this with freshly-made images, as well as the original documents.

These are made with distro LO 3.5.2.  

Unfortunately, since the bug was reported, I lost my build with the patch, and I don't know if this LO has the patch I mentioned.  It doesn't really matter: as I said, the issue at hand seems to be independent of the patch.

Cheers!
Comment 8 Steve White 2012-08-11 15:46:17 UTC
Created attachment 65437 [details]
how the test doc appears on my system

Better illustration of the problem than the previous images.
Comment 9 Caolán McNamara 2012-08-14 09:09:10 UTC
Created attachment 65535 [details]
screenshot from master
Comment 10 Caolán McNamara 2012-08-14 09:10:49 UTC
Using freefont-otf-20120503, the above is the screenshot I now get with today's master. Looks good to me now.
Comment 11 Steve White 2012-08-14 11:03:05 UTC
I concur that your image looks much better -- the egregious errors I was seeing before are gone.

I'm still struggling to get the latest pull built, so I can't test it myself...

Thanks VERY MUCH for all your work on this!
Comment 12 Steve White 2012-08-14 13:25:59 UTC
Created attachment 65545 [details]
Some remaining problems in Hindi text
Comment 13 Steve White 2012-08-14 13:27:45 UTC
I had Zdenek Wagner look at the latest, and he found
a couple of problems remaining in the Hindi text.
See attachment.
The more serious (in red) is a character replacement
that still isn't happening.  The other is a spacing issue.

This may be a separate bug, I don't know.
It will take me some time to sort out what's going on here.

Nobody said Indic scripts were going to be easy...
Comment 14 Steve White 2012-08-15 19:11:15 UTC
Harshula remarked to me that in your image, the second word in the Sinhala
  නු
seems to be missing a thin vertical line on its right side.

(Otherwise he says the text is right.)

Is this just an artifact of making the image, or is it in the original rendering of the word?
Comment 15 Caolán McNamara 2012-08-16 10:44:04 UTC
Re comment #14: this vertical line on the right side of නු exists on screen. Presumably its absence is an artefact of scaling the PNG or some such.
Comment 16 Caolán McNamara 2012-08-16 10:48:20 UTC
Re #13: let's follow up in bug #53584.