Bug 59337 - EDITING: Lack of "fi" and "ffi" typographical ligatures support in auto-correct feature in french languages causes false positives.
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Localization
Version (earliest affected): 3.6.4.3 release
Hardware: Other Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: BSA
Keywords:
Depends on:
Blocks:
 
Reported: 2013-01-14 04:05 UTC by Alexandre de Verteuil
Modified: 2013-02-08 08:50 UTC

See Also:
Crash report or crash signature:


Attachments
ODT file demonstrating the bug. (13.19 KB, application/vnd.oasis.opendocument.text)
2013-01-14 04:05 UTC, Alexandre de Verteuil

Description Alexandre de Verteuil 2013-01-14 04:05:20 UTC
Created attachment 72979
ODT file demonstrating the bug.

Problem description: A typographical ligature is defined in http://en.wikipedia.org/wiki/Ligature_%28typography%29.

Steps to reproduce:
1. Write text in French (making sure the styling metadata marks it as French text).
2. Search for the strings "fi" and "ffi" and replace them with their ligature Unicode characters, U+FB01 and U+FB03 respectively, in preparation for publication.
3. Start spellcheck.

Current behavior:
Spellcheck correctly understands ligatures in English, but fails to recognize words containing the "fi" and "ffi" ligatures in French. Searching for words with or without the ligature works correctly, however.

Expected behavior:
The spellchecker should translate ligatures to their decomposed form before performing the check. Alternatively, the dictionary should contain words in both their ligatured and decomposed forms. Either way, French should behave the same as whatever LibreOffice does with English text.
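
A minimal sketch of the first option, in Python rather than in LibreOffice's own code: expand the Latin f-ligature presentation forms (U+FB00 to U+FB04) to plain letters before looking the word up. The names `decompose_ligatures` and `spellcheck`, and the use of a plain set as the dictionary, are assumptions made for illustration only.

```python
# Latin f-ligature presentation forms and their decomposed spellings.
F_LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def decompose_ligatures(word: str) -> str:
    """Replace each f-ligature code point with its plain-letter spelling."""
    return "".join(F_LIGATURES.get(ch, ch) for ch in word)


def spellcheck(word: str, dictionary: set) -> bool:
    """Hypothetical check: look up the decomposed form of the word."""
    return decompose_ligatures(word) in dictionary


french_words = {"difficile", "fin", "office"}  # stand-in for a real dictionary
print(spellcheck("di\ufb03cile", french_words))  # word typed with U+FB03
print(spellcheck("\ufb01n", french_words))       # word typed with U+FB01
```

With this approach the dictionary itself stays unchanged; only the word being checked is normalized.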
              
Operating System: Linux (Other)
Version: 3.6.4.3 release
Comment 1 Urmas 2013-01-15 13:33:26 UTC
You are not supposed to use presentation forms for ligatures (FB0x). They should be substituted automatically instead of literal "fi" and "ffi" combinations.
Comment 2 László Németh 2013-01-15 14:33:20 UTC
You can add Unicode ligature support to the French dictionary, as the English dictionary already has, by using a Unicode (UTF-8) encoded dictionary and adding the following definition to the affix file:

ICONV 6
ICONV ’ '
ICONV ﬃ ffi
ICONV ﬄ ffl
ICONV ﬀ ff
ICONV ﬁ fi
ICONV ﬂ fl
OCONV 1
OCONV ' ’

(The apostrophe entries may not be required.)
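
To illustrate what such an input conversion table does, here is a rough Python emulation. It is not Hunspell code: `parse_iconv` and `apply_iconv` are hypothetical helpers, and the longest-match-first replacement is an assumption about how overlapping patterns would be resolved. The ligatures are written as `\uXXXX` escapes so that each entry maps a single ligature code point (U+FB00 to U+FB04) to plain letters, and U+2019 to the ASCII apostrophe.

```python
# The ICONV section, with ligatures and the typographic apostrophe escaped.
ICONV_LINES = [
    "ICONV 6",
    "ICONV \u2019 '",
    "ICONV \ufb03 ffi",
    "ICONV \ufb04 ffl",
    "ICONV \ufb00 ff",
    "ICONV \ufb01 fi",
    "ICONV \ufb02 fl",
]


def parse_iconv(lines):
    """Read the entry count from the first line, then that many pairs."""
    count = int(lines[0].split()[1])
    table = {}
    for line in lines[1:1 + count]:
        _, pattern, replacement = line.split()
        table[pattern] = replacement
    return table


def apply_iconv(word, table):
    """Rewrite the input word, trying longer patterns first (assumed rule)."""
    patterns = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(word):
        for pattern in patterns:
            if word.startswith(pattern, i):
                out.append(table[pattern])
                i += len(pattern)
                break
        else:
            # No pattern matches here: copy the character unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


table = parse_iconv(ICONV_LINES)
print(apply_iconv("l\u2019o\ufb03ce", table))
```

The converted word is what would then be compared against the dictionary entries, which contain only plain letters.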

I suggest using Graphite fonts with automatic ligature replacement: Linux Libertine G and Linux Biolinum G (shipped with LibreOffice) or SIL fonts.

(By the way, the Hungarian Lightproof module contains a historical option to underline the words with ff, fl, fi and replace with f-ligatures, if needed. Interestingly, this function helps to edit a social science journal, but the aim is to support OpenType in LibreOffice.)
Comment 3 Alexandre de Verteuil 2013-01-18 03:22:41 UTC
Dear László Németh,

Thank you for your reply.

I successfully added ligature support to my dictionary by following your directions, and I have forwarded your recommendation upstream:
http://www.dicollecte.org/thread.php?prj=fr&t=317

Goodbye,

Alexandre de Verteuil.
Comment 4 László Németh 2013-02-08 08:45:54 UTC
Languages with 8-bit encoded spelling dictionaries (e.g. German, Spanish) now accept words containing Unicode f-ligatures or ZWNJ/ZWJ characters; see http://cgit.freedesktop.org/libreoffice/core/commit/?id=98029f1625663609d670f79eea61f7547bfc8123
Comment 5 László Németh 2013-02-08 08:50:40 UTC
@Alexandre: thanks for your bug report. With the new patch, it is unnecessary to convert the dictionaries to UTF-8, and only UTF-8 encoded dictionaries need extra options or extension to recognize the words with Unicode f-ligatures and ZWNJ/ZWJ characters.
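
As a rough illustration of why 8-bit dictionaries need this handling (a sketch only, not the code from the commit above): f-ligatures, ZWNJ (U+200C) and ZWJ (U+200D) have no representation in an 8-bit encoding such as ISO-8859-1, so a word containing them has to be cleaned up before it can be converted to the dictionary's encoding. The function name `to_dictionary_encoding` and the choice of ISO-8859-1 as the default are assumptions for the example.

```python
# Expand f-ligatures and drop zero-width (non-)joiners in a single pass.
NORMALIZE = str.maketrans({
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\u200c": None,  # ZWNJ, removed
    "\u200d": None,  # ZWJ, removed
})


def to_dictionary_encoding(word, encoding="iso-8859-1"):
    """Return the word as bytes in the dictionary encoding, or None if
    it still cannot be represented after normalization."""
    cleaned = word.translate(NORMALIZE)
    try:
        return cleaned.encode(encoding)
    except UnicodeEncodeError:
        return None


print(to_dictionary_encoding("Auf\u200clage"))  # German word with ZWNJ
print(to_dictionary_encoding("o\ufb01cina"))    # Spanish word with U+FB01
```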