Created attachment 72979
ODT file demonstrating the bug.
Problem description: A typographical ligature is defined in http://en.wikipedia.org/wiki/Ligature_%28typography%29.
Steps to reproduce:
1. Write text in French (making sure the styling metadata marks it as French text).
2. Search for the strings "fi" and "ffi" and replace them with their Unicode ligature characters U+FB01 and U+FB03, respectively, in preparation for publication.
3. Start spellcheck.
Spellcheck correctly understands ligatures in English, but fails to recognize words containing the "fi" and "ffi" ligatures in French. However, searching for words with or without the ligature works correctly.
The spellchecker should translate ligatures to their decomposed form before performing the check. Alternatively, the dictionary should contain words in both their ligatured and decomposed forms, whichever approach LibreOffice uses for English text.
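To illustrate the first suggestion, here is a minimal Python sketch of decomposing f-ligatures before a word reaches the checker. It is not LibreOffice or Hunspell code; the function name `decompose_ligatures` is hypothetical, and the sketch assumes only the five Latin f-ligatures U+FB00 to U+FB04 need handling.

```python
import unicodedata

# The five Latin f-ligatures: ff, fi, fl, ffi, ffl (U+FB00..U+FB04).
LIGATURES = {chr(cp) for cp in range(0xFB00, 0xFB05)}


def decompose_ligatures(text):
    """Replace f-ligature presentation forms with their plain letters.

    Only the ligature code points are decomposed; accented letters such
    as "é" are left untouched, so dictionary lookups are not disturbed.
    """
    return "".join(
        unicodedata.normalize("NFKD", ch) if ch in LIGATURES else ch
        for ch in text
    )


if __name__ == "__main__":
    print(decompose_ligatures("di\ufb03cile"))  # word typed with U+FB03
    print(decompose_ligatures("\ufb01n"))       # word typed with U+FB01
```

Restricting the decomposition to the ligature range, rather than normalizing the whole string, avoids altering other characters that have compatibility decompositions.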
Operating System: Linux (Other)
Version: 22.214.171.124 release
You are not supposed to use presentation forms for ligatures (FB0x). Ligatures should instead be substituted automatically by the font for the literal "fi" and "ffi" combinations.
You can extend the French dictionary with Unicode ligature support, as the English dictionary does, by using a Unicode (UTF-8) encoded dictionary and adding the following definitions to the affix file:
ICONV ’ '
ICONV ﬃ ffi
ICONV ﬄ ffl
ICONV ﬀ ff
ICONV ﬁ fi
ICONV ﬂ fl
OCONV ' ’
(The apostrophe rules may not be required.)
I suggest using Graphite fonts with automatic ligature replacement: Linux Libertine G and Linux Biolinum G (shipped with LibreOffice) or SIL fonts.
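As a rough illustration of what these rules do, the Python sketch below mimics them as plain string substitution: ICONV rewrites the input word before lookup, and OCONV rewrites suggestions on output. This is an assumption-laden model, not Hunspell's implementation; the helper `convert` is hypothetical. Note also that Hunspell affix files normally declare the number of entries on a first line (for example `ICONV 6` and `OCONV 1`) before the pairs themselves.

```python
# Input conversions, mirroring the ICONV pairs listed above.
ICONV = {
    "\u2019": "'",    # typographic apostrophe -> ASCII apostrophe
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
}

# Output conversion, mirroring the single OCONV pair.
OCONV = {"'": "\u2019"}


def convert(word, table):
    """Apply every substitution in `table`, longest pattern first."""
    for pattern in sorted(table, key=len, reverse=True):
        word = word.replace(pattern, table[pattern])
    return word


if __name__ == "__main__":
    typed = "l\u2019o\ufb03ce"            # typed with U+2019 and U+FB03
    looked_up = convert(typed, ICONV)     # form checked against the dictionary
    suggested = convert(looked_up, OCONV) # form shown back to the user
    print(looked_up, suggested)
```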
(By the way, the Hungarian Lightproof module contains a historical option to underline the words with ff, fl, fi and replace with f-ligatures, if needed. Interestingly, this function helps to edit a social science journal, but the aim is to support OpenType in LibreOffice.)
Dear László Németh,
Thank you for your reply.
I successfully added ligature support to my dictionary by following your directions, and I have forwarded your recommendation upstream:
Alexandre de Verteuil.
Languages with 8-bit encoded spelling dictionaries (e.g. German, Spanish) now accept words containing Unicode f-ligatures or ZWNJ/ZWJ characters, see http://cgit.freedesktop.org/libreoffice/core/commit/?id=98029f1625663609d670f79eea61f7547bfc8123
@Alexandre: thanks for your bug report. With the new patch, it is unnecessary to convert the dictionaries to UTF-8, and only UTF-8 encoded dictionaries need extra options or an extension to recognize words with Unicode f-ligatures and ZWNJ/ZWJ characters.
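The idea behind handling 8-bit dictionaries can be sketched in Python as follows. This is only an illustration of the general approach (expand f-ligatures and drop ZWNJ/ZWJ before converting the word to the dictionary's encoding), not a transcription of the linked commit; the function name `to_8bit` and the choice of ISO-8859-1 as the default encoding are assumptions.

```python
# Zero-width non-joiner and zero-width joiner are simply removed.
ZERO_WIDTH = {0x200C: None, 0x200D: None}

# f-ligature presentation forms expand to plain letters.
F_LIGATURES = {
    0xFB00: "ff",
    0xFB01: "fi",
    0xFB02: "fl",
    0xFB03: "ffi",
    0xFB04: "ffl",
}

TABLE = {**ZERO_WIDTH, **F_LIGATURES}


def to_8bit(word, encoding="iso-8859-1"):
    """Clean a word so it can be encoded for an 8-bit dictionary."""
    return word.translate(TABLE).encode(encoding)


if __name__ == "__main__":
    # Without cleaning, the ligature cannot be encoded at all.
    try:
        "\ufb01n".encode("iso-8859-1")
    except UnicodeEncodeError:
        print("raw ligature is not representable in ISO-8859-1")
    print(to_8bit("Auf\u200clage"))  # German word typed with a ZWNJ
```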