Created attachment 54539
Current behavior with new Hyphen
Version ID : 7362ca8-b5a8e65-af86909-d471f98-61464c4
Unwanted behaviors due to Hyphen 2.8.3 (2011/10/10) in hyphenated compounds with apostrophe.
sot=l’y=laisse > sotl'-[y=laisse
va=t’en > vat'-[en
Expected behaviors:
sot=l’y=laisse > sot=[l’y=[laisse
va=t’en > va=[t'en
Workaround: explicit declaration of NEXTLEVEL in the pattern dictionary file.
Sent a mail to László Németh (2011/12/18).
It seems the following explicit declaration fixes the problem:
I suggest to remove all of your extra patterns with hyphens and apostrophes.
By the way, this is the recent implicit declaration, with the problematic hyphen replacement (it was the fix for the old double hyphens when hyphenating at hard hyphens with Hyphen patterns; I can no longer reproduce that problem, so maybe a parallel fix in LibreOffice caused the problem with the French etc. hyphenation):
I will fix it in the source of LibreOffice, too.
Created attachment 54636
Fixed in LibreOffice: http://cgit.freedesktop.org/libreoffice/core/commit/?id=4a3ca24020bdaa956acbefd911e688917c7fa3dd
Now the French hyphenation works well without explicit NOHYPHEN and NEXTLEVEL declarations in the hyphenation dictionary. See the attached test file.
(In reply to comment #3)
> Fixed in LibreOffice:
> Now the French hyphenation works well without explicit NOHYPHEN and NEXTLEVEL
> declarations in the hyphenation dictionary. See the attached test file.
Thank you, László! I will check this as soon as I can afford it.
By the way, I noticed that the NOHYPHEN parameter has been problematic from the start for French words (when there are no explicit additional patterns) :-/
caolanm->László: FWIW I got a little confused in this area today. i.e. I bumped hyphen to version 2.8.3 in master and I believe I accidentally wiped out this fix (for master, not 3-5) because the change happened inside the hyphen-2.7.1-2.8.3 patch which I presumed just upgraded 2.7.1 to 2.8.3 :-(
This fix isn't actually in hyphen-2.8.3, right? And it isn't in upstream hyphen CVS for the to-be 2.8.4 yet either, right? (Should it be?)
I think I restored this specific fix as http://cgit.freedesktop.org/libreoffice/core/commit/?id=84897d4b3b2a0e4719b00fb06abb8c04e3c20c24
(In reply to comment #5)
nemeth->Caolán: For now this is only a quick fix for LibreOffice. If this modification sporadically results in double hyphens when hyphenating near hard hyphens (I wasn't able to reproduce this problem with the recent 3.5 master), I will limit the patch to French (using explicit patterns) before searching for the root of the apostrophe problem (in hyphen, lingucomponent or another module). The original hyphenation algorithms (the default hyphenation at hard hyphens and the libhyphen-based hyphenation) did not conflict, but since OOo 3.3 libhyphen also gets words with hyphens, and now we need to hyphenate at hard hyphens to fix the frequent missing hyphenation (not only at hard hyphens).
Thanks for preserving the patch. I will test it again in beta 2 and the recent source.
I have found double hyphens in my tests (e.g. in the word "va-t’en-touil") and some other anomalies (e.g. forced breaks at every position of "touil", without hyphens, in va-t’en-touil), so I completely removed the libhyphen-based breaks at hard hyphens:
The implicit declaration:
These patterns fix the bad hyphenation of words with hard hyphens (caused by the different word boundaries), but now they don't fix the frequent missing hyphenation at hard hyphens (caused by the competing hyphenation mechanisms).
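To illustrate the double-hyphen symptom found with "va-t’en-touil", here is a minimal Python sketch. It is not the libhyphen or LibreOffice code; the function names `naive_break_at` and `break_at` are hypothetical, and the break positions are chosen by hand for the example. It only shows why a line break that falls right after a hard hyphen must not get a second, inserted hyphen.

```python
from typing import Tuple

HARD_HYPHEN = "-"


def naive_break_at(word: str, pos: int) -> Tuple[str, str]:
    """Always append a hyphen to the first part (hypothetical buggy variant)."""
    return word[:pos] + HARD_HYPHEN, word[pos:]


def break_at(word: str, pos: int) -> Tuple[str, str]:
    """Split `word` before index `pos`; return (end of line, start of next line).

    A hyphen is appended only if the first part does not already end
    with a hard hyphen, so no double hyphen can appear.
    """
    head, tail = word[:pos], word[pos:]
    if not head.endswith(HARD_HYPHEN):
        head += HARD_HYPHEN
    return head, tail


if __name__ == "__main__":
    word = "va-t’en-touil"
    print(naive_break_at(word, 3))  # first part ends with two hyphens
    print(break_at(word, 3))        # first part keeps only the hard hyphen
    print(break_at(word, 10))       # break inside "touil" gets one hyphen
```

The naive variant corresponds to treating a hard hyphen like any other pattern-based break point, which is an assumption about the cause made only for this illustration.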
> Original hyphenation algorithms (the default hyphenation at hard
> hyphens and the libhyphen based hyphenation) had no conflict, but from OOo
> 3.3 libhyphen gets words with hyphens, too, and now we need to hyphenate at
> hard hyphens to fix the frequent missing hyphenation (not only at hard
Sorry, to be precise: the hyphenation can miss only hard hyphens, when there is enough space (for a libhyphen-based search for a potential hyphenation break after the hard hyphen) and there is an appropriate hyphenation point before the hard hyphen. E.g. "eighteen-year-old" will be hyphenated as "eigh=teen-year-old" instead of the possible "eighteen=year-old", especially when there is more free space in the line.
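The "eighteen-year-old" case can be sketched in Python as a choice between competing break candidates. This is only an illustration under stated assumptions, not how LibreOffice or libhyphen select breaks: the function names are hypothetical, the pattern-based break position (after "eigh") is entered by hand rather than computed from a dictionary, and width is counted in characters. The sketch merges pattern-based breaks with breaks after hard hyphens and takes the rightmost one that fits, so the hard hyphen is not missed when there is room for it.

```python
from typing import Iterable, List, Optional


def hard_hyphen_breaks(word: str) -> List[int]:
    """Break positions directly after each inner hard hyphen."""
    return [i + 1 for i, ch in enumerate(word)
            if ch == "-" and 0 < i < len(word) - 1]


def line_part_width(word: str, pos: int) -> int:
    """Characters needed on the line when breaking before `pos`.

    A break after a hard hyphen needs no extra hyphen character.
    """
    head = word[:pos]
    return len(head) if head.endswith("-") else len(head) + 1


def best_break(word: str, pattern_breaks: Iterable[int],
               room: int) -> Optional[int]:
    """Rightmost break (pattern-based or hard hyphen) that fits in `room`."""
    candidates = set(pattern_breaks) | set(hard_hyphen_breaks(word))
    fitting = [p for p in candidates if line_part_width(word, p) <= room]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    word = "eighteen-year-old"
    pattern_breaks = [4]  # assumed: "eigh=teen", as in the example above
    pos = best_break(word, pattern_breaks, room=10)
    print(word[:pos], "|", word[pos:])
```

With only the pattern-based candidate, the same room of 10 characters would yield the break after "eigh", which is the behaviour described in the comment.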