Description:
Unfortunately, "t" and "y" are incorrectly interpreted as the digraph "ty" when transliterating into Old Hungarian.

Steps to Reproduce:
Thank you for the "Transliteration to Old Hungarian" feature, which arrived in LibreOffice 7.0:
https://wiki.documentfoundation.org/ReleaseNotes/7.0#Transliteration_to_Old_Hungarian

Using the template at
https://wiki.documentfoundation.org/images/8/8f/Sz%C3%A9kely_%C3%ADr%C3%A1s_sablondokumentum_be%C3%A1gyazott_Noto_bet%C5%B1k%C3%A9szlettel.ott
there are some errors. We tried it with the following words:

1. Q=KV, e.g. Aquincum 𐲀𐳓𐳮𐳐𐳙𐳄𐳪𐳘
2. X=KSZ, e.g. taxi 𐳦𐳀𐳓𐳥𐳐
3. Y=I, e.g. Vörösmarty 𐲮𐳞𐳢𐳞𐳤𐳘𐳀𐳢𐳨
4. W=V, e.g. Wesselényi 𐲮𐳉𐳤𐳤𐳉𐳖𐳋𐳚𐳐

The 3rd output seems incorrect: shouldn't 𐲮𐳞𐳢𐳞𐳤𐳘𐳀𐳢𐳨 be 𐲮𐳞𐳢𐳞𐳤𐳘𐳀𐳢𐳦𐳐?

This report will not display the Old Hungarian characters if the fonts are not installed on your computer, so an image will be attached to this issue.

Thank you once again.

Actual Results:
Vörösmarty is transliterated to 𐲮𐳞𐳢𐳞𐳤𐳘𐳀𐳢𐳨

Expected Results:
Vörösmarty should be transliterated to 𐲮𐳞𐳢𐳞𐳤𐳘𐳀𐳢𐳦𐳐

Reproducible: Always

User Profile Reset: No

Additional Info:
Response from Németh László:

This is wrong, indeed. I started to add some exceptions to the Numbertext dictionary, see https://github.com/Numbertext/libnumbertext/blob/master/data/hu_Hung.sor, but the real solution will be the planned replacement of the pattern-based hyphenation with (or its combination into) a dictionary+pattern based one. The recent Hungarian spelling dictionary already has the required data:

$ hunspell -d hu_HU -m Vörösmarty
Vörösmarty st:Vörösmarty po:noun_prs ts:NOM hy:Vö-rös|mar-t.y

Here the dot in "t.y" means that this is not the Hungarian letter "ty", but a "t" followed by a "y" (pronounced as "i").
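The `hy:` field quoted above can be read mechanically. The following is only a minimal sketch of that idea, assuming the dot convention exactly as described in the response (a dot between two letters means they do not form a digraph). The function name `separated_pairs` and the digraph list are illustrative assumptions, not part of Hunspell or libnumbertext.

```python
import re
from typing import List

# Two-letter Hungarian digraphs (assumed list for this sketch).
DIGRAPHS = {"cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs"}


def separated_pairs(morph_line: str) -> List[str]:
    """Return the letter pairs that the hy: field marks as NOT being a digraph.

    morph_line is one line of `hunspell -m` output, such as the one quoted
    in the report. Only the hy: field is inspected.
    """
    hy_value = ""
    for token in morph_line.split():
        if token.startswith("hy:"):
            hy_value = token[3:]
            break

    pairs = []
    # A dot between two letters (e.g. "t.y") separates would-be digraph parts.
    for match in re.finditer(r"(\w)\.(\w)", hy_value):
        pair = (match.group(1) + match.group(2)).lower()
        if pair in DIGRAPHS:
            pairs.append(pair)
    return pairs


line = "Vörösmarty st:Vörösmarty po:noun_prs ts:NOM hy:Vö-rös|mar-t.y"
print(separated_pairs(line))  # → ['ty']
```

A transliterator could use such a result to emit "t" and "i" separately instead of the single letter "ty".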
Created attachment 164965: Image showing characters
@László Németh I think it's your case.
@László Németh @Óvári Sorry, I didn't notice that it was already assigned. I only looked at the CC list.
László Németh committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/e6165b7cac5d91458d61da3de35486cde3004897 tdf#136368 bump to libnumbertext 1.0.7 It will be available in 7.2.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Are there any other "t" + "y" words that need fixing, or should this issue be closed? Thank you
László Németh committed a patch related to this issue. It has been pushed to "libreoffice-7-1": https://git.libreoffice.org/core/commit/130445d231dc0c8af9148edd234f16424d0a16aa tdf#136368 bump to libnumbertext 1.0.7 It will be available in 7.1.1.
(In reply to Óvári from comment #5)
> Are there any other "t" + "y" words that need fixing or should this issue be
> closed? Thank you

Common and less common exceptions are handled correctly now:

ty: Feszty, Haraszty, Huszty, Mindszenty, Noszty, Pesty, Vörösmarty, city, zloty...
ly: Áprily, Dolly, Ély, Hollywood, grizzly, Kéthly, Reguly, Thököly...
ny: Hatvany, Sony, penny

So we can close the issue. Thanks for reporting the problem!
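The exception handling described in this comment can be illustrated with a small letter splitter. This is a hedged sketch only: it is not the libnumbertext implementation (which uses rules in `hu_Hung.sor`), the names `greedy_letters`, `letters` and `SAMPLE_EXCEPTIONS` are made up for the example, the exception set is a small sample of the words listed above, and it covers only the word-final case where the closing "y" is read as "i".

```python
from typing import List

# Hungarian multi-character letters (assumed list for this sketch).
DIGRAPHS = {"cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs"}
TRIGRAPH = "dzs"

# Sample of the exceptions named in this report, lowercased.
SAMPLE_EXCEPTIONS = {
    "vörösmarty", "feszty", "mindszenty", "city",
    "áprily", "reguly", "thököly",
    "hatvany", "sony", "penny",
}


def greedy_letters(text: str) -> List[str]:
    """Split text into Hungarian letters, always preferring the longest match."""
    out, i = [], 0
    while i < len(text):
        if text.startswith(TRIGRAPH, i):
            step = 3
        elif text[i:i + 2] in DIGRAPHS:
            step = 2
        else:
            step = 1
        out.append(text[i:i + step])
        i += step
    return out


def letters(word: str) -> List[str]:
    """Split a word into letters, treating a final "y" in exception words as "i"."""
    lower = word.lower()
    if lower in SAMPLE_EXCEPTIONS and lower.endswith("y"):
        # The final consonant + "y" is not a digraph here: drop the "y",
        # split the rest as usual, and append "i" for the pronunciation.
        return greedy_letters(lower[:-1]) + ["i"]
    return greedy_letters(lower)


print(letters("Vörösmarty"))  # → ['v', 'ö', 'r', 'ö', 's', 'm', 'a', 'r', 't', 'i']
print(letters("kutya"))       # → ['k', 'u', 'ty', 'a']
```

Exceptions with a non-final "y", such as "Hollywood", would need per-position data like the dictionary's `hy:` field rather than a plain word list.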
Thanks for fixing.

1. Is "Batthyány" on the list? https://en.wikipedia.org/wiki/Batthy%C3%A1ny
2. Do you know of an Internet list of these exceptions?

Thank you once again.
I have reopened the issue. I am going to fix it at the dictionary level to support future extensions and more exceptions. @Óvári: thanks for the report!
@Óvári: You are right, “Batthyány” is still missing. I am going to add it soon.

I have already added the following ~100 exceptions to the Hungarian spelling dictionary, which will be the basis for further refinement of the transliteration:

Adriany, Áprily, Árokháty, assembly, Balatony, Belatiny, Blattny, Boháty, Boroviczény, Bölöny, Brooklyn, Champs-Élysées, city, Csernátony, Delly, Dicenty, Dolák-Saly, Dolly, Duronelly, ecstasy, Édeskuty, Élysée-palota, Fabiny, Feszty, Finály, Folly, grizzly, Haraszty, Hatvany, Hefty, Huszty, Illy, intercity, Istvány, jolly, Jóny, Kacziány, Kamilly, Kelety, Kereszty, Kerny, Kertbeny, Kéthly, Kétly, Kismarty, Kmety, Kmetty, Kukorelly, Lacsny, Lyme-kór, lymphocyta, Lyon, Mindszenty, Noszty, Novotny, Olty, Patrubány, Peéty, penny, Peregriny, Perity, Pesty, Pewny, Plymouth, Povolny, Purgly, Reguly, Rezsny, Rosty, royalty, Saágy, Saly, Schrotty, Schwotty, Serly, Sexty, Sony, Spergely, Splény, Stáhly, Szentkuty, Szentmártony, Szily, Szombaty, Sztevanovity, Thaly, Thököly, Veszely, Vizkelety, Vízkelety, Vlaszaty, Volny, Vörösmarty, Wény, Wessely, Weszely, Wolny, złoty, Zsivny
@Németh: Does "Horty" work correctly? https://hu.wikipedia.org/wiki/Soltszentimre Thank you once again
(In reply to Óvári from comment #11)
> @Németh: Does "Horty" work correctly?
> https://hu.wikipedia.org/wiki/Soltszentimre
>
> Thank you once again

The Wikipedia editors made a mistake: "Horthy Miklós" is the correct form.
László Németh committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/98fd4fcdc61202846e0957cb6aaed9e4a2d2c520 tdf#136368 bump to libnumbertext 1.0.8 It will be available in 7.4.0.
Fixed in master, and in 7.3 soon with libnumbertext 1.0.8. Changes:

- fix transliteration of old Hungarian family names, bug report by Zoltán Óvári
- fix transliteration of numbers 100–199, 1000–1999, 1000000–1999999 and 1000000000–1999999999 (bad ordering)
- fix conversion of single letters "í", "Í" and "NY"
- fix unnecessary conversion of words ending with "q", e.g. "IQ"
- fix unnecessary conversion of words not ending with unknown letters
László Németh committed a patch related to this issue. It has been pushed to "libreoffice-7-3": https://git.libreoffice.org/core/commit/118b6dddad55e00b1ae596db344c6672a1d4d4c3 tdf#136368 bump to libnumbertext 1.0.8 It will be available in 7.3.0.2.
(In reply to László Németh from comment #10)
> @Óvári: You are right, “Batthyány” is still missing. I am going to add it
> soon.
>
> I have already added the following ~100 exceptions to the Hungarian spelling
> dictionary, which will be the base of further refinement of the
> transliteration:
>
> Adriany, Áprily, Árokháty, assembly, Balatony, Belatiny, Blattny, Boháty,
> Boroviczény, Bölöny, Brooklyn, Champs-Élysées, city, Csernátony, Delly,
> Dicenty, Dolák-Saly, Dolly, Duronelly, ecstasy, Édeskuty, Élysée-palota,
> Fabiny, Feszty, Finály, Folly, grizzly, Haraszty, Hatvany, Hefty, Huszty,
> Illy, intercity, Istvány, jolly, Jóny, Kacziány, Kamilly, Kelety, Kereszty,
> Kerny, Kertbeny, Kéthly, Kétly, Kismarty, Kmety, Kmetty, Kukorelly, Lacsny,
> Lyme-kór, lymphocyta, Lyon, Mindszenty, Noszty, Novotny, Olty, Patrubány,
> Peéty, penny, Peregriny, Perity, Pesty, Pewny, Plymouth, Povolny, Purgly,
> Reguly, Rezsny, Rosty, royalty, Saágy, Saly, Schrotty, Schwotty, Serly,
> Sexty, Sony, Spergely, Splény, Stáhly, Szentkuty, Szentmártony, Szily,
> Szombaty, Sztevanovity, Thaly, Thököly, Veszely, Vizkelety, Vízkelety,
> Vlaszaty, Volny, Vörösmarty, Wény, Wessely, Weszely, Wolny, złoty, Zsivny

@Németh: Several of these names and words still appear incorrectly.

Remark: Sztevanovity is Sz-t-e-v-a-n-o-v-i-ty, that is, it ends in the letter "ty". It is the name of a musician whose paternal line has Serbian ancestors. Strictly speaking, Jelačić should be written as "Jelacsity"; calling him "Jelasics" is a Hungarianization.
Created attachment 177286: An example of an Old Hungarian number.

@László Német This number can be read on page 2 of the e-book "Arany János összes költeménye - All the poems of János Arany", downloadable from: https://mek.oszk.hu/05600/05694/05694.pdf
@László Németh: Sorry about the spelling mistake in your surname in the previous comment.