Created attachment 198547
List of common typos

The current Thai spell checking is of poor quality. The main cause is the dictionary-based word segmentation that runs before checking: it does not work well with misspelled words, so the spell checker receives incomplete word fragments, which by nature carry insufficient information.

However, even if the word boundaries were not a problem, there would still be certain classes of typos whose correct spellings are not suggested. I will focus on that improvement in this bug. Let's discuss the word boundary problem in another bug.

Some examples of such typos whose correct spellings are not suggested:

- กระเฌอ กะเชอ กะเฌอ (correct: กระเชอ)
- กะเลวกะลาด (correct: กเฬวราก)
- กอร์ป (correct: กอปร)
- กาชาติ (correct: กาชาด)
- การะบูน (correct: การบูร)
- กุมภกัณฐ์ (correct: กุมภกรรณ)
- เกษา (correct: เกศา)
- เกาท์ (correct: เกาต์)
- ขบฏ (correct: ขบถ)
- คะมักคะเม่น (correct: ขะมักเขม้น)
- ข้าวโภช (correct: ข้าวโพด)
- ขี้เฒ่า ขี้เท่า (correct: ขี้เถ้า)
- คันลอง คัลลอง (correct: ครรลอง)

And many others.

I have created a list of common typos for testing with the 'hunspell -d th_TH <file>' command line. Each line holds words/phrases separated by spaces: the first word is the correct spelling, and the rest are typos.

Expected result: the first word should be included in the suggestion list.
Actual result: some entries fail to suggest, some are OK.
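For anyone who wants to automate the check described above, here is a minimal Python sketch. It assumes the list format stated in this comment (first word correct, the rest typos, separated by spaces). The `suggest` callable is hypothetical: it stands in for whatever returns Hunspell's suggestions for one word, and it is not part of any patch in this bug.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def parse_typo_list(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (correct, typos) from lines of the form 'correct typo1 typo2 ...'."""
    for line in lines:
        words = line.split()
        if len(words) < 2:
            # Blank line, or an entry that lists no typos: nothing to test.
            continue
        yield words[0], words[1:]


def find_missing_suggestions(
    lines: Iterable[str],
    suggest: Callable[[str], List[str]],  # hypothetical suggestion backend
) -> List[Tuple[str, str]]:
    """Return (typo, correct) pairs where 'correct' is not among the suggestions."""
    failures = []
    for correct, typos in parse_typo_list(lines):
        for typo in typos:
            if correct not in suggest(typo):
                failures.append((typo, correct))
    return failures


if __name__ == "__main__":
    # Stub backend for illustration only; a real run would query Hunspell.
    stub = {"กาชาติ": ["กาชาด"]}
    sample = ["กาชาด กาชาติ", "การบูร การะบูน"]
    for typo, correct in find_missing_suggestions(sample, lambda w: stub.get(w, [])):
        print(f"{typo}: expected suggestion {correct} is missing")
```

Keeping the suggestion backend as a parameter means the same check can be run before and after a dictionary change without touching the parsing code.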
Did you try the ph tag in your .dic file? It will always suggest the correct word. For example:

กาชาด ph: กาชาติ
การบูร ph: การะบูน
กุมภกรรณ ph: กุมภกัณฐ์
(In reply to Shantanu from comment #1)
> Did you try ph tag in your .dic file? It will always suggest the correct
> word. For e.g.
>
> กาชาด ph: กาชาติ
> การบูร ph: การะบูน
> กุมภกรรณ ph: กุมภกัณฐ์

Yes, I've tried it in a previous patch, and will address these cases with it. Thanks for mentioning it. Any other suggestions are welcome.
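As a rough illustration of the ph approach, the typo list could be turned into candidate .dic entries mechanically. This is only a sketch under assumptions, not the method used in any patch here: the function name is made up, the field is written as `ph:typo` with no space after the colon (the form shown in the hunspell(5) man page), and existing affix flags and the word count on the first line of a real .dic file are ignored.

```python
from typing import Iterable, List


def to_dic_entries(lines: Iterable[str]) -> List[str]:
    """Turn 'correct typo1 typo2 ...' lines into 'correct ph:typo1 ph:typo2' entries."""
    entries = []
    for line in lines:
        words = line.split()
        if len(words) < 2:
            # Blank line, or no typo listed: no ph field to generate.
            continue
        correct, typos = words[0], words[1:]
        # One ph: field per typo, separated by single spaces.
        fields = [f"ph:{typo}" for typo in typos]
        entries.append(" ".join([correct] + fields))
    return entries


if __name__ == "__main__":
    for entry in to_dic_entries(["กาชาด กาชาติ", "ขี้เถ้า ขี้เฒ่า ขี้เท่า"]):
        print(entry)
```

The generated lines would still have to be merged by hand with the entries already present in the dictionary, since a word that exists there may carry flags that this sketch does not know about.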
Proposed patch in gerrit: https://gerrit.libreoffice.org/c/dictionaries/+/180311