Created attachment 191134
Patch to add Thai Autocorrect data

I would like to add Autocorrect data for Thai that corrects commonly misspelled words.

Because Thai script has no word delimiter, the matching patterns use both left and right wildcards, so that a misspelled word is matched and fixed at any position in a text chunk. For example, ".*กงศุล.*" -> "กงสุล" fixes the text chunk "สถานกงศุลใหญ่" to "สถานกงสุลใหญ่".

This, however, may require an additional adjustment to the current matching behavior to make it complete. The current implementation stops as soon as the first pattern is matched, so only one replacement takes place even when the text chunk contains more than one typo.

For example, suppose the Autocorrect rule set contains only 2 rules:
- ".*กงศุล.*" -> "กงสุล"
- ".*อนุญาติ.*" -> "อนุญาต"
and the input text chunk contains 2 typos: "ขออนุญาติจากสถานกงศุลใหญ่".

Assuming the rules are matched in order, the current implementation matches only the first rule, and the text chunk becomes:
"ขออนุญาติจากสถานกงสุลใหญ่"
whereas the desired result is:
"ขออนุญาตจากสถานกงสุลใหญ่"
where both typos are fixed.

So I'm proposing 2 patches, one for the data and the other for the code.
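The difference between the two matching behaviors described above can be sketched as follows. This is only a minimal Python model, not the LibreOffice C++ code: the function names `correct_first_match` and `correct_all_matches` are hypothetical, and the ".*word.*" wildcard patterns are modeled as plain substring replacement.

```python
# Hypothetical model of the two behaviors; not the LibreOffice implementation.
# Each rule is (misspelling, correction); the ".*...*" wildcards are modeled
# as "match the misspelling anywhere in the chunk".
RULES = [
    ("กงศุล", "กงสุล"),
    ("อนุญาติ", "อนุญาต"),
]


def correct_first_match(chunk, rules):
    """Stop after the first rule that matches (behavior described as current)."""
    for wrong, right in rules:
        if wrong in chunk:
            return chunk.replace(wrong, right)
    return chunk


def correct_all_matches(chunk, rules):
    """Keep going through the rule set so every matching rule is applied."""
    for wrong, right in rules:
        chunk = chunk.replace(wrong, right)
    return chunk


chunk = "ขออนุญาติจากสถานกงศุลใหญ่"
first_only = correct_first_match(chunk, RULES)  # only the first rule is applied
all_rules = correct_all_matches(chunk, RULES)   # both typos are fixed
print(first_only)
print(all_rules)
```

With the two rules in this order, `first_only` still contains the "อนุญาติ" typo, while `all_rules` has both corrections applied.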
Created attachment 191135
Patch to make Autocorrect match more than one rule
Gerrit commits to be reviewed:
- Add Thai AutoCorrect data
  https://gerrit.libreoffice.org/c/core/+/160159
- SvxAutoCorrDoc::ChgAutoCorrWord() implementations: correct multiple patterns
  https://gerrit.libreoffice.org/c/core/+/160160
Hello Jonathan,

I thought you might have better insight into this issue and could possibly review the submitted patch.
Theppitak Karoonboonyanan committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/76c96ca7c9a6e0d847ec5dc186c6e47ab6061f5f

tdf#158454 Add Thai Autocorrect Support, coding part

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Hello Theppitak Karoonboonyanan, I believe the outstanding work for this bug has been completed. If you agree, please mark this bug fixed. Thanks!
(In reply to Jonathan Clark from comment #5)
> I believe the outstanding work for this bug has been completed. If you
> agree, please mark this bug fixed.

Yes. Thanks for the reminder! I'm closing it.
There are certain words in the Marathi language where multiple autocorrect rules can be applied. The patch is functioning exceptionally well. Thank you for this outstanding contribution.

Version: 25.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 0fb23379e63071ec155cb6683c19212859e399b5
CPU threads: 1; OS: Windows 10 X86_64 (10.0 build 14393); UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded
(In reply to Shantanu from comment #7)
> There are certain words in the Marathi language where multiple autocorrect
> rules can be applied. The patch is functioning exceptionally well. Thank you
> for this outstanding contribution.

Thank you for the feedback. I'm pleased to learn that.
For users of an older version of LibreOffice who wish to apply more than one autocorrect rule, simply use Tools - Autocorrect - Apply two or three times consecutively. Most users apply autocorrect only once, as they are likely unaware of this issue. However, while typing, only a single rule will be applied because, as noted in the initial post, "The current implementation stops immediately when the first pattern is matched." I appreciate this important enhancement to the autocorrect functionality.
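The workaround of applying autocorrect several times can be modeled as repeating a first-match-only pass until the text stops changing. The following is a rough Python sketch of that idea only: `apply_once` and `apply_until_stable` are hypothetical names, the rules reuse the two examples from the initial post, and plain substring replacement stands in for the wildcard patterns. It does not reflect how LibreOffice is implemented.

```python
# Hypothetical model of "apply autocorrect repeatedly"; not LibreOffice code.
RULES = [
    ("กงศุล", "กงสุล"),
    ("อนุญาติ", "อนุญาต"),
]


def apply_once(chunk, rules):
    """One pass that stops at the first matching rule."""
    for wrong, right in rules:
        if wrong in chunk:
            return chunk.replace(wrong, right)
    return chunk


def apply_until_stable(chunk, rules, max_passes=10):
    """Repeat single-rule passes until a pass changes nothing.

    Returns the corrected chunk and the number of passes that changed it.
    max_passes is a safety limit in case some rules undo each other.
    """
    passes = 0
    while passes < max_passes:
        corrected = apply_once(chunk, rules)
        if corrected == chunk:
            break
        chunk = corrected
        passes += 1
    return chunk, passes


result, passes = apply_until_stable("ขออนุญาติจากสถานกงศุลใหญ่", RULES)
print(result, passes)
```

For the two-typo example, two passes change the text and a third finds nothing left to fix, which matches the advice to apply autocorrect two or three times.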