Description:
I'm currently testing Unicode compatibility, including different input methods, on Windows and Linux. There are two ways to represent the German 'ö': either as the single Unicode character U+00F6 (LATIN SMALL LETTER O WITH DIAERESIS) or as an 'o' followed by U+0308 (COMBINING DIAERESIS). Under Unicode normalization (e.g. NFC) these are considered equal, but LO treats them as different.

LO 5.3 added an option to the search and replace dialog to ignore diacritics generally. This can serve as a kind of workaround for search, but it doesn't help with replace, because it also matches a plain 'o'.

I also tested gedit and kate, and only gedit finds both matches in the "ööo" string. kate in KDE4 at least loads the text correctly, while KF5 loads it as "öoö" :-(

I just tested 4.1.6, but I guess the behaviour is inherited from OOo.

Steps to Reproduce:
Open the document and search for ö in 'ööo'.

Actual Results:
You get one or three matches, depending on the "ignore diacritics" setting.

Expected Results:
You should get two or three matches, depending on the "ignore diacritics" setting.

Reproducible: Always

User Profile Reset: No

Additional Info:
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:51.0) Gecko/20100101 Firefox/51.0
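To make the two representations from the description concrete, here is a minimal sketch using Python's standard `unicodedata` module. It is only an illustration of the Unicode behaviour, not of LO's code; the helper name `nfc_equal` is made up for this example.

```python
import unicodedata

# The two representations of 'ö' described in the report, written as
# escapes so the difference survives copy and paste.
precomposed = "\u00f6"   # LATIN SMALL LETTER O WITH DIAERESIS
decomposed = "o\u0308"   # 'o' followed by COMBINING DIAERESIS


def nfc_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain code point comparison treats the two forms as different.
print(precomposed == decomposed)
# After normalization they compare equal.
print(nfc_equal(precomposed, decomposed))
# The forms also differ in length: one code point versus two.
print(len(precomposed), len(decomposed))
```

A search that compares raw code points therefore finds only the form that was typed into the search box, which matches the single hit described under Actual Results.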
Created attachment 131044
A document with the test string ööo
I assume we would continue to use ICU for this.

References:
http://www.icu-project.org/userguide/normalization
http://userguide.icu-project.org/transforms/normalization
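As a rough sketch of the expected search and replace behaviour, the following normalizes both the text and the search term to NFC before matching. It uses Python's `unicodedata` in place of ICU, and the function names, the replacement string "oe", and the order of the characters in `sample` are assumptions made for illustration only.

```python
import unicodedata


def nfc(s: str) -> str:
    """Normalize a string to NFC (canonical composition)."""
    return unicodedata.normalize("NFC", s)


def count_matches(text: str, needle: str) -> int:
    """Hypothetical helper: count matches after normalizing both sides."""
    return nfc(text).count(nfc(needle))


def replace_all(text: str, needle: str, replacement: str) -> str:
    """Hypothetical helper: replace on the normalized text.

    Note that this returns normalized text, so it rewrites the document's
    representation as a side effect.
    """
    return nfc(text).replace(nfc(needle), replacement)


# Assumed test string: precomposed ö, decomposed ö, plain o.
sample = "\u00f6" + "o\u0308" + "o"

print(sample.count("\u00f6"))               # raw comparison: 1
print(count_matches(sample, "\u00f6"))      # normalized: 2
print(count_matches(sample, "o\u0308"))     # normalized: 2
print(replace_all(sample, "\u00f6", "oe"))  # oeoeo
```

Unlike the "ignore diacritics" option, this leaves the plain 'o' untouched on replace. A real implementation would presumably have to map match positions back to the original, non-normalized text instead of normalizing the whole document.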