Currently, for languages that do not put spaces between words (such as Thai and Khmer), end users cannot see where the ICU BreakIterator is breaking the words. This is a problem because the BreakIterator is not 100% accurate for Thai or Khmer, and without seeing the break points users cannot easily add Unicode joiners by hand to re-join words that have been split incorrectly.
I suggest that the zero-width spaces added by the ICU BreakIterator be made visible when a user has turned on View->Field Shadings.
This way users can see exactly what is happening and easily correct any problems with the automatic word-breaker.
Operating System: All
Version: 220.127.116.11 release
Seems to be a valid enhancement.
I'm not certain, but I don't think the ICU BreakIterator actually adds ZWSP to the text. Rather, it simply decides where to break the text when outputting, without actually inserting new characters into the text stream.
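To make that distinction concrete, here is a minimal Python sketch. It does not call ICU; the boundary offsets are written by hand and only stand in for what a break iterator might report, and the function name mark_boundaries and the "|" marker are hypothetical. The point is that the boundaries are just offsets into unchanged text, so any visible marker would have to be drawn on a display copy.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def mark_boundaries(text, boundaries, marker="|"):
    """Return a display copy of `text` with `marker` at each interior boundary.

    `boundaries` is a list of offsets like those a break iterator reports.
    Here they are supplied by hand (an assumption for illustration).
    Python indexes by code point, whereas ICU uses UTF-16 code units; the
    two agree for Thai and Khmer, which are in the Basic Multilingual Plane.
    """
    pieces = []
    prev = 0
    for offset in sorted(set(boundaries)):
        # Skip the start and end offsets; only interior breaks need a marker.
        if 0 < offset < len(text):
            pieces.append(text[prev:offset])
            pieces.append(marker)
            prev = offset
    pieces.append(text[prev:])
    return "".join(pieces)


# Two Thai words typed without any separator: "ภาษา" + "ไทย".
stored_text = "ภาษาไทย"
# Hand-written offsets standing in for break iterator output.
assumed_boundaries = [0, 4, 7]

display_text = mark_boundaries(stored_text, assumed_boundaries)
print(display_text)
# The stored text is untouched and still contains no ZWSP.
print(ZWSP in stored_text)
```

If this picture is right, the requested feature would amount to rendering a marker at each reported offset rather than revealing characters that already exist in the document.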
But this has gotten me thinking:
I'm unhappy with the ICU BreakIterator for Khmer because it creates chaos when non-Khmer (i.e. minority) languages are written in Khmer script. Before the ICU BreakIterator was enabled for Khmer in LO 3.6, text in minority languages that had been carefully typed with ZWSP between words line-broke perfectly.
It's understandable that Cambodians don't like typing ZWSP, but what if we inserted ZWSP automatically, using an interface similar to the predictive text input on an iPhone? If this were done, then line breaking, spell checking, word counts, etc. would be greatly simplified.
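As a rough illustration of why explicit ZWSP would simplify things, the Python sketch below splits text on ZWSP and ordinary whitespace to get words and a word count, with no dictionary involved. The function names split_words and word_count are hypothetical, and this is not how LibreOffice counts words; it only assumes the text already has a ZWSP between every pair of words.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def split_words(text):
    """Split on ordinary whitespace and on ZWSP, dropping empty pieces.

    str.split() with no argument does not treat ZWSP as whitespace,
    so each whitespace-delimited chunk is split on ZWSP explicitly.
    """
    words = []
    for chunk in text.split():
        words.extend(piece for piece in chunk.split(ZWSP) if piece)
    return words


def word_count(text):
    """Count words in text whose word boundaries are marked explicitly."""
    return len(split_words(text))


# Two Thai words separated by ZWSP, followed by a space and a Latin word.
sample = "ภาษา" + ZWSP + "ไทย" + " abc"
print(split_words(sample))
print(word_count(sample))
```

The same split points could serve as line-break opportunities and as the units handed to a spell checker, which is the simplification described above.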
Something like that might work, but for now I would ask that this feature go ahead as requested, since the years pass quickly without any resolution :)