We have a bunch of breakiterator rules that are used to find the right place to break a line or word etc.
They are all derived from originals bundled into icu; the "master" versions can be found via
(They no longer appear in the icu tarballs, but are in icu's svn)
At various stages these copies have been customized and are now horribly out of sync. It's unclear which diffs from the base versions are deliberate and which are now accidental :-(
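One possible first pass at sorting this out is to list the rule lines that exist in our copy but not in the upstream one, so each can be traced back to a commit. The following is only a minimal sketch, not existing LibreOffice code: the function name linesOnlyInLocal is made up for illustration, it assumes '#' starts a comment line in the rule files, and it compares whole lines only, so a rule that was merely reformatted would still be reported.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const auto b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return "";
    const auto e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Split rule text into trimmed lines, skipping blank lines and
// lines starting with '#' (assumed to be comments).
static std::vector<std::string> ruleLines(const std::string& text)
{
    std::vector<std::string> out;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line))
    {
        line = trim(line);
        if (!line.empty() && line[0] != '#')
            out.push_back(line);
    }
    return out;
}

// Hypothetical helper: return the rule lines of `local` that do not
// occur anywhere in `upstream`, in the order they appear in `local`.
std::vector<std::string> linesOnlyInLocal(const std::string& upstream,
                                          const std::string& local)
{
    const std::vector<std::string> up = ruleLines(upstream);
    const std::set<std::string> known(up.begin(), up.end());
    std::vector<std::string> out;
    for (const std::string& l : ruleLines(local))
        if (known.count(l) == 0)
            out.push_back(l);
    return out;
}
```

Each reported line is then a candidate customization that still needs to be classified by hand as deliberate or accidental.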
What's needed is a review of the issues referenced in the commits that customized our breakiterator rules, to see whether each one is still relevant or has been overtaken by changes in later unicode specifications. Ideally, regression tests should then be written for them (see i18npool/qa/cppunit/test_breakiterator.cxx). If any customizations are still relevant, apply them back on top of the latest versions from icu; otherwise simply drop our rules entirely and fall back directly to the built-in icu ones.
adding LibreOffice developer list as CC to unresolved EasyHacks for better visibility.
see e.g. http://nabble.documentfoundation.org/minutes-of-ESC-call-td4076214.html for details
Migrating Whiteboard tags to Keywords: (EasyHack SkillCpp DifficultyInteresting TopicCleanup)
JanI is default CC for Easy Hacks (Add Jan; remove LibreOffice Dev List from CC)