Bug 49885 - sync custom breakiterator rules with icu originals
Summary: sync custom breakiterator rules with icu originals
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): Master old -3.6
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
Keywords: difficultyInteresting, easyHack, skillCpp, topicCleanup
Reported: 2012-05-13 14:54 UTC by Caolán McNamara
Modified: 2019-05-25 01:34 UTC (History)

Description Caolán McNamara 2012-05-13 14:54:17 UTC
http://cgit.freedesktop.org/libreoffice/core/tree/i18npool/source/breakiterator/data/README

We have a bunch of breakiterator rules that are used to find the right place to break a line or word etc.

They are all derived from originals bundled into ICU. The "master" versions can be found via
svn checkout
http://source.icu-project.org/repos/icu/icu/trunk/source/data/brkitr
(They no longer appear in the ICU tarballs, but are in ICU's svn.)

At various stages these copies have been customized and are now horribly out of sync. It's unclear which diffs from the base versions are deliberate and which are now accidental :-(

What's needed is a review of the various issues referenced in the commits to our breakiterator rules that caused customizations, to see whether those are still relevant or have been overtaken by changes in later Unicode specifications. Ideally, regression tests should then be written for them (see i18npool/qa/cppunit/test_breakiterator.cxx). If any customizations are still relevant, apply those changes back on top of the latest versions from ICU; otherwise simply drop the custom rules entirely and fall back directly to the built-in ICU ones.
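
As a starting point for that review, a small script could list which rule lines differ between an ICU original and our customized copy, so each difference can be matched against the commit that introduced it. The following is only a rough sketch, not an existing tool in the tree: the function names are made up for illustration, and it assumes that full-line comments in the rule files start with '#' and that whitespace differences are not significant for a first pass.

```python
import difflib


def normalize_rules(text):
    """Return rule lines with blank lines and full-line '#' comments removed."""
    lines = []
    for raw in text.splitlines():
        line = " ".join(raw.split())  # collapse runs of whitespace
        if not line or line.startswith("#"):
            continue
        lines.append(line)
    return lines


def rule_differences(upstream_text, custom_text):
    """Return (removed, added) rule lines between upstream and custom copies."""
    upstream = normalize_rules(upstream_text)
    custom = normalize_rules(custom_text)
    removed, added = [], []
    for entry in difflib.ndiff(upstream, custom):
        if entry.startswith("- "):
            removed.append(entry[2:])  # present upstream, missing in our copy
        elif entry.startswith("+ "):
            added.append(entry[2:])  # present only in our copy
    return removed, added


if __name__ == "__main__":
    # Tiny made-up inputs; real use would read the two .txt rule files.
    upstream = "# comment\n$CR = [\\p{Line_Break=CR}];\n$LF   = [\\p{Line_Break=LF}];\n"
    custom = "$CR = [\\p{Line_Break=CR}];\n$LF = [\\p{Line_Break=LF}];\n$Extra = [x];\n"
    removed, added = rule_differences(upstream, custom)
    print("only upstream:", removed)
    print("only custom:", added)
```

Collapsing whitespace can hide differences inside quoted literals, so anything this reports as identical should still get a look in a plain diff before a customization is dropped.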
Comment 1 Björn Michaelsen 2013-10-04 18:47:15 UTC
Adding the LibreOffice developer list as CC to unresolved EasyHacks for better visibility.

See e.g. http://nabble.documentfoundation.org/minutes-of-ESC-call-td4076214.html for details.