Bug 43740 - Improper justification for hieroglyphics outside BMP
Summary: Improper justification for hieroglyphics outside BMP
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.4.4 release (earliest affected)
Hardware: x86 (IA32) Windows (All)
Importance: medium normal
Assignee: Mark Hung
URL:
Whiteboard: target:5.3.0
Keywords:
Depends on:
Blocks:
 
Reported: 2011-12-12 01:28 UTC by sanada
Modified: 2016-11-17 07:09 UTC

Attachments
ODT included surrogate pair. (16.18 KB, application/vnd.oasis.opendocument.text)
2011-12-12 01:28 UTC, sanada
explanation drawing (74.94 KB, image/png)
2012-01-22 17:04 UTC, sanada
Surrogate Pair test file (8.56 KB, application/vnd.oasis.opendocument.text)
2014-12-26 01:09 UTC, Shinji Enoki
Screenshot of test file (52.75 KB, image/png)
2014-12-26 01:29 UTC, Shinji Enoki

Description sanada 2011-12-12 01:28:49 UTC
Created attachment 54354
ODT included surrogate pair.

A character encoded as a surrogate pair is probably being counted as two characters.

I have attached an ODT file.
Comment 1 Florian Reisinger 2012-01-21 06:23:39 UTC
Maybe I am the only one who does not get it, but could you explain in a bit more detail what is wrong?
Comment 2 sanada 2012-01-22 17:04:10 UTC
Created attachment 56004
explanation drawing

Please refer to the explanation drawing.
Comment 3 sanada 2012-01-22 17:30:46 UTC
I will explain using Java as an example.

An ordinary character fits in the type 'char'. In contrast, a character encoded as a surrogate pair has to be handled as the type 'int' (a code point).

To count the characters in a string, use String.length() in the former case and String.codePointCount(beginIndex, endIndex) in the latter.

http://d.hatena.ne.jp/t_gaisho/20101112/p1

However, for a strict count in the latter case, it is more reliable to use java.text.BreakIterator.

http://d.hatena.ne.jp/t_gaisho/20101124/p1
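
The three ways of counting described above can be compared with a small Java sketch. The class name SurrogateCount and the helper countGraphemes are made up for this illustration, and the sample string uses U+13000 (an Egyptian hieroglyph outside the BMP) only as an assumed test character; it is not taken from the attached ODT file.

```java
import java.text.BreakIterator;

class SurrogateCount {
    // Counts user-perceived characters by stepping through the
    // boundaries reported by the character-instance BreakIterator.
    static int countGraphemes(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(s);
        int count = 0;
        while (it.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // 'A', U+13000 written as the surrogate pair D80C DC00, then 'B'.
        String s = "A\uD80C\uDC00B";

        // Number of UTF-16 code units: the surrogate pair counts twice.
        System.out.println(s.length());                        // 4
        // Number of Unicode code points: the pair counts once.
        System.out.println(s.codePointCount(0, s.length()));   // 3
        // Number of character boundaries according to BreakIterator.
        System.out.println(countGraphemes(s));                 // 3
    }
}
```

String.length() is the only one of the three that reports the surrogate pair as two characters, which matches the behaviour suspected in this report.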
Comment 4 sanada 2012-01-22 17:31:34 UTC
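Japanese version of the previous comment (translated):
-------------

I will explain using Java as an example.

When handling surrogate-pair characters, an ordinary character is of type char, whereas a surrogate-pair character has to be handled as type int.

To count the characters in a string, use String.length() for the former and String.codePointCount(beginIndex,endIndex) for the latter.

http://d.hatena.ne.jp/t_gaisho/20101112/p1


However, for the latter, using java.text.BreakIterator is the reliable way to get a strict count.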

http://d.hatena.ne.jp/t_gaisho/20101124/p1
Comment 5 sanada 2012-01-30 18:13:43 UTC
This is not fixed in LibO 3.5.0 RC2 either.
Comment 10 Urmas 2013-11-27 14:40:55 UTC
Confirmed with master on Windows.
Comment 11 Joel Madero 2014-11-06 17:32:58 UTC
Never confirmed by QA - moving to UNCONFIRMED.
Comment 12 Shinji Enoki 2014-12-26 01:01:04 UTC
I reproduced this.

LibO: 4.3.4 , 4.4.0 rc1
OS: Debian wheezy
Comment 13 Shinji Enoki 2014-12-26 01:09:45 UTC
Created attachment 111342 [details]
Surrogate Pair test file

Settings used in this file:
1. Menu Format - Paragraph
2. Click the "Alignment" tab
3. Select Justified - Justified
Comment 14 Shinji Enoki 2014-12-26 01:24:04 UTC
Steps to reproduce:

1. Install Japanese fonts
  Debian or Ubuntu
   $ sudo aptitude install fonts-ipaexfont-gothic
  Win or Mac
   Download and install the font file.
   http://ipafont.ipa.go.jp/index.html#en
2. Start LibreOffice
3. Open the file from attachment 111342

Result: each surrogate-pair character is counted as having the width of two characters.
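
To illustrate why a wrong character count would affect justified text, here is a simplified Java model. It is not LibreOffice's layout code; the class JustifyGapSketch, the method extraPerGap, and the widths 60.0 and 30.0 are all assumptions chosen for the example. It only shows that sharing the leftover line width by UTF-16 code units gives a different result from sharing it by code points.

```java
class JustifyGapSketch {
    // Extra space per gap when the leftover width (lineWidth - textWidth)
    // is shared equally among the charCount - 1 gaps between characters.
    static double extraPerGap(double lineWidth, double textWidth, int charCount) {
        int gaps = charCount - 1;
        return gaps > 0 ? (lineWidth - textWidth) / gaps : 0.0;
    }

    public static void main(String[] args) {
        // U+6F22, U+20BB7 (surrogate pair D842 DFB7), U+5B57:
        // three visible characters, four UTF-16 code units.
        String s = "\u6F22\uD842\uDFB7\u5B57";
        double lineWidth = 60.0;  // hypothetical line width
        double textWidth = 30.0;  // hypothetical natural text width

        int byUnits = s.length();                            // 4
        int byCodePoints = s.codePointCount(0, s.length());  // 3

        // Counting code units yields three gaps, one of them inside the pair.
        System.out.println(extraPerGap(lineWidth, textWidth, byUnits));       // 10.0
        // Counting code points yields the two gaps a reader would expect.
        System.out.println(extraPerGap(lineWidth, textWidth, byCodePoints));  // 15.0
    }
}
```

In this model the unit-based count reserves a slot inside the surrogate pair, so the pair takes up more room than its neighbours, which is consistent with the screenshot described in this report.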
Comment 15 Shinji Enoki 2014-12-26 01:29:49 UTC
Created attachment 111343 [details]
Screenshot of test file
Comment 16 Shinji Enoki 2014-12-26 01:35:41 UTC
Reference: 
http://www.unicode.org/versions/Unicode6.2.0/ch02.pdf
Comment 18 Commit Notification 2016-10-15 21:13:24 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=dcef76b34aa1dca8389b3c068dc3d82a11d2c382

tdf#43740 Count CJK characters to distribute spaces.

It will be available in 5.3.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 19 Commit Notification 2016-10-15 21:21:56 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=bd041161f3dc65a36245ce271007dce003529a9c

tdf#43740 Do not use UniscribeLayout for CJK Ideograph Variations.

It will be available in 5.3.0.

Comment 20 Commit Notification 2016-10-17 07:46:19 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6130ff73347b5e633babf9555ee1417462cc11ef

tdf#43740 Don't add space after ininvisible characters.

It will be available in 5.3.0.
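
As a rough illustration of the idea in this commit title (and not the patch itself), the following Java sketch counts only characters that would be drawn, skipping Unicode variation selectors, which are invisible and often appear in CJK ideographic variation sequences. The class VisibleCharCount and its methods are hypothetical names, and treating only the two variation selector ranges as invisible is an assumption made to keep the example short.

```java
class VisibleCharCount {
    // True for code points in the two Unicode variation selector blocks:
    // U+FE00..U+FE0F and U+E0100..U+E01EF.
    static boolean isVariationSelector(int cp) {
        return (cp >= 0xFE00 && cp <= 0xFE0F)
            || (cp >= 0xE0100 && cp <= 0xE01EF);
    }

    // Counts code points that are not variation selectors, walking the
    // string by code point so surrogate pairs are read as one unit.
    static int countVisible(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isVariationSelector(cp)) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        // U+845B, then variation selector U+E0100 (surrogate pair DB40 DD00),
        // then U+57CE: two visible characters.
        String s = "\u845B\uDB40\uDD00\u57CE";
        System.out.println(s.length());                       // 4
        System.out.println(s.codePointCount(0, s.length()));  // 3
        System.out.println(countVisible(s));                  // 2
    }
}
```

A justification routine that adds space after every code point would still insert a gap after the selector; counting only visible characters avoids that.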

Comment 21 Commit Notification 2016-11-17 07:09:47 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=53778372a269da7c51958a7e234df4d41027fb77

tdf#43740 SimpleWinLayout::LayoutText only advance position for actual glyphs.

It will be available in 5.3.0.
