Bug 81144 - Chinese full-width punctuation does not align properly
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.2.4.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:5.1.0 target:5.0.4
Keywords:
Depends on:
Blocks: CJK
Reported: 2014-07-10 04:55 UTC by astyh83
Modified: 2016-09-16 10:07 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
Attached is a sample document containing the bug. The same sample has been used for bug 80788 to demonstrate slowness in typing. (62.83 KB, application/vnd.oasis.opendocument.text)
2014-07-10 04:55 UTC, astyh83
Before.png (48.25 KB, image/png)
2014-08-05 17:25 UTC, Matthew Francis
RenderingBug.odt (14.64 KB, application/vnd.oasis.opendocument.text)
2014-08-05 17:27 UTC, Matthew Francis
RenderingBug.png (35.92 KB, image/png)
2014-08-05 17:28 UTC, Matthew Francis
RenderingBugMinimised.odt (10.46 KB, application/vnd.oasis.opendocument.text)
2014-08-05 17:28 UTC, Matthew Francis
RenderingBugMinimised.png (4.66 KB, image/png)
2014-08-05 17:29 UTC, Matthew Francis
Images showing rendering in 4.4 master and MS Word (2011, Mac) (2.40 MB, application/zip)
2014-08-16 10:33 UTC, Matthew Francis
Punctuation that crowded together. (19.99 KB, application/vnd.oasis.opendocument.text)
2015-10-21 14:23 UTC, Mark Hung
Comparison of font design, showing difference between Japanese and Chinese font for ideographic comma and fullstop (96.35 KB, image/png)
2015-10-23 15:56 UTC, Mark Hung

Description astyh83 2014-07-10 04:55:44 UTC
Created attachment 102517 [details]
Attached is a sample document containing the bug. The same sample has been used for bug 80788 to demonstrate slowness in typing.

When typing in Chinese, we normally use full-width punctuation, which is not followed by a space before the next character.

Writer curiously renders most of these correctly, but some tend to align more to the right, making the document look ugly. 

I have tried changing the font, which does alter certain punctuation positions, but did not help solve the problem. 

I believe this to be a bug specific to LibreOffice because the same situation doesn't happen in Word. 

In the attached sample document, some of the badly positioned punctuation marks are highlighted in red (the marking is not exhaustive). Compare them with the normal ones to see how the bug affects the display of full-width punctuation.
Comment 1 Kevin Suo 2014-07-10 05:17:04 UTC
Confirmed. This bug has existed for a long time; every Chinese user can observe it.
Comment 2 astyh83 2014-07-10 08:04:47 UTC
Is there a plan to debug it? This bug makes working in Chinese frustrating.
Comment 3 Kevin Suo 2014-07-10 08:18:12 UTC
(In reply to comment #2)
> Is there a plan to debug it? This bug makes working in Chinese frustrating.

Could you find a way (step by step) to reproduce this issue?
Comment 4 astyh83 2014-07-10 08:39:45 UTC
No. The bug seems to occur randomly. 

What I did when I got hit by the bug, though, was to open in Writer a .doc file that had been created in Word 2003.

Converting the .doc file to .odt format did not help.
Comment 5 Matthew Francis 2014-08-05 17:24:15 UTC
One part of what is wrong with this file appears to be the issue in bug 82018 - specifically the "CharacterCompressionType" property being set to "1" within settings.xml inside the document. This causes some of the characters in the document to be rendered on top of one another.
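
To check whether a given document carries this setting, settings.xml can be read straight out of the ODT archive. The following is a minimal Python sketch, not LibreOffice code: the helper name character_compression_type is made up for illustration, and it assumes the property is stored as a config:config-item element in the standard OpenDocument config namespace.

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespace used for entries in settings.xml.
CONFIG_NS = "urn:oasis:names:tc:opendocument:xmlns:config:1.0"


def character_compression_type(odt_file):
    """Return the CharacterCompressionType value from an ODT's settings.xml.

    odt_file may be a path or a binary file-like object.
    Returns the value as a string, or None if the item is absent.
    """
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("settings.xml"))

    # Scan every config-item and pick the one with the matching name.
    for item in root.iter(f"{{{CONFIG_NS}}}config-item"):
        if item.get(f"{{{CONFIG_NS}}}name") == "CharacterCompressionType":
            return (item.text or "").strip()
    return None
```

For example, character_compression_type("RenderingBug.odt") could be compared against the value from the originally attached document to confirm that the setting was really removed.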

However, when this is eliminated, there are still visible issues with the highlighted "。", so there's clearly a separate problem with punctuation here.

I will attach a series of files after this comment which illustrate the issue more minimally.

(1) An image showing buggy rendering of one paragraph in the original document. Note the overlapping characters at the end of the second line after the "。"
(2) An ODT of the same paragraph after the "CharacterCompressionType" issue is removed
(3) An image of the above (2). Note the misplaced "。"
(4) A stripped and minimised version of the above ODT
(5) An image of the above (4). Note that the last two characters overlap
(images rendered on OSX/LO 4.3.0.4)


Given that in the last version of the document, there is no page, paragraph or character formatting at all apart from the fact that a non-existent font has been set, perhaps this remaining issue is with the font fallback mechanism?
Setting the font of the text to one that exists appears to eliminate the rendering problem.
Comment 6 Matthew Francis 2014-08-05 17:25:55 UTC
Created attachment 104084 [details]
Before.png

(1) An image showing buggy rendering of one paragraph in the original document. Note the overlapping characters at the end of the second line after the "。"
Comment 7 Matthew Francis 2014-08-05 17:27:08 UTC
Created attachment 104085 [details]
RenderingBug.odt

(2) An ODT of the same paragraph after the "CharacterCompressionType" issue is removed
Comment 8 Matthew Francis 2014-08-05 17:28:12 UTC
Created attachment 104086 [details]
RenderingBug.png

(3) An image of the above (2). Note the misplaced "。"
Comment 9 Matthew Francis 2014-08-05 17:28:59 UTC
Created attachment 104087 [details]
RenderingBugMinimised.odt

(4) A stripped and minimised version of the above ODT
Comment 10 Matthew Francis 2014-08-05 17:29:40 UTC
Created attachment 104088 [details]
RenderingBugMinimised.png

(5) An image of the above (4). Note that the last two characters overlap
Comment 11 Matthew Francis 2014-08-16 10:33:32 UTC
Created attachment 104727 [details]
Images showing rendering in 4.4 master and MS Word (2011, Mac)

A couple of unrelated bugs have been fixed in 4.4 master, so the originally reported bug can now be seen more clearly. Please disregard the test documents and images I previously attached.

A workaround for the reported problem is to set "Options – Language Settings – Asian Layout – Character Spacing" to "No compression".


The images in the attached zip show some of the originally attached text rendered with and without character compression in 4.4 master in various fonts, and in MS Word (2011, Mac).

As I see it, the images show three issues remaining:
1) Only when "Character Spacing" is set to something other than "No compression", the compressed punctuation is not always properly aligned within its space allocation. This occurs in some fonts but not others
2) The compression applied seems much stronger than in MS Word. For instance, in the attached sample images, punctuation in Word is compressed from 26 screen pixels variously down to a minimum of 22 (~85% size), whereas in Writer the same punctuation is compressed from 30 pixels down to as few as 15 (50% size). Although I'm not an expert in Asian typography, this seems like too much.
3) A consistent fallback font isn't being selected for the missing font in the original document. The characters have a higgledy-piggledy look as though selected from several different fonts.
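
The percentages in point 2 follow directly from the pixel measurements quoted there. As a quick check, a small Python sketch (the function name compression_ratio is only illustrative):

```python
def compression_ratio(full_px, compressed_px):
    """Fraction of the original width that remains after compression."""
    return compressed_px / full_px


# Pixel widths as measured in the attached sample images.
word = compression_ratio(26, 22)    # MS Word 2011 (Mac)
writer = compression_ratio(30, 15)  # Writer 4.4 master

print(f"Word:   {word:.0%}")    # prints "Word:   85%"
print(f"Writer: {writer:.0%}")  # prints "Writer: 50%"
```

So in these samples Word removes about 15% of the width at most, while Writer removes up to half of it.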
Comment 12 Mark Hung 2015-10-21 14:23:21 UTC
Created attachment 119833 [details]
Punctuation that crowded together.

The space to the right of the ideographic comma (、, U+3001) and the ideographic full stop (。, U+3002) is removed when they are compressed. The width of the removed space is a fixed proportion of the font height. In the extreme case, they crowd into the following characters (see the uploaded example). I guess the algorithm was originally designed for Japanese.

Ideographic full stops in most Chinese fonts are centered. Removing space from one side makes them visually unbalanced. But how about Japanese fonts?
Comment 13 Mark Hung 2015-10-23 15:56:23 UTC
Created attachment 119911 [details]
Comparison of font design, showing difference between Japanese and Chinese font for ideographic comma and fullstop


The comparison explains why punctuation compression performs badly for some fonts but not others. The ideographic comma and full stop in Japanese fonts are aligned closer to the left, so removing the space on the right just brings the next character closer. In Chinese fonts, removing the space on the right leaves the remaining space unbalanced and sometimes crowds the symbol against the next character.
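
The effect can be illustrated with a toy model. The Python sketch below is not the Writer algorithm; the glyph metrics, the trim amount, and all names (EM, Glyph, gaps_after_right_trim) are assumptions chosen only to show why trimming the right side suits a left-aligned glyph but not a centered one.

```python
from dataclasses import dataclass

EM = 1000  # assumed full-width advance, in font units


@dataclass
class Glyph:
    name: str
    ink_left: int   # left edge of the visible mark within its cell
    ink_right: int  # right edge of the visible mark within its cell


def gaps_after_right_trim(glyph, trim):
    """Return (left_gap, right_gap) after removing `trim` units from the right.

    A negative right_gap means the mark overlaps the next character.
    """
    advance = EM - trim
    return glyph.ink_left, advance - glyph.ink_right


# Hypothetical metrics: a Japanese-style full stop sits near the left,
# a Chinese-style one sits in the middle of the cell.
japanese = Glyph("Japanese-style full stop", 100, 400)
chinese = Glyph("Chinese-style full stop", 350, 650)

for glyph in (japanese, chinese):
    left, right = gaps_after_right_trim(glyph, trim=500)
    print(f"{glyph.name}: left gap {left}, right gap {right}")
```

With these made-up numbers the left-aligned mark keeps equal gaps of 100 units on both sides, while the centered mark ends up with 350 units on the left and a right gap of -150, that is, it runs into the next character.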
Comment 14 Commit Notification 2015-11-03 20:03:12 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=281be263619a8e513a26e6a9165d1d77cf6524ea

tdf#81144 Chinese full-width punctuation does not align properly

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 Commit Notification 2015-11-10 11:50:48 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "libreoffice-5-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=f5ed3a29e995152b80bf1adc888094d735a0882c&h=libreoffice-5-0

tdf#81144 Chinese full-width punctuation does not align properly

It will be available in 5.0.4.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 16 Xisco Faulí 2016-09-15 22:22:44 UTC
Hello,
Is this bug fixed?
If so, could you please close it as RESOLVED FIXED?