U+3000 IDEOGRAPHIC SPACE, which is a wide space used in CJK text, does not show visibly as a non-printing character when View -> Non-printing Characters is enabled in Writer.
Please see the attached document, which contains various sorts of space (ensure that View -> Non-printing Characters is enabled).
Currently, U+0020 SPACE and U+00A0 NO-BREAK SPACE are rendered correctly, but various other Unicode space characters are not. While U+3000 IDEOGRAPHIC SPACE is almost certainly the most used of these, perhaps consideration should be given to making all of the space characters on this list visible:
U+2000 EN QUAD
U+2001 EM QUAD
U+2002 EN SPACE
U+2003 EM SPACE
U+2004 THREE-PER-EM SPACE
U+2005 FOUR-PER-EM SPACE
U+2006 SIX-PER-EM SPACE
U+2007 FIGURE SPACE
U+2008 PUNCTUATION SPACE
U+2009 THIN SPACE
U+200A HAIR SPACE
U+202F NARROW NO-BREAK SPACE
U+205F MEDIUM MATHEMATICAL SPACE
U+3000 IDEOGRAPHIC SPACE
U+200B ZERO WIDTH SPACE
U+FEFF ZERO WIDTH NO-BREAK SPACE
(Interestingly, U+200B ZERO WIDTH SPACE shows as a sort of visible space whether or not View -> Non-printing Characters is enabled. Perhaps the handling of this should be unified with other non-printing characters?)
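As a way of checking the list above outside Writer, here is a minimal Python sketch that prints each code point with its Unicode name and general category. The names SPACE_CODE_POINTS and describe_spaces are hypothetical, made up for this illustration; only the standard unicodedata module is used. With the Unicode data bundled in recent Python versions, the first fourteen entries should report category Zs (space separator) and the two zero-width entries Cf (format). That difference might be related to why U+200B is handled differently, though this is only a guess and says nothing about how Writer actually decides.

```python
import unicodedata

# Code points copied from the list in this report (hypothetical constant name).
SPACE_CODE_POINTS = [
    0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007,
    0x2008, 0x2009, 0x200A, 0x202F, 0x205F, 0x3000, 0x200B, 0xFEFF,
]


def describe_spaces(code_points=SPACE_CODE_POINTS):
    """Return (label, Unicode name, general category) for each code point."""
    rows = []
    for cp in code_points:
        ch = chr(cp)
        rows.append((f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch)))
    return rows


if __name__ == "__main__":
    for label, name, category in describe_spaces():
        print(f"{label}  {category}  {name}")
```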
Created attachment 104700
Sample document with spaces
Created attachment 104701
Document rendered without non-printing characters enabled
Created attachment 104702
Document rendered with non-printing characters enabled
(In reply to comment #0)
> U+3000 IDEOGRAPHIC SPACE, which is a wide space used in CJK text, does not
> show visibly as a non-printing character when View -> Non-printing
> Characters is enabled in Writer.
There is certainly no interpunct displayed over the Ideographic Space (U+3000) when non-printing characters are displayed. There are possibly cultural reasons for this, given that the Middle Dot (U+00B7), which is used to mark Space (U+0020) and No-break Space (U+00A0), is in the Latin-1 Supplement block and some Asian scripts use a centred dot as a full stop.
According to http://en.wikipedia.org/wiki/Interpunct these are the main Asian language preferences:
Chinese: "In Taiwan the Unicode code point U+2027, Hyphenation Point, is recommended by government as a fullwidth punctuation to separate the given name and the family name of non-Chinese." and "In Chinese, the middle dot is also fullwidth in printed matter, but the regular middle dot (·) is used in computer input, which is then rendered as fullwidth in Chinese-language fonts."
Japanese: "Interpuncts are often used to separate transcribed foreign words written in katakana. [...] the Japanese writing system usually does not use space or punctuation to separate words." and "U+30FB ・ katakana middle dot" and "U+FF65 ･ halfwidth katakana middle dot."
Korean: "Interpuncts are used in written Korean to denote a list of two or more words, more or less in the same way a slash (/) is used to juxtapose words in many other languages." and "The use of interpuncts has declined in years of digital typography and especially in place of slashes, but, in the strictest sense, a slash cannot replace a middle dot in Korean typography." and "U+318D ㆍ hangul letter araea (아래아) is used more than a middle dot when a interpunct is to be used in Korean typography."
In accordance with this I am setting the status to NEEDINFO as Asian language (l10n) experts are required to comment further on what would be considered acceptable practice.
> U+FEFF ZERO WIDTH NO-BREAK SPACE
Please note that the use of U+FEFF as ZWNBSP has been deprecated since 2002 (Unicode v3.2), and the Word Joiner (U+2060) is recommended in its place.
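To illustrate the recommended substitution, here is a small Python sketch that swaps U+FEFF for U+2060 in a string. The function name replace_interior_zwnbsp is hypothetical. Leaving a leading U+FEFF untouched is an assumption made for this example, on the basis that a U+FEFF at the very start of a text is usually a byte order mark rather than a ZWNBSP.

```python
ZWNBSP = "\ufeff"       # ZERO WIDTH NO-BREAK SPACE (also used as the byte order mark)
WORD_JOINER = "\u2060"  # WORD JOINER, the recommended replacement


def replace_interior_zwnbsp(text):
    """Replace U+FEFF with U+2060, keeping a leading U+FEFF (assumed to be a BOM)."""
    if text.startswith(ZWNBSP):
        return ZWNBSP + text[1:].replace(ZWNBSP, WORD_JOINER)
    return text.replace(ZWNBSP, WORD_JOINER)
```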
Thanks for the above comment.
Note that one mitigating factor for the other uses of the middle dot (·) in CJK text is that, as of current master (4.4), non-printing characters are displayed in blue rather than black, so there is some contrast by default.
For comparison, Word for Mac 2011 appears to use a rectangle the width of the ideographic space for this case. This might be a reasonable model to follow.
Dear Bug Submitter,
This bug has been in NEEDINFO status with no change for at least
6 months. Please provide the requested information as soon as
possible and mark the bug as UNCONFIRMED. Due to regular bug
tracker maintenance, if the bug is still in NEEDINFO status with
no change in 30 days the QA team will close the bug as INVALID
due to lack of needed information.
For more information about our NEEDINFO policy please read the
wiki located here:
If you have already provided the requested information, please
mark the bug as UNCONFIRMED so that the QA team knows that the
bug is ready to be confirmed.
Thank you for helping us make LibreOffice even better for everyone!
I think this has all the information it needs - passing to ux-advise.
Could the UX team please evaluate this? Thanks
-> Status: NEW
-> Severity: enhancement
-> Component: ux-advise
The purpose of showing non-printable characters is to manage the text, e.g. to distinguish between repeated carriage returns and paragraph spacing, to discriminate between spaces and tabs, or to identify multiple spaces.
However, if the formatting information is already shown directly by WYSIWYG means, it makes no sense to clutter the document. In the case of zero width non-joiners in Farsi, I understand the interaction as entering a character plus a ZWNJ, which leads to a different letter - but I may be wrong. And according to Owen's reply there might be some other reasons not to show special spaces. So why not have a configuration switch?
But we should confirm this with native speakers rather than UX. So I am adding Kevin Suo from the LO China Blog to the CC list.
(In reply to Heiko Tietze from comment #8)
Sorry, I do not have much of an idea on this issue. The only thing I can be sure of is that U+3000 (full-width space) is seldom used in Simplified Chinese. In contrast, we use the normal space (U+0020) a lot.
In my experience of Japanese documents, full width spaces are used with some regularity for formatting.
In translation (from Japanese), a frequent demand is to ensure that no full width characters remain in the target text - so being able to identify full width spaces visibly would be an advantage there.
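Until something like this is visible in Writer, a check of this kind can be approximated outside the application. The following Python sketch (find_fullwidth is a hypothetical name) reports characters whose East Asian Width property is F (Fullwidth), which includes U+3000 and the fullwidth Latin forms. It assumes that "full width" here means that property value; characters classed as W (Wide), such as kana and kanji, are deliberately not reported.

```python
import unicodedata


def find_fullwidth(text):
    """Return (index, code point label, name) for each East Asian Fullwidth character."""
    hits = []
    for index, ch in enumerate(text):
        # 'F' is the East Asian Width value for Fullwidth characters.
        if unicodedata.east_asian_width(ch) == "F":
            hits.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")))
    return hits
```

For example, calling find_fullwidth on a translated paragraph returns an empty list when no fullwidth characters remain.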
We're replacing our use of the 'ux-advise' component with a keyword:
Component -> LibreOffice
Add Keyword: needsUXEval