Bug 93033 - FORMATTING: Expanded spacing on zero-width characters
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Font-Rendering
Reported: 2015-07-30 16:19 UTC by j_mach_wust
Modified: 2023-05-10 15:16 UTC (History)

See Also:
Crash report or crash signature:


Attachments
test document with some zero-width characters (12.72 KB, application/vnd.oasis.opendocument.text)
2015-07-30 16:19 UTC, j_mach_wust
Test document display (350.56 KB, image/png)
2015-07-30 18:17 UTC, j_mach_wust
screenshot of document, print preview, and character preview (28.97 KB, image/png)
2015-07-31 20:33 UTC, Gordo
Better test document: Continuing text flow shows increased spacing after the affected words (13.52 KB, application/vnd.oasis.opendocument.text)
2016-11-26 11:55 UTC, j_mach_wust
Test document on Ubuntu Linux (LibreOfficeDev 5.3.0.0.beta1) (108.42 KB, image/png)
2016-11-26 12:05 UTC, j_mach_wust
Test document on MacOS (LibreOfficeDev 5.3.0.0.beta1) (99.09 KB, image/png)
2016-11-26 12:07 UTC, j_mach_wust
Test document on Windows (LibreOfficeDev 5.3.0.0.beta1) (67.11 KB, image/png)
2016-11-26 12:09 UTC, j_mach_wust
the paragraphs with "llflll"; first and second should look identical (12.69 KB, application/vnd.oasis.opendocument.text)
2017-12-17 09:37 UTC, j_mach_wust
Screenshot of the paragraphs with "llflll"; first and second do not look identical (730.00 KB, image/png)
2017-12-17 09:39 UTC, j_mach_wust
Document with LibreOffice 7.5.3 on macOS (616.17 KB, image/png)
2023-05-10 14:17 UTC, ⁨خالد حسني⁩

Description j_mach_wust 2015-07-30 16:19:08 UTC
Created attachment 117537 [details]
test document with some zero-width characters

Steps to reproduce:

1. Input text with zero-width characters.

2. Expand spacing (Format → Character… → Position → Spacing).

3. Observe how zero-width characters cause additional spacing which they really shouldn’t (they’re supposed to be zero-width).

I have tested this with today’s 4.4.5.2 and 5.0.0.4 builds on Windows (8.1), Mac (10.10.4) and Linux (Lubuntu 15.04). None of these get it right.

As a rule, there is some additional spacing with Zero-Width (Non-)Joiner and with zero-width accents, whereas zero-width spaces behave OK (except on 5.0.0.4 on Mac OS X). However, the details differ from OS to OS. On Windows, which does not do any ligatures, the additional space is not where the Zero-Width (Non-)Joiner logically should be (according to the input character sequence), but after the following character. On Mac and Linux, which do ligatures, and for zero-width accents on Windows, the additional space is where it logically should be.

A strange behavior can be observed in the character dialog on Windows and on Mac: The preview of the selected word shows flawless spacing (without additional space), whereas in the actual document, the zero-width characters cause additional spacing. Somehow, the preview of the dialog gets it right, but the document doesn’t. On Linux, the preview is just as wrong as the document.

Other free applications that get the spacing right (without additional spacing) include Firefox and LaTeX.

I am attaching a test document with some zero-width characters.
Comment 1 j_mach_wust 2015-07-30 18:17:17 UTC
Created attachment 117542 [details]
Test document display

Test document display on 4.4.5.2 and 5.0.0.4 builds on Windows (8.1), Mac (10.10.4) and Linux (Lubuntu 15.04).
Comment 2 Gordo 2015-07-31 20:33:48 UTC
Created attachment 117568 [details]
screenshot of document, print preview, and character preview

I can reproduce.

The screenshot shows, top to bottom, document, print preview, and character preview.

Windows Vista 64
Version: 4.4.5.2
Build ID: a22f674fd25a3b6f45bdebf25400ed2adff0ff99
Comment 3 QA Administrators 2016-09-20 10:21:35 UTC Comment hidden (obsolete)
Comment 4 j_mach_wust 2016-11-26 11:55:12 UTC
Created attachment 129026 [details]
Better test document: Continuing text flow shows increased spacing after the affected words

This is an improved test document. The test words are repeated. The continuing text flow shows how some words get the wrong increased spacing after the word.
Comment 5 j_mach_wust 2016-11-26 12:05:08 UTC
Created attachment 129027 [details]
Test document on Ubuntu Linux (LibreOfficeDev 5.3.0.0.beta1)

This is how the test document displays on Ubuntu Linux 16.10, using the latest
Beta LibreOfficeDev_5.3.0.0.beta1_Linux_x86-64. The font used in the examples is
Linux Libertine 5.3.0 (LinLibertineTTF_5.3.0_2012_07_02.tgz).

The characters fall into three groups:
1. Zero-width non-joiner, Zero-width non-break space, and Combining breve below
   cause additional space at the character's position.
2. Zero-width joiner and Combining acute accent cause additional space AFTER the
   word.
3. Zero-width space and Word joiner display just fine.

I guess the difference between groups 1 and 2 is that the characters in group 2
trigger a built-in smartfont behaviour in the font, while the characters in
group 1 don't.
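
For reference, the seven characters named in the groups above can be listed with their code points. This is a minimal Python sketch using only the standard `unicodedata` module. The group numbers simply restate the observations in this comment, and the mapping of each name to a code point (for example U+032E for the combining breve below) is an assumption, since the comment gives names only.

```python
import unicodedata

# Group number (as observed in this comment) -> characters in that group.
# The code points are assumed from the character names; the comment itself
# gives names only.
GROUPS = {
    1: ["\u200c", "\ufeff", "\u032e"],  # extra space at the character's position
    2: ["\u200d", "\u0301"],            # extra space after the word
    3: ["\u200b", "\u2060"],            # displays fine
}

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (general category)' for one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"

for group, chars in GROUPS.items():
    for ch in chars:
        print(group, describe(ch))
```

The general category in the output (Cf for the format characters, Mn for the combining marks) does not line up with the three groups, which fits the guess that the difference comes from font behaviour rather than from the character class.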
Comment 6 j_mach_wust 2016-11-26 12:07:25 UTC
Created attachment 129028 [details]
Test document on MacOS (LibreOfficeDev 5.3.0.0.beta1)

This is how the test document displays on MacOS 10.12.1, using the latest
Beta LibreOfficeDev_5.3.0.0.beta1_MacOS_x86-64. The font used in the examples is
Linux Libertine 5.3.0 (LinLibertineTTF_5.3.0_2012_07_02.tgz).

The characters fall into three groups:
1. Zero-width non-joiner, Zero-width non-break space, and Combining breve below
   cause additional space at the character's position.
2. Zero-width joiner and Combining acute accent cause additional space AFTER the
   word.
3. Zero-width space and Word joiner display just fine.

I guess the difference between groups 1 and 2 is that the characters in group 2
trigger a built-in smartfont behaviour in the font, while the characters in
group 1 don't.
Comment 7 j_mach_wust 2016-11-26 12:09:01 UTC
Created attachment 129029 [details]
Test document on Windows (LibreOfficeDev 5.3.0.0.beta1)

This is how the test document displays on Windows 10.0.1, using the latest
Beta LibreOfficeDev_5.3.0.0.beta1_Win_x64. The font used in the examples is
Linux Libertine 5.3.0 (LinLibertineTTF_5.3.0_2012_07_02.tgz).

The characters fall into three groups:
1. Zero-width non-joiner, Zero-width non-break space, and Combining breve below
   cause additional space at the character's position.
2. Zero-width joiner and Combining acute accent cause additional space AFTER the
   word.
3. Zero-width space and Word joiner display just fine.

I guess the difference between groups 1 and 2 is that the characters in group 2
trigger a built-in smartfont behaviour in the font, while the characters in
group 1 don't.
Comment 8 j_mach_wust 2016-11-26 12:14:48 UTC
I have retested for the bug.

The bug is still present on a currently supported version of LibreOffice (LibreOffice 5.2.2.3, MacOS 10.12.1).

I have also tested LibreOffice 3.3.0. The bug is also present with 3.3, so I am setting the version to "inherited from OOo".

The upcoming 5.3 release, while still suffering from the bug, brings an important change: Now the display is identically wrong across different operating systems. I have added screenshots from LibreOfficeDev 5.3.0.0.beta1 on different OS (MacOS 10.12.1, Ubuntu 16.10, Windows 10.0.1).
Comment 9 QA Administrators 2017-11-27 09:59:16 UTC Comment hidden (obsolete)
Comment 10 j_mach_wust 2017-12-17 09:35:53 UTC
I have retested for the bug.

The bug is still present on a currently supported version of LibreOffice (LibreOffice 5.4.3.2, MacOS 10.13.2 or Ubuntu 17.10).

I am adding another example file. It contains three increased spacing paragraphs with the nonsense word "llflll". The first two are supposed to look identical, but they do not:

1. The first paragraph has ligatures switched off by setting "liga=0" (using the font name "Linux Libertine:liga=0").

2. The second paragraph has ligatures switched off by using the zero-width non-joiner (between the "f" and the "l"). It is supposed to look exactly like the first paragraph, but it does not.

3. The third paragraph uses the default "fl" ligature.
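
The text content of the three paragraphs can be sketched as Python strings. This only illustrates the character sequences described above. The variable names are hypothetical, and the font setting ("Linux Libertine:liga=0") is part of the document formatting, not of the string.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Paragraphs 1 and 3 contain the same characters; they differ only in the
# font setting applied in the document (liga=0 versus the default ligature).
plain = "llflll"

# Paragraph 2: a ZWNJ between the "f" and the following "l" breaks the ligature.
with_zwnj = "llf" + ZWNJ + "lll"

# Removing the ZWNJ gives back the plain word, so both strings have the same
# visible letters and should be spaced identically.
assert with_zwnj.replace(ZWNJ, "") == plain

print([f"U+{ord(c):04X}" for c in with_zwnj])
```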
Comment 11 j_mach_wust 2017-12-17 09:37:57 UTC
Created attachment 138481 [details]
the paragraphs with "llflll"; first and second should look identical
Comment 12 j_mach_wust 2017-12-17 09:39:01 UTC
Created attachment 138482 [details]
Screenshot of the paragraphs with "llflll"; first and second do not look identical
Comment 13 QA Administrators 2018-12-18 03:42:44 UTC Comment hidden (obsolete)
Comment 14 j_mach_wust 2019-05-02 18:14:06 UTC Comment hidden (obsolete)
Comment 15 QA Administrators 2021-05-02 03:50:22 UTC Comment hidden (obsolete)
Comment 16 j_mach_wust 2021-05-07 06:58:05 UTC Comment hidden (obsolete)
Comment 17 QA Administrators 2023-05-08 03:18:11 UTC Comment hidden (obsolete)
Comment 18 ⁨خالد حسني⁩ 2023-05-10 14:17:16 UTC
Created attachment 187188 [details]
Document with LibreOffice 7.5.3 on macOS

Current status:
- Ligatures are disabled automatically when letter spacing is used (fixed in bug 66819).
- Accents are no longer displaced and don’t cause double spacing.
- Zero-width characters no longer cause extra spacing, except Zero width no break space (U+FEFF), I think this is the only remaining issue.

This is testing with:
Version: 7.5.3.2 (X86_64) / LibreOffice Community
Build ID: 9f56dff12ba03b9acd7730a5a481eea045e468f3
CPU threads: 6; OS: Mac OS X 13.3.1; UI render: default; VCL: osx
Locale: en-EG (en_EG.UTF-8); UI: en-US
Calc: threaded
Comment 19 ⁨خالد حسني⁩ 2023-05-10 14:49:03 UTC
(In reply to خالد حسني from comment #18)
> except Zero width no break space (U+FEFF), I think this is the only remaining issue.

Checking more, it seems that the Unicode Text Segmentation algorithm (https://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries) treats U+FEFF as a single grapheme cluster i.e. a user-perceived character, so when letter-spacing we count it as one character and space it.

This can be tested at https://util.unicode.org/UnicodeJsps/breaks.jsp by copying the word deprecated from the document and choosing User Character from the drop-down menu; two red lines will appear where U+FEFF is.

Given this and since the use of U+FEFF as zero width no break space is deprecated, I think there are no issues left here and this bug should be closed.
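
The reasoning can be illustrated with a heavily simplified Python sketch of grapheme cluster segmentation. It is not LibreOffice's code and not a full UAX #29 implementation: the function names are hypothetical, the break classes are approximated from general categories, and only rules GB4, GB5 and GB9 are modelled.

```python
import unicodedata

ZWNJ, ZWJ = "\u200c", "\u200d"

def gcb_class(ch: str) -> str:
    """Approximate Grapheme_Cluster_Break class (an assumption, not real UCD data)."""
    if ch == ZWJ:
        return "ZWJ"
    cat = unicodedata.category(ch)
    if ch == ZWNJ or cat in ("Mn", "Me"):
        return "Extend"
    if cat in ("Cc", "Cf", "Zl", "Zp"):
        return "Control"  # U+FEFF (category Cf) ends up here
    return "Other"

def clusters(text: str) -> list:
    """Split text into clusters using only rules GB4, GB5 and GB9."""
    out = []
    for ch in text:
        attach = (
            out
            and gcb_class(ch) in ("Extend", "ZWJ")   # GB9: no break before Extend/ZWJ
            and gcb_class(out[-1][-1]) != "Control"  # GB4: always break after Control
        )
        if attach:
            out[-1] += ch
        else:
            out.append(ch)  # GB5 and the default rule: start a new cluster
    return out

print(len(clusters("a\ufeffb")))  # 3: U+FEFF is a cluster of its own
print(len(clusters("f\u200cl")))  # 2: ZWNJ attaches to the preceding "f"
print(len(clusters("e\u0301")))   # 1: the accent attaches to its base
```

Under this model, a letter-spacing routine that adds space once per cluster would add space for U+FEFF but not for a zero-width non-joiner or a combining accent.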
Comment 20 BogdanB 2023-05-10 15:16:58 UTC
For the reporter:
Please feel free to reopen this bug if you can reproduce it with a newer version (LibreOffice 7.5 or LibreOffice 7.6).