Description:
When using the normal SPACE character (U+0020), justification works correctly. For example, when a line (other than the last line of the justified paragraph) ends with a normal SPACE character, that trailing SPACE is correctly discarded, all the other SPACE characters on the line are widened, and the line breaks.

However, when another whitespace character, such as EN SPACE (U+2002) or EM SPACE (U+2003), happens to be the last character on a justified line, it is not discarded and it leaves a “hole” at the end of the line.

It happens like this:

This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003).

When it should happen like this:

This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003). This is a sentence ending with a period followed by a EM SPACE (U+2003).

There is a list of different whitespace characters at https://en.wikipedia.org/wiki/Template:Whitespace_(Unicode).

I think those different whitespace characters deserve a special case of justification:

-- All spaces that aren't wider than the normal SPACE (U+0020) or should have a fixed size (such as FIGURE SPACE and PUNCTUATION SPACE) shouldn't be widened on a justified paragraph, which means that:
-- **Only** the EM SPACE and EN SPACE (and its equivalents EM QUAD and EN QUAD) should be widened **proportionally wider than normal spaces are widened** on justified paragraphs.

Steps to Reproduce:
1. Format a paragraph as justified.
2. Type several sentences ending in periods.
3. After every period, instead of a normal SPACE (U+0020) character, use another whitespace character, such as EM SPACE (U+2003) or EN SPACE (U+2002). One easy way to enter them is to type the Unicode code point number and press ALT+X right after it.

Actual Results:
If a whitespace character other than the normal SPACE (U+0020) is at the end of a justified line, it remains there as a “hole” in the text flow instead of being replaced by the line break. Also, LibreOffice doesn't widen those whitespace characters proportionally to their size in relation to the normal SPACE character.

Expected Results:
LibreOffice shouldn't allow the other whitespace characters to be “shown” as a hole at the end of a line. Also, LibreOffice should widen those whitespace characters proportionally to their size in relation to the normal SPACE character.

Reproducible: Always

User Profile Reset: No

Additional Info:
I attached an .ODT and a .PDF file as examples.
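To make the expected end-of-line behavior concrete, here is a minimal Python sketch. It is not LibreOffice code. The names `strip_trailing_space`, `justify`, `NON_BREAKING`, `ADVANCE` and `DEFAULT_ADVANCE` are hypothetical, the advance widths are made-up values in em units, and the set of non-breaking spaces is an illustrative assumption. As a simplification, the sketch spreads the leftover width over U+0020 only, so it covers the trailing-space part of the report and not the proportional widening of EM SPACE and EN SPACE.

```python
import unicodedata

# Assumption: these space separators are non-breaking, so they are never
# dropped at a line end. The set is illustrative, not exhaustive.
NON_BREAKING = {"\u00A0", "\u2007", "\u202F"}

# Hypothetical advance widths in em units; a real font supplies its own.
ADVANCE = {"\u0020": 0.25, "\u2002": 0.5, "\u2003": 1.0}
DEFAULT_ADVANCE = 0.5  # placeholder width for every other character


def strip_trailing_space(line: str) -> str:
    """Drop breakable space separators (category Zs) from the end of a line."""
    end = len(line)
    while (
        end > 0
        and unicodedata.category(line[end - 1]) == "Zs"
        and line[end - 1] not in NON_BREAKING
    ):
        end -= 1
    return line[:end]


def justify(line: str, target_width: float) -> tuple[str, float]:
    """Return the visible text and the extra width added to each U+0020."""
    visible = strip_trailing_space(line)
    natural = sum(ADVANCE.get(ch, DEFAULT_ADVANCE) for ch in visible)
    gaps = visible.count("\u0020")
    extra = (target_width - natural) / gaps if gaps else 0.0
    return visible, extra


# A line that ends with EM SPACE: the trailing space is removed before the
# leftover width is computed, so no "hole" remains at the end of the line.
visible, extra = justify("ab cd.\u2003", target_width=4.0)
print(repr(visible), extra)
```

With these assumed widths, a trailing EM SPACE is treated exactly like a trailing normal SPACE: it takes no part in the measured line width.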
Created attachment 157585 [details] Sample document showing the error
Created attachment 157587 [details] Sample PDF document showing the error
@Khaled, is this all edit engine, or do the HarfBuzz libs manage space redistribution during justification? Are we losing, or maybe generalizing, the width of the em, en, quad spaces?

Also, I assume strange things would happen in any case when working with a font that lacks coverage of the spaces in the Unicode 'General Punctuation' block, since the result is then subject to the vagaries of fallback handling.
(In reply to V Stuart Foote from comment #3)
> @Khaled, is this all edit engine, or do Harfbuzz libs manage space
> redistribution during justification? Are we losing, or maybe generalizing,
> the width of the em, en, quad spaces?
>
> And, assume strange things would happen in any case if working with a font
> that has missing coverage of spaces for the Unicode 'General Punctuation'
> block, so subject to vagaries of fallback handling.

I'm no developer, just a power user and system administrator who can do some advanced scripting in PowerShell, but I could see at https://harfbuzz.github.io/what-harfbuzz-doesnt-do.html that:

"HarfBuzz won't help you with line breaking, hyphenation, or justification. As mentioned above, HarfBuzz lays out the string along a single line of, notionally, infinite length. If you want to find out where the potential word, sentence and line break points are in your text, you could use the ICU library's break iterator functions."
Created attachment 157625 [details]
Rendering of the sample ODT document on PDF X

PDF X is a freeware file viewer that opens several formats besides PDF. When it is used to open the sample .ODT, it renders the justification correctly, even though it uses a draft font instead of the font used in the document.

PDF X can be installed on Windows 10 from https://www.microsoft.com/store/productId/9P3CP9G025RM
Created attachment 157626 [details]
Rendering of the sample ODT document on ONLYOFFICE Desktop Editors

ONLYOFFICE Desktop Editors is a FOSS application that can edit OpenDocument Format files and Microsoft Office files. When it is used to open the sample .ODT, it renders the justification correctly.

ONLYOFFICE Desktop Editors can be downloaded at https://www.onlyoffice.com/pt/download-desktop.aspx for Linux, Windows and macOS.
Confirmed with attachment 157585. Note that you need the Carlito font to be able to see the problem.

This is also seen with older versions (4.4, 3.3), so it has nothing to do with HarfBuzz.

Version: 7.0.0.0.alpha0+ (x64)
Build ID: 00db5933ded1884b2ac453552badae20fa943478
CPU threads: 4; OS: Windows 10.0 Build 18362; UI render: default; VCL: win;
Locale: fi-FI (fi_FI); UI-Language: en-US
Calc: threaded
(In reply to João Paulo from comment #0)
> There is a list of different whitespace characters at
> https://en.wikipedia.org/wiki/Template:Whitespace_(Unicode).
>
> I think those different whitespace characters deserve a special case of
> justification:
>
> -- All spaces that aren't wider than the normal SPACE (U+0020) or should
> have a fixed size (such as FIGURE SPACE and PUNCTUATION SPACE) shouldn't be
> widened on a justified paragraph, which means that:
> -- **Only** the EM SPACE and EN SPACE (and its equivalents EM QUAD and EN
> QUAD) should be widened **proportionally wider than normal spaces are
> widened** on justified paragraphs.
>

Sorry, I entered the wrong Wikipedia page address for the list of whitespace characters. The correct one is "https://en.wikipedia.org/wiki/Whitespace_character".

Also, I no longer think that EM SPACE, EN SPACE, EM QUAD and EN QUAD should be widened proportionally wider than normal spaces are widened on justified paragraphs. Nor do I think that they shouldn't. I'll leave that opinion to people with more typography expertise than me. (Unless you want to add this choice of behavior to style formatting -- but using two or three NORMAL SPACES together on a justified paragraph should do the trick of making certain gaps wider than normal spaces).
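For anyone who wants to check which characters are involved without relying on the Wikipedia page, the following Python sketch lists the code points that the Unicode database bundled with the interpreter classifies as space separators (general category Zs). The function name `space_separators` is hypothetical. Note that this category is narrower than the Wikipedia list, since it excludes control characters such as TAB and LINE FEED, and that the exact output depends on the Unicode version shipped with Python.

```python
import sys
import unicodedata


def space_separators() -> list[tuple[str, str]]:
    """Return (code point, name) pairs for every character in category Zs."""
    result = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Zs":
            # The fallback name is only a guard; Zs characters are named.
            result.append((f"U+{cp:04X}", unicodedata.name(ch, "<unnamed>")))
    return result


for code, name in space_separators():
    print(code, name)
```

The list includes SPACE, EN QUAD, EM QUAD, EN SPACE, EM SPACE, FIGURE SPACE and PUNCTUATION SPACE, which are the characters discussed in this report.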
Also in

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: a34dcd03254480927c403d904c0e754802d97b90
CPU threads: 4; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: ro-RO (ro_RO.UTF-8); UI: en-US
Calc: threaded