Bug 150286 - Wrong justification of Persian/Arabic text
Summary: Wrong justification of Persian/Arabic text
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 4.4.0.3 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Depends on:
Blocks: RTL-CTL Kashida-Justification
Reported: 2022-08-06 13:04 UTC by Hossein
Modified: 2023-05-24 08:16 UTC (History)
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Justified Persian/Arabic text (29.06 KB, application/vnd.oasis.opendocument.text)
2022-08-06 13:04 UTC, Hossein
Correct justified text with LO 4.4 (59.98 KB, application/pdf)
2022-08-06 13:06 UTC, Hossein
Incorrect justified text with LO 7.5 dev master (47.29 KB, application/pdf)
2022-08-06 13:08 UTC, Hossein
Vazirmatn font (12.44 MB, application/zip)
2022-08-06 13:14 UTC, Hossein
PDF Output after removing all footnotes (19.44 KB, application/pdf)
2022-08-31 16:58 UTC, Hossein
Correct output after forcing re-layout of the text (46.21 KB, application/pdf)
2022-09-03 18:09 UTC, Hossein

Description Hossein 2022-08-06 13:04:49 UTC
Created attachment 181639
Justified Persian/Arabic text

Description:

As described (in Persian) in the forum thread linked below, justified text worked correctly for the user up to LibreOffice 4.4, but in later versions the output is incorrect:

Problem with justified text in LibreOffice:
https://forum.ubuntu.ir/index.php?topic=155184.0

Steps to Reproduce:
1. Open the attachment

Actual Results:
Text is not justified on page 1

Expected Results:
Text should be justified on page 1


Reproducible: Always


User Profile Reset: No


Additional Info:

Version: 7.5.0.0.alpha0+ / LibreOffice Community
Build ID: 56145f237b63a35c142dbccd434fd780badcf489
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: x11
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 1 Hossein 2022-08-06 13:06:03 UTC
Created attachment 181640
Correct justified text with LO 4.4
Comment 2 Hossein 2022-08-06 13:08:38 UTC
Created attachment 181641
Incorrect justified text with LO 7.5 dev master
Comment 3 Hossein 2022-08-06 13:14:03 UTC
Created attachment 181642
Vazirmatn font

The latest version of "Vazirmatn" font, which is currently 33.003, can be downloaded from:

https://github.com/rastikerdar/vazirmatn/releases
Comment 4 Hossein 2022-08-06 13:15:28 UTC
This is a regression, bibisected to:

commit f0393d7ff69011a16b100541ef18e5090544e4a1
Author: Khaled Hosny <khaledhosny@eglug.org>
Date:   Mon May 6 16:54:53 2013 +0200

    [harfbuzz] Fix text width calculation, 3rd try
    
    It turns out storing the width in the layout is not so good idea,
    because in some mysterious cases when font fallback is involved we call
    GetTextWidth() without calling LayoutText() first, and we return the
    width of the previous text run.
    
    It seems all I needed is to pass down the X offset with the glyph item,
    and take it into account when calculating the width.
Comment 5 ⁨خالد حسني⁩ 2022-08-11 17:52:54 UTC
I’m skeptical that the mentioned commit is the source of this. The formatting seems to be unstable: changing the font fixes the issue, but it shows up again after closing and reopening the file.

It seems to be related to the tab stops and/or the footnote numbers, which are handled at a higher level than the code this commit touches.
Comment 6 ⁨خالد حسني⁩ 2022-08-11 18:30:59 UTC
This seems to be similar to (if not a duplicate of) bug 138199.
Comment 7 Hossein 2022-08-31 16:56:54 UTC
(In reply to خالد حسني from comment #5)
> I’m skeptical that the mentioned commit is the source of this. The
> formatting seems to be unstable: changing the font fixes the issue, but it
> shows up again after closing and reopening the file.
> 
> It seems to be related to the tab stops and/or the footnote numbers, which
> are handled at a higher level than the code this commit touches.
You're right. As with the other problem with justified Arabic text (bug 146713), the bibisect result was off by one: the responsible commit is the one that set HarfBuzz as the default text rendering engine.
Comment 8 Hossein 2022-08-31 16:58:02 UTC
Created attachment 182120
PDF Output after removing all footnotes

As the output shows, once all the footnotes are removed, the text runs past the margin.
Comment 9 ⁨خالد حسني⁩ 2022-09-01 16:53:10 UTC
This happens even with no justification applied at all.
Comment 10 ⁨خالد حسني⁩ 2022-09-02 11:28:16 UTC
I spent some time debugging this, but got nowhere. I have no idea what is going on here or even where to start.
Comment 11 Hossein 2022-09-03 18:06:36 UTC
(In reply to خالد حسني from comment #10)
> I spent some time debugging this, but got nowhere. I have no idea what is
> going on here or even where to start.
We can call this a glitch, because when a re-layout of the text (or even just the paragraph) is requested, the problem goes away.

For example, set the Persian language for all text, change the numeral style, or use any other means to force LibreOffice to re-lay out the text. The problem then goes away, even in the output. But after saving and reloading, the problem is back.

Something is creating invalid data in the text data structures. But how can we dump that data and compare it with the state in which the problem is temporarily fixed? Even pressing Ctrl+L and then Ctrl+J causes the paragraph to be displayed correctly.
Comment 12 Hossein 2022-09-03 18:09:30 UTC
Created attachment 182191
Correct output after forcing re-layout of the text

I just set the Persian language for the whole text and then created the output. Please note that setting the language seems to be unrelated to the bug itself, because any other means of forcing LibreOffice to re-lay out the text also produces the correct output.
Comment 13 ⁨خالد حسني⁩ 2022-09-03 21:08:07 UTC
(In reply to Hossein from comment #12)
> Created attachment 182191
> Correct output after forcing re-layout of the text
> 
> I just set the Persian language for the whole text and then created the
> output. Please note that setting the language seems to be unrelated to the
> bug itself, because any other means of forcing LibreOffice to re-lay out
> the text also produces the correct output.

Did you try re-opening the file? For me, re-opening the file shows the issue again.
Comment 14 Hossein 2022-09-03 23:05:31 UTC
(In reply to خالد حسني from comment #13)
> (In reply to Hossein from comment #12)
> > Created attachment 182191
> > Correct output after forcing re-layout of the text
> > 
> > I just set the Persian language for the whole text and then created the
> > output. Please note that setting the language seems to be unrelated to the
> > bug itself, because any other means of forcing LibreOffice to re-lay out
> > the text also produces the correct output.
> 
> Did you try re-opening the file? For me, re-opening the file shows the
> issue again.

Yes, this is true. The changes are temporary. I think we need to dump and compare the data structures for one of the paragraphs. I wish this were possible using the UNO object inspector tool, but it is not (bug 142373).
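
A minimal sketch of the dump-and-compare idea, assuming the broken state (right after load) and the corrected state (after a forced re-layout) can each be written out as an XML dump by some means. How to obtain such a dump is not established in this thread, and the helper names `flatten` and `diff_dumps`, as well as the element and attribute names in the usage note, are hypothetical. The script only shows the comparison step: it flattens each dump to one line per element and prints a unified diff, so a differing width or position would stand out.

```python
import difflib
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Turn an XML dump into one line per element: path, sorted attributes, text."""
    lines = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        attrs = " ".join(f'{k}="{v}"' for k, v in sorted(elem.attrib.items()))
        text = (elem.text or "").strip()
        # Skip empty parts so elements without attributes or text stay compact.
        lines.append(" ".join(part for part in (here, attrs, text) if part))
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return lines


def diff_dumps(broken_xml, relayout_xml):
    """Return unified-diff lines between two dumps (empty list if identical)."""
    return list(
        difflib.unified_diff(
            flatten(broken_xml),
            flatten(relayout_xml),
            fromfile="broken",
            tofile="relayout",
            lineterm="",
        )
    )
```

For example, with two made-up dumps such as `<root><txt width="500"><line width="480"/></txt></root>` and the same document with `line width="500"`, `diff_dumps` reports only the `/root/txt/line` entry as changed. Sorting the attributes keeps the diff stable even if the two dumps serialize them in a different order.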