Bug 164140 - Justified Arabic / Persian text goes out of margin by typing some text
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 25.2.0.0 alpha0+
Hardware: All All
Importance: medium normal
Assignee: Jonathan Clark
URL:
Whiteboard: target:25.8.0 target:25.2.2
Keywords: bibisected
Depends on:
Blocks: Kashida-Justification, Tatweel Word-Line-Break
Reported: 2024-12-03 09:32 UTC by Hossein
Modified: 2025-03-06 14:27 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
Justified Persian text (19.22 KB, application/vnd.oasis.opendocument.text)
2024-12-03 09:32 UTC, Hossein
Example showing out of margin text (28.85 KB, image/png)
2024-12-03 10:43 UTC, Hossein
PDF output showing the problem (17.68 KB, application/pdf)
2024-12-03 11:20 UTC, Hossein
PDF showing problem in the output with LO 25.2 dev master (32.13 KB, application/pdf)
2024-12-05 17:03 UTC, Hossein
Example showing out of margin text (13.75 KB, application/vnd.oasis.opendocument.text)
2024-12-09 11:30 UTC, Hossein
Example showing out of margin text (PDF) (17.33 KB, application/pdf)
2024-12-09 11:43 UTC, Hossein
crash log + back trace (37.73 KB, text/plain)
2025-03-01 21:53 UTC, Hossein
ODT document showing out of margin text in page 2 (15.95 KB, application/vnd.oasis.opendocument.text)
2025-03-01 21:57 UTC, Hossein
PDF document showing out of margin text in page 2 (22.19 KB, application/pdf)
2025-03-01 22:00 UTC, Hossein

Description Hossein 2024-12-03 09:32:49 UTC
Created attachment 197911 [details]
Justified Persian text

Description:
Typing characters in justified Arabic/Persian text can cause the text to go out of the margin and ruin the justification.


Steps to Reproduce:
1. Open the ODT attachment
2. Type some text (for example, a space) in the middle of the paragraph.
(You may use undo/redo to see the effects)


Actual Results:
The text goes out of margin

Expected Results:
The text should be inside the margin, and remain justified.

Reproducible: Always


User Profile Reset: No

Additional Info:
Version: 25.2.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: 9a14a0fd8b4227b5d08b3154cddca46f82ec2a03
CPU threads: 12; OS: Linux 6.2; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 1 Hossein 2024-12-03 10:43:34 UTC
Created attachment 197912 [details]
Example showing out of margin text

This example shows out-of-margin text. It happened after typing a space and then pressing Ctrl+Z to undo.

Typing and then undoing/redoing sometimes leads to much more obvious examples of out-of-margin text.

The problem IS visible in the PDF output as well as on the screen, but it goes away after reloading the document.
Comment 2 Hossein 2024-12-03 11:20:01 UTC
Created attachment 197914 [details]
PDF output showing the problem

To clarify, I think the problem is visible in the output with:

LibreOffice 24.8.3.2 (X86_64) / LibreOffice Community

But with the latest LO 25.2 dev master, I don't see the problem in the PDF output, but only on the screen.

I can't be completely sure about the lack of problem in the output with the latest master, but this conclusion is from several tests that I have done. In the latest master, upon exporting to PDF, LibreOffice Writer lays out the text again, so that the problem is temporary and it goes away before generating the PDF output.
Comment 3 Aryeh 2024-12-04 23:11:00 UTC
When trying to replicate this, I noticed the characters at the margin moving slightly, but not as much as your example. So I would say I was unable to reproduce this.

Version: 24.8.3.2 (AARCH64) / LibreOffice Community
Build ID: 48a6bac9e7e268aeb4c3483fcf825c94556d9f92
CPU threads: 8; OS: macOS 14.5; UI render: Skia/Metal; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 4 Hossein 2024-12-05 17:03:16 UTC
Created attachment 197954 [details]
PDF showing problem in the output with LO 25.2 dev master

The attached file is a PDF output generated with the latest LO 25.2 dev master, which has the same issue.

The problem appears after some typing plus undo/redo, so it takes some effort and does not always happen after typing just a few characters. However, I have also seen it after typing alone, without any undo/redo.

Version: 25.8.0.0.alpha0+ (AARCH64) / LibreOffice Community
Build ID: 139bb786bb4fe5cf2554f6016095ff1588f3994f
CPU threads: 10; OS: macOS 15.1.1; UI render: Skia/Metal; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

Note that the problem goes away upon save and reload.
Comment 5 Hossein 2024-12-09 11:30:45 UTC
Created attachment 198019 [details]
Example showing out of margin text

With this example, reproducing the problem is much easier.

Steps to reproduce:
1. Open this attachment
2. Go to the start of the document (right side, top, as the paragraph is RTL).
3. Type "aaaa " (four "a" characters and a space)

Actual Result:
Text on second line goes out of margin.

Expected Results:
Text should not go out of margin.

Reproducible with the latest LO 25.8 dev master:

Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 8fdef548702ef240980b52e4076af36122534fed
CPU threads: 12; OS: Linux 6.2; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

Not reproducible with LO 24.8:
 
Version: 24.8.3.2 (X86_64) / LibreOffice Community
Build ID: 48a6bac9e7e268aeb4c3483fcf825c94556d9f92
CPU threads: 12; OS: Linux 6.2; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

Therefore, this is a regression.
Comment 6 Hossein 2024-12-09 11:43:48 UTC
Created attachment 198020 [details]
Example showing out of margin text (PDF)

On Linux, when exporting to PDF, the text is laid out again and the PDF output is fine. But on macOS, it is not.

Please note that on macOS, you have to undo in addition to typing "aaaa " to see the out-of-margin text.

Version: 25.8.0.0.alpha0+ (AARCH64) / LibreOffice Community
Build ID: ec0a49ecc7ea8449d90c1e69857d62728af19829
CPU threads: 10; OS: macOS 15.1.1; UI render: Skia/Metal; VCL: osx
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 7 Jonathan Clark 2025-02-19 13:40:52 UTC
Confirmed. Following the instructions in comment 5, I was able to reproduce the second-line overflow. 

Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: ce4e8a5686933511070722db2215784aceef98ce
CPU threads: 32; OS: Linux 6.11; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 8 Eyal Rozenberg 2025-02-19 21:28:58 UTC
Ouch... could this be a regression?
Comment 9 Saburo 2025-02-20 08:49:17 UTC
bibisected with linux-64-25.2
commit 41671d732ad933175779b61159b56824ff77b2fe
author	Jonathan Clark <jonathan@libreoffice.org>

tdf#163105 sw: Add some whitespace expansion to kashida justification

This change updates Writer kashida justification to include some
whitespace expansion, mirroring the behavior of Edit Engine and other
word processor programs.

Each kashida and space character are given 1 unit each of extra space.

Change-Id: I8c9031a0d51844e532b9d1f7e3619c2c9ba23f6d
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/173884
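
For readers unfamiliar with the idea, here is a rough sketch of what "1 unit each" could mean. It assumes the leftover line width is simply divided equally among all kashida and space positions; the function name and parameters are hypothetical and this is not the Writer implementation.

```python
def distribute_extra_width(extra_width, kashida_count, space_count):
    """Split leftover line width equally across kashida and space positions.

    Assumption for this sketch: every kashida position and every space
    receives the same share ("1 unit each") of the extra width.
    Returns (width given to kashida, width given to spaces).
    """
    units = kashida_count + space_count
    if units == 0:
        # Nothing can be stretched on this line.
        return 0.0, 0.0
    per_unit = extra_width / units
    return per_unit * kashida_count, per_unit * space_count


kashida_width, space_width = distribute_extra_width(120, 4, 8)
print(kashida_width, space_width)  # 40.0 80.0
```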
Comment 10 Jonathan Clark 2025-02-20 17:21:23 UTC
(In reply to Eyal Rozenberg from comment #8)
> Ouch... could this be a regression?

It's complicated.

During layout, Writer marks certain lines and positions as not permitting kashida justification. These exclusions can happen for several reasons, such as the line not having enough extra space to permit even a single extra tatweel glyph. These positions and lines are stored as indices and ranges inside an SwScriptInfo instance, and consulted during layout and rendering.

The root cause for this bug is Writer failing to correctly update this data during editing. The referenced commit didn't cause this problem, but it does make the problem more obvious by making it more likely that lines will be marked as not permitting kashida justification.
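
As a rough illustration of the bookkeeping described above (not the actual SwScriptInfo code), the no-kashida lines can be pictured as a list of half-open index ranges that rendering consults. The class and method names are hypothetical, and matching a line by its exact range is an assumption made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class NoKashidaRegistry:
    """Hypothetical stand-in for the per-paragraph no-kashida data."""

    # Half-open (start, end) character ranges, one per excluded line.
    lines: list = field(default_factory=list)

    def mark(self, start: int, end: int) -> None:
        """Record that the line [start, end) must not receive kashida."""
        if (start, end) not in self.lines:
            self.lines.append((start, end))
            self.lines.sort()

    def allows_kashida(self, start: int, end: int) -> bool:
        """Rendering would ask this before inserting tatweel glyphs."""
        return (start, end) not in self.lines


registry = NoKashidaRegistry()
registry.mark(88, 180)
print(registry.allows_kashida(88, 180))  # False: the line is excluded
print(registry.allows_kashida(89, 181))  # True: same line after a one-character shift
```

The second lookup shows why stale indices matter in this model: once the text shifts by one character, the stored entry no longer matches the line it was meant to protect.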

Here's how that works:

To illustrate better, I made a 4-line version of the reproducer by copying and pasting the middle line at the end of the last line.

1.) During initial layout, Writer marks the lines (0, 88),(88,180),(180,269) as no-kashida lines.

2.) Type "a" at the start of the first line. The first line is now an extra character long. Writer clears this new line "(0,89)" as no-kashida, confirms there still isn't enough room to do kashida justification, and then sets line (0,89) as no-kashida. Now it thinks the no-kashida lines are (0,89) and (180,269). The second line has been removed from the list, and the third line is... not a line. The indices are off by 1.

3.) Type 3 more "a"s, per the instructions. After the final "a", a word is wrapped to the next line. This makes Writer update all of the kashida data correctly. Now the no-kashida lines are (0,88),(88,176),(176,269), as expected.

4.) Type a space. Writer wipes out and re-adds line "(0,89)" again. After this, the no-kashida lines are (0,89) and (176,269). The second line is missing, and the entry for the third line is again off-by-one. The second line overflows the margin because Writer tries to draw it with tatweel glyphs, despite not having enough room for them.

Any edit that doesn't reflow the paragraph will corrupt these indices.
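
The corruption in steps 3 and 4 can be reproduced with a small simulation. This is only a sketch: `clear_and_readd` imitates the behavior described above, and `shift_after_insert` is one hypothetical way to keep the ranges valid, not necessarily what the eventual fix does.

```python
def clear_and_readd(ranges, line_start, line_end):
    """Imitate the faulty update: drop every entry overlapping the edited
    line, re-add that line, and leave all later entries untouched."""
    kept = [(s, e) for (s, e) in ranges if e <= line_start or s >= line_end]
    kept.append((line_start, line_end))
    return sorted(kept)


def shift_after_insert(ranges, pos, count):
    """Hypothetical repair: move every boundary strictly after the
    insertion point by the number of inserted characters."""
    def moved(index):
        return index + count if index > pos else index
    return [(moved(s), moved(e)) for (s, e) in ranges]


# State after step 3; step 4 types a space after the four "a" characters.
after_step_3 = [(0, 88), (88, 176), (176, 269)]

print(clear_and_readd(after_step_3, 0, 89))
# [(0, 89), (176, 269)]  second line lost, third entry stale

print(shift_after_insert(after_step_3, 4, 1))
# [(0, 89), (89, 177), (177, 270)]
```

In the first result the entry for the second line is gone, so nothing stops it from being drawn with tatweel glyphs it has no room for.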
Comment 11 Eyal Rozenberg 2025-02-20 22:11:46 UTC
(In reply to Jonathan Clark from comment #10)

And does this affect only Tatweel, or could it affect justification more generally?
Comment 12 Jonathan Clark 2025-02-20 23:00:42 UTC
(In reply to Eyal Rozenberg from comment #11)
> (In reply to Jonathan Clark from comment #10)
> 
> And does this affect only Tatweel, or could it affect justification more
> generally?

The above specifically applies to kashida justification. It's possible there are other bugs in the same class, but I haven't checked.
Comment 13 Eyal Rozenberg 2025-02-21 12:35:30 UTC
(In reply to Jonathan Clark from comment #12)
> The above specifically applies to kashida justification. It's possible there
> are other bugs in the same class, but I haven't checked.

I am wondering if there is a relation to bug 164470. The symptoms, especially with the modified reproducer, seem oddly similar.
Comment 14 Commit Notification 2025-02-27 14:38:51 UTC
Jonathan Clark committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/fac5695974d8a2197edba1e9f69f86621196cae1

tdf#164140 sw: Fix invalid string indices in kashida justification

It will be available in 25.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 Hossein 2025-03-01 21:53:56 UTC
Created attachment 199549 [details]
crash log + back trace

@Jonathan:
Thanks for providing this fix. It improves the situation a lot, and I no longer see text going out of the margin within a single page.

On the other hand:

1. LibreOffice crashes for me when I type a lot of text. The attachment contains the log and the stack trace that I get.

2. The same issue still happens in a paragraph long enough to span multiple pages. Please see the next attachment, which I will attach shortly.

3. It might be related that, if you create a big paragraph with a very long word (for example, by typing a few hundred "a" characters together), LibreOffice tends to hang for a second or so after each typed character, which is undesirable. This should also be visible in the next attachment.

I have tested with the latest LO 25.8 dev master from today:

Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: dec9f7d5b2d72e83f4feb81bc8845bca506bbe20
CPU threads: 12; OS: Linux 6.2; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: CL threaded
Comment 16 Hossein 2025-03-01 21:57:25 UTC
Created attachment 199550 [details]
ODT document showing out of margin text in page 2

Please look at the third line from the bottom to see that the text clearly goes out of the margin on the left.
Comment 17 Hossein 2025-03-01 22:00:24 UTC
Created attachment 199551 [details]
PDF document showing out of margin text in page 2

Please look at the second page of the PDF document to see that the text goes out of the margin.
Note that the text in the other lines is not exactly justified either, which is another (related) issue.
Comment 18 Hossein 2025-03-06 14:09:41 UTC
Thanks Jonathan,

As far as I have tested, the above crash and the out-of-margin text are now fixed by your subsequent patch:

tdf#165540 sw: Fix kashida insertion position data corruption ef31a26abc5ff4851be5360fd8ecd325132c5592
Comment 19 Commit Notification 2025-03-06 14:27:54 UTC
Jonathan Clark committed a patch related to this issue.
It has been pushed to "libreoffice-25-2":

https://git.libreoffice.org/core/commit/039ff3ff709e8a91d11317dd2170ecae075a0598

tdf#164140 sw: Fix invalid string indices in kashida justification

It will be available in 25.2.2.
