Bug 155707 - DOCX export/import and basic layout interoperability of w:jc lowKashida/mediumKashida/highKashida (It was: Implement justified/justify low/medium/high paragraph alignments)
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: Inherited From OOo (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:25.8.0
Keywords:
Duplicates: 155689
Depends on:
Blocks: DOCX-Paragraph Paragraph-Alignment Kashida-Justification, Tatweel
Reported: 2023-06-06 12:42 UTC by Hossein
Modified: 2025-05-14 14:20 UTC (History)

See Also:
Crash report or crash signature:


Attachments
DOCX different justify options (12.71 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-06-07 14:45 UTC, Hossein
PDF output using MS Word (74.17 KB, application/pdf)
2023-06-07 14:46 UTC, Hossein
PDF output using LibreOffice 7.6 dev master (45.77 KB, application/pdf)
2023-06-07 14:46 UTC, Hossein
PDF output using LO 25.8 (37.82 KB, application/pdf)
2025-05-13 15:17 UTC, László Németh

Description Hossein 2023-06-06 12:42:01 UTC
Description:
In Microsoft Word, there are 4 options for justified paragraph alignments:
* justified
* justify low
* justify medium
* justify high

In LibreOffice, there is only one, which seems to be similar to justify low. In the above options, no kashida is used for the first option.

Each of the above options is used for a specific purpose, so having all of them is useful for creating professional-looking documents.
Comment 1 Khaled Hosny 2023-06-06 13:59:31 UTC
Apart from the code side, this would need support from ODF format as well.
Comment 2 Telesto 2023-06-07 02:21:20 UTC
Adding Regina 

(In reply to ⁨خالد حسني⁩ from comment #1)
> Apart from the code side, this would need support from ODF format as well.
Comment 3 Regina Henschel 2023-06-07 13:41:39 UTC
Please provide a DOCX file with such text.
Comment 4 Hossein 2023-06-07 14:45:51 UTC
Created attachment 187767 [details]
DOCX different justify options

The attachment is a DOCX file containing RTL and LTR paragraphs with different justify options.
Comment 5 Hossein 2023-06-07 14:46:22 UTC
Created attachment 187768 [details]
PDF output using MS Word
Comment 6 Hossein 2023-06-07 14:46:57 UTC
Created attachment 187769 [details]
PDF output using LibreOffice 7.6 dev master
Comment 7 Khaled Hosny 2023-06-07 15:01:26 UTC
From the attached documents, what LibreOffice implements right now is a mixture of “justified” and “justify low”, but we choose automatically depending on the font. It shouldn’t be hard to allow for explicit selection of the mode as well.

The other two modes use different line breaks, so it is an interoperability issue, but to be interoperable we also need to know what algorithm is used to insert Kashidas. From a quick glance, “justify medium” seems to insert one kashida in each word and then justify the text as usual, and “justify high” seems to insert three Kashidas and then justify, but this is only a guess.

What happens with fonts like Amiri and Noto Nastaliq, where inserting Kashida anywhere will sometimes break how the glyphs join?
Comment 8 Hossein 2023-10-19 08:32:17 UTC
(In reply to ⁨خالد حسني⁩ from comment #7)
> What happens with fonts like Amiri and Noto Nastaliq, where inserting Kashida
> anywhere will sometimes break how the glyphs join?
In MS Word, the low/medium/high justification modes break the glyph joining with some calligraphic fonts like Nastaliq, so you have to stick to "justified".
Comment 9 Commit Notification 2025-05-13 07:46:19 UTC
László Németh committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/7d4ad3fa74674fff06f52eec0abc8dc7691a89b4

tdf#155707 sw DOCX: fix Kashida justification import/export/layout

It will be available in 25.8.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 10 László Németh 2025-05-13 15:17:27 UTC
Created attachment 200782 [details]
PDF output using LO 25.8
Comment 11 Hossein 2025-05-13 15:42:15 UTC
*** Bug 155689 has been marked as a duplicate of this bug. ***
Comment 12 László Németh 2025-05-13 15:43:29 UTC
The original report can be broken into several issues. Because one of the main issues, the lost DOCX export/import interoperability, is now fixed (along with part of the layout interoperability), I am closing this issue and updating its description, which opens the way to focus on the remaining issues separately.

The solved and remaining issues:

1) DOCX export/import interoperability;

2) DOCX layout interoperability;

3) user interface/functionality;

4) user interface/functionality limited for interoperability;

5) character formatting/single word usage.

Notes

1) Commit https://git.libreoffice.org/core/+/7d4ad3fa74674fff06f52eec0abc8dc7691a89b4%5E%21 fixed 1) completely.

2) That commit also fixed 2) partially: kashidas now have different lengths, as the layout requires, but word spacing at mediumKashida and highKashida is still larger than in MS Word, so the layout is not completely the same. It is worth filing a new bug for this.

3) The word spacing patches of Bug 126154 also fixed 3): setting larger word spacing in the Adjustment pane of the paragraph settings also results in longer Kashidas in the paragraphs of a document created in Writer.

4) This means dialog options similar to those in MS Word for creating a DOCX file in Writer, making it more obvious how to export DOCX with low/medium/highKashida than it is at present (setting 133%, 200% or 300% custom word spacing, with equal minimum/desired/maximum values).

5) According to Wikipedia, Kashida is a word-level typographic tool, e.g. for an emphasized word or words. OpenType fonts may already support it, but it is worth simplifying its usage, if needed.
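
To make the import side of notes 1) and 4) concrete, here is a minimal Python sketch (not LibreOffice code, which is C++) that reads the w:jc value of a WordprocessingML paragraph and maps it to a word spacing percentage. The mapping 133/200/300 is taken from the percentages mentioned in this report; the function name kashida_word_spacing and the sample paragraph are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by DOCX document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumed mapping, based on the percentages quoted in this report.
KASHIDA_WORD_SPACING = {
    "lowKashida": 133,
    "mediumKashida": 200,
    "highKashida": 300,
}


def kashida_word_spacing(paragraph_xml):
    """Return the word spacing (percent) implied by w:jc, or None."""
    paragraph = ET.fromstring(paragraph_xml)
    jc = paragraph.find(f"{{{W_NS}}}pPr/{{{W_NS}}}jc")
    if jc is None:
        return None
    # Non-kashida values such as "both" or "center" yield None.
    return KASHIDA_WORD_SPACING.get(jc.get(f"{{{W_NS}}}val"))


SAMPLE = (
    f'<w:p xmlns:w="{W_NS}">'
    '<w:pPr><w:jc w:val="mediumKashida"/></w:pPr>'
    "<w:r><w:t>text</w:t></w:r>"
    "</w:p>"
)

print(kashida_word_spacing(SAMPLE))  # 200
```

A paragraph with w:jc set to "both" (plain justified) returns None in this sketch, i.e. no special word spacing would be applied.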
Comment 13 László Németh 2025-05-13 15:52:20 UTC
(In reply to Hossein from comment #11)
> *** Bug 155689 has been marked as a duplicate of this bug. ***

@Hossein: Ah, I hadn't noticed that there was a separate report about only the DOCX interoperability. Thanks for closing it!

By the way, it seems to me that DOCX lowKashida/mediumKashida/highKashida justification is a poor man's version of OpenType Kashida formatting, because the last line of the paragraph doesn't use different kashidas in Word. Because the recent word spacing feature of Writer happens to do the same, this is fine for interoperability (the desired word spacing is normally 100%, because most fonts contain spaces of the desired width).
Comment 14 Hossein 2025-05-13 20:09:44 UTC
Thanks László for putting time into fixing this issue!

For the sub-issues you have listed:

1. DOCX export/import interoperability: I haven't tested export yet, but import looks fine. As the code implies, the export compares against the exact values 133, 200 and 300 when deciding, as in rAdjust.GetPropWordSpacingMaximum() == 133. Do you think a fuzzy comparison could be used instead?

2. DOCX layout interoperability: Layout looks fine for Arabic/Persian text (page 2 of the attachment), but as you can see on page 1 of the attachment, there is some difference between the rendering of Latin text in Word and LibreOffice. At least with justify medium/high, the paragraph does not fill the whole area in Word, but in LibreOffice it does after your new patch.

This web page discusses justify medium/high in Word:
https://addbalance.com/usersguide/justification.htm
Please look into this example:
https://addbalance.com/usersguide/images/justif2.gif

The author of the above page says:
"I do not pretend to know the rationale for Justify Medium and Justify High. Neither seems to be truly justified but give something closer to justification with a ragged right edge."

The above picture confirms what can be seen in attachment 187768 [details] (page 1): the paragraph text does not fill the whole area. Do you have any idea why this happens?

Other than that, as you have mentioned, I will file a bug report about matching the kashida and spacing between LibreOffice and Word.

3. user interface/functionality: I see some differences in the rendering results of LibreOffice and Word. Do you think it is possible to achieve somewhat similar results in LibreOffice by modifying spacing values?

4. user interface/functionality limited for interoperability: I agree that we need some similar UI options. They should be available both in the dialog and in the toolbars. There is a separate option (distributed) for CJK in Word, which is visible only when CJK is enabled, as it is only relevant for CJK. This is also discussed in the above link. I think that is filed under tdf#154881:

Bug 154881: I would like to have a thai distributed feature for handling Thai text
https://bugs.documentfoundation.org/show_bug.cgi?id=154881

5. Character formatting/single word usage: I should add what Khaled mentioned in comment 7. The Nastaliq family of fonts makes custom use of kashida, which means different glyphs/ligatures when an extended size is needed, so the output can be different. I need to test these cases. This is an example Nastaliq font:
https://cdn.irannastaliq.ir/2022/05/IranNastaliq-V2.zip
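
On the fuzzy comparison raised in point 1 above, the following Python sketch contrasts an exact match with a tolerance-based one. It is only an illustration of the idea, not the actual export code: the function names, the 10-point tolerance and the fallback to "both" (plain justified) are assumptions.

```python
# Word spacing maximum (percent) to w:jc value, as discussed in this report.
KASHIDA_BY_SPACING = {133: "lowKashida", 200: "mediumKashida", 300: "highKashida"}


def jc_exact(max_word_spacing):
    """Exact comparison: only 133, 200 and 300 select a kashida mode."""
    return KASHIDA_BY_SPACING.get(max_word_spacing, "both")


def jc_fuzzy(max_word_spacing, tolerance=10):
    """Pick the nearest known value if it lies within the tolerance."""
    nearest = min(KASHIDA_BY_SPACING, key=lambda value: abs(value - max_word_spacing))
    if abs(nearest - max_word_spacing) <= tolerance:
        return KASHIDA_BY_SPACING[nearest]
    return "both"


print(jc_exact(135))  # both
print(jc_fuzzy(135))  # lowKashida
```

With a fuzzy match, a document whose spacing was nudged to 135% would still round-trip as lowKashida, at the cost of silently changing the stored value on the next import.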
Comment 15 Hossein 2025-05-14 04:26:38 UTC
Looking into these files:

offapi/com/sun/star/style/ParagraphAdjust.idl
oovbaapi/ooo/vba/word/WdParagraphAlignment.idl

I should say that another step is also needed: adding support for these alignment options in the UNO API, BASIC and VBA compatibility.
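
As a rough illustration of the VBA side, the Python sketch below lists the justify-related WdParagraphAlignment constants with the numeric values given in Word's object model documentation, paired with the corresponding w:jc values. The enum and the WD_TO_JC table are illustrative only: they do not reproduce the contents of either .idl file, and the numbers are worth verifying against the Word documentation before use.

```python
from enum import IntEnum


class WdParagraphAlignment(IntEnum):
    """Justify-related Word alignment constants (subset, for illustration)."""

    wdAlignParagraphJustify = 3
    wdAlignParagraphJustifyMed = 5
    wdAlignParagraphJustifyHi = 7
    wdAlignParagraphJustifyLow = 8


# Assumed pairing with the OOXML w:jc attribute values.
WD_TO_JC = {
    WdParagraphAlignment.wdAlignParagraphJustify: "both",
    WdParagraphAlignment.wdAlignParagraphJustifyLow: "lowKashida",
    WdParagraphAlignment.wdAlignParagraphJustifyMed: "mediumKashida",
    WdParagraphAlignment.wdAlignParagraphJustifyHi: "highKashida",
}

print(WD_TO_JC[WdParagraphAlignment(5)])  # mediumKashida
```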
Comment 16 László Németh 2025-05-14 14:20:45 UTC
(In reply to Hossein from comment #14)
> Thanks László for putting time into fixing this issue!

Thanks for reporting the problem! I had seen it before, but your recent comment helped a lot in re-checking and solving it!

> 
> For the sub-issues you have listed:
> 
> 1. DOCX export/import interoperability: I haven't tested export yet, but
> import looks fine. As the code implies, exact value of 133, 200 and 300 is
> compared for deciding in the export, like
> rAdjust.GetPropWordSpacingMaximum() == 133. Do you think some fuzzy
> comparison can be used instead?

We need to test MS Word with fonts whose U+0640 ARABIC TATWEEL has different lengths to analyze the algorithm: absolute width, or width relative to the space or to U+0640. It's an interesting question, because DTP and recent Writer use word spacing relative to the space character, but the XSL/CSS standards prefer absolute width for word spacing (maybe allowing relative, too).

The value 133% came from the DTP default (as the upper limit for normal space), and 200 and 300 from 2 and 3 spaces (hoping that the space and kashida characters are equally long, and using Khaled's hint and a quick visual comparison of the result).
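
The open question above can be shown with a little arithmetic. The Python sketch below computes the word gap under two of the hypotheses (relative to the space width, relative to the U+0640 width). The function names and the advance widths are made-up values for illustration, not measurements of any real font.

```python
def gap_relative_to_space(space_width, percent):
    """Word gap when the percentage scales the font's space width."""
    return space_width * percent / 100


def gap_relative_to_tatweel(tatweel_width, percent):
    """Word gap when the percentage scales the U+0640 width instead."""
    return tatweel_width * percent / 100


# Hypothetical advance widths in font units: (space, tatweel).
FONTS = {"equal widths": (250, 250), "long tatweel": (250, 400)}

for name, (space_w, tatweel_w) in FONTS.items():
    for percent in (133, 200, 300):
        by_space = gap_relative_to_space(space_w, percent)
        by_tatweel = gap_relative_to_tatweel(tatweel_w, percent)
        print(name, percent, by_space, by_tatweel)
```

The two hypotheses give the same gap only when the space and the tatweel are equally wide, which is why testing Word with fonts that have different tatweel lengths would distinguish them.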


> 
> 2. DOCX layout interoperability: Layout looks fine for Arabic/Persian text
> (page 2 of the attachment), but as you may see on page 1 of the attachment,
> there is some difference between the rendering of Latin text in Word and
> LibreOffice. At least in justified medium/high, paragraph does not fill the
> whole area in Word. But in LibreOffice it does, after your new patch.
> 
> This is some web page that discusses justified medium/high in Word:
> https://addbalance.com/usersguide/justification.htm
> Please look into this example:
> https://addbalance.com/usersguide/images/justif2.gif
> 
> The author of the above page says:
> "I do not pretend to know the rationale for Justify Medium and Justify High.
> Neither seems to be truly justified but give something closer to
> justification with a ragged right edge."
> 
> The above picture clarifies the correctness of what can be seen in
> attachment 187768 [details] (page 1), the fact that paragraph text does not
> fill the whole area. Do you have any idea why this happens?

I guess this is a feature developed only for Arabic/Persian, because the standard mentions only this for lowKashida/mediumKashida/highKashida.


> 
> Other than that, as you have mentioned, I will file a bug report around
> matching the kashida and spacing between LibreOffice and Word.

Thanks in advance!

> 
> 3. user interface/functionality: I see some differences in the rendering
> results of LibreOffice and Word. Do you think it is possible to achieve
> somehow similar results in LibreOffice by modifying spacing values?

If we only need the same page breaks, i.e. paragraphs of the same length, it is likely enough to analyze the algorithm and set the spacing values (also grab-bagging, or extending the OOXML kashida setting if needed, i.e. if the MS Word kashida justification algorithm uses word spacing that is not space-relative).

For the same paragraph layout, it seems the ratio(?) of kashida and word spacing must be modified. I haven't touched the kashida code, because drawing kashidas is fully automatic now, i.e. my patch only sets earlier line breaks, resulting in bigger word spacing, nothing more. Longer spaces and kashidas are calculated automatically based on the position of the line break. MS Word uses the same narrow word spacing with the longer kashidas, but I'm not sure what is best for Arabic/Persian calligraphy. MS Word is not a desktop publishing tool, so it is worth fixing its approach, if needed.
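
The effect described above, where a larger minimum word spacing forces earlier line breaks, can be sketched with a simple greedy line breaker in Python. This is not Writer's layout algorithm; break_lines, its parameters and the widths are hypothetical and only show why paragraphs get longer as the spacing percentage grows.

```python
def break_lines(word_widths, line_width, space_width, min_spacing_percent=100):
    """Greedy line breaking with a minimum gap between adjacent words."""
    gap = space_width * min_spacing_percent / 100
    lines, current, used = [], [], 0.0
    for width in word_widths:
        needed = width if not current else used + gap + width
        if current and needed > line_width:
            # The word does not fit with the required gap: break earlier.
            lines.append(current)
            current, used = [width], width
        else:
            current.append(width)
            used = needed
    if current:
        lines.append(current)
    return lines


WORDS = [30] * 10  # ten words of equal width, in arbitrary units

print(len(break_lines(WORDS, 120, 10, 100)))  # 4
print(len(break_lines(WORDS, 120, 10, 300)))  # 5
```

In a real layout the space left over on each line would then be distributed between word gaps and kashidas, which is the part the comment describes as automatic.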

> 
> 4. user interface/functionality limited for interoperability: I agree that
> we need some similar UI options. It should be both in the dialog and also
> toolbars. There is a separate option (distributed) for CJK in Word, which
> is visible only when CJK is enabled, as it is only relevant for CJK.
> This is also discussed in the above link. I think that is filed under
> tdf#154881:
> 
> Bug 154881: I would like to have a thai distributed feature for handling
> Thai text
> https://bugs.documentfoundation.org/show_bug.cgi?id=154881

Jonathan's screenshot shows Thai distributed justification, so maybe it is enough to fix the missing import.

> 
> 5. Character formatting/single word usage: I should add what Khaled
> mentioned in comment 7. Nastaliq family of fonts, use custom use of kashida,
> which means different glyphs/ligatures in case of needing extended size.
> Then, the output can be different. I need to test these cases. This is an
> example Nastaliq font:
> https://cdn.irannastaliq.ir/2022/05/IranNastaliq-V2.zip

Thanks in advance for testing!