Bug 158055 - Order of multiple English words separated by Persian "،" separator in a Persian paragraph is not as expected!
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: unspecified (earliest affected)
Hardware: All All
Importance: medium trivial
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: RTL-CTL
Reported: 2023-11-03 19:03 UTC by Reza
Modified: 2023-11-05 12:36 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Three sample sentence describing different situations. (2.77 MB, application/vnd.oasis.opendocument.text)
2023-11-03 19:07 UTC, Reza
The PDF format of previous attachment. (37.45 KB, application/pdf)
2023-11-03 19:07 UTC, Reza

Description Reza 2023-11-03 19:03:53 UTC
Description:
Hi, 

I'm a native Persian speaker and an English-to-Persian translator.

As you know, Persian is an RTL language. When I write multiple English words in a Persian sentence and separate them with the Persian "،" separator, the order of the English words is not as expected.

If I use the Persian "؛" separator instead of "،", the result is as expected.

Someone suggested inserting a "Right-To-Left Mark" character after "،", and that solved the problem, but someone on Reddit said to report this issue here so it can be fixed.

thanks

Steps to Reproduce:
1. Open Writer on Linux/Debian.
2. Change keyboard language to Persian (fa-IR).
3. Write a Persian sentence.
4. Change keyboard language to English (en-US).
5. Write a single English word. 
6. Change keyboard language to fa-IR again. 
7. Write Persian "،" separator with (CTRL + 7). 
8. Change keyboard language to English (en-US).
9. Write another English word. This new word should appear to the left of the first word, but it appears to its right!


Actual Results:
The order of English words is from left to right. 

Expected Results:
The order of English words should be from right to left in a Persian sentence. 


Reproducible: Always


User Profile Reset: No

Additional Info:
I can send you a sample .odt file as an example, if you like.
Comment 1 Reza 2023-11-03 19:07:12 UTC
Created attachment 190639 [details]
Three sample sentence describing different situations.
Comment 2 Reza 2023-11-03 19:07:49 UTC
Created attachment 190640 [details]
The PDF format of previous attachment.
Comment 3 Tex2002ans 2023-11-03 22:49:13 UTC
Hey Reza,

Thanks for the bug report + sample documents.

1. Can you also post your exact info from:

- Help > About LibreOffice

2. And can you:

- Tell us which exact version of Debian you are using?

- - -

I also:

- Changed this bug from NEW -> UNCONFIRMED.
--- UNCONFIRMED is for a fresh, newly reported issue.
--- NEW means a 2nd person was able to reproduce your steps/problem, so we know it is an actual issue.
- Marked "earliest version" as 7.4.7.2, as you reported in your Reddit thread when you first found this problem.
- Marked this as a Right-to-Left issue.
--- This should get more eyes on it from the relevant testers/devs/users. :)

- - -

Note: This bug was originally reported on the LibreOffice subreddit:

- https://www.reddit.com/r/libreoffice/comments/17ibbpl/writing_multiple_english_words_separated_by/

There may be more discussion/info there.

Here is some of the relevant info:

1. Originally tested on:

> Version: 7.4.7.2 / LibreOffice Community
> Locale: fa-IR (en_US.UTF-8); UI: en-US

and also the 7.6.2 Flatpak.

(Both versions had this issue.)

2. Seems to happen in both ODT + DOCX.

3. This Unicode character seems to work correctly:

- ؛ = U+061B = ARABIC SEMICOLON

but this one DOES NOT:

- ، = U+060C = ARABIC COMMA
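
For anyone who wants to check the two characters locally, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `describe` is made up for this example; the values come from whatever Unicode database ships with your Python build.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, bidirectional class) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))


# The two separators discussed in this report.
for ch in ("\u061b", "\u060c"):
    print(*describe(ch))
```

The third column is the bidirectional class, which is the property the two separators differ in.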
Comment 4 Reza 2023-11-04 05:33:16 UTC
Thank you for the corrections and for adding the relevant info.

My LibreOffice info: 
Version: 7.6.2.1 (X86_64) / LibreOffice Community
Build ID: 56f7684011345957bbf33a7ee678afaf4d2ba333
CPU threads: 4; OS: Linux 6.1; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Flatpak
Calc: threaded


My OS info:
OS name: Debian GNU/Linux 12 (bookworm)
OS Type: 64-bit
GNOME version: 43.6
Windowing system: Wayland
Comment 5 M.Mahdi 2023-11-04 06:08:39 UTC
Hi. I can confirm this and can reproduce this.
Comment 6 ⁨خالد حسني⁩ 2023-11-04 09:40:06 UTC
This is how the Unicode Bidirectional Algorithm works: the Arabic comma and semicolon have different bidirectional classes, leading to different behaviour in situations like this.

See how the direction of such text is resolved by the algorithm:
https://util.unicode.org/UnicodeJsps/bidi.jsp?a=x%D8%8C+y%0D%0Ax%D8%9B+y&p=RTL
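
The effect can be approximated offline with a heavily simplified sketch of the neutral-resolution and reordering steps. This is not the full algorithm and not LibreOffice's code: it assumes the text contains only letters, spaces and separators (no digits, brackets, or explicit embedding controls), treats every non-strong character as neutral, and the function names are made up for this example.

```python
import unicodedata
from itertools import groupby


def coarse_kind(ch):
    """Collapse the bidirectional class to L (strong LTR), R (strong RTL) or N."""
    cls = unicodedata.bidirectional(ch)
    if cls == "L":
        return "L"
    if cls in ("R", "AL"):
        return "R"
    return "N"  # simplification: CS, WS, ON, ... all treated as neutral


def resolve_neutrals(kinds, base):
    """Neutrals between two strong characters of the same direction take that
    direction; otherwise they take the paragraph (base) direction."""
    resolved = list(kinds)
    i, n = 0, len(kinds)
    while i < n:
        if kinds[i] != "N":
            i += 1
            continue
        j = i
        while j < n and kinds[j] == "N":
            j += 1
        before = kinds[i - 1] if i > 0 else base
        after = kinds[j] if j < n else base
        resolved[i:j] = [before if before == after else base] * (j - i)
        i = j
    return resolved


def simplified_visual_order(text, base="R"):
    """Return the characters in left-to-right display order."""
    resolved = resolve_neutrals([coarse_kind(c) for c in text], base)
    out, pos = [], 0
    for direction, group in groupby(resolved):
        length = len(list(group))
        run = list(text[pos:pos + length])
        pos += length
        if direction != base:
            run.reverse()  # runs opposite to the base direction are reversed first
        out.extend(run)
    if base == "R":
        out.reverse()  # then the whole RTL line is reversed
    return "".join(out)


samples = {
    "comma": "x\u060c y",
    "semicolon": "x\u061b y",
    "comma + RLM": "x\u060c\u200f y",
}
for label, sample in samples.items():
    print(label, ascii(simplified_visual_order(sample)))
```

With the comma, the comma and space sit between two LTR letters and join them into a single left-to-right run, so x stays to the left of y. The semicolon (and likewise a Right-To-Left Mark after the comma) is strongly RTL, so it splits the two words into separate runs that are then laid out right to left.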
Comment 7 Reza 2023-11-04 12:47:40 UTC
So, you are saying that what "CTRL + 7" types on the Persian keyboard layout is an Arabic comma?
What is the solution for my case then? 
What if I want to get my expected result? Which Unicode character should I use instead of the Arabic comma?
Comment 8 ⁨خالد حسني⁩ 2023-11-04 12:51:41 UTC
(In reply to Reza from comment #7)
> So, you are saying that what "CTRL + 7" types on the Persian keyboard layout
> is an Arabic comma?
> What is the solution for my case then? 
> What if I want to get my expected result? Which Unicode character should I
> use instead of the Arabic comma?


You already have the solution in your document; using “Right-To-Left Mark”.
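
If the text is prepared outside Writer, the same workaround could be applied with a small script. This is only a sketch: the helper name is hypothetical, and inserting the mark after every Arabic comma is an assumption that suits text meant for right-to-left paragraphs.

```python
import re

ARABIC_COMMA = "\u060c"
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, an invisible strongly RTL character


def add_rlm_after_arabic_comma(text):
    """Insert U+200F after each U+060C that is not already followed by one."""
    return re.sub(f"{ARABIC_COMMA}(?!{RLM})", ARABIC_COMMA + RLM, text)


print(ascii(add_rlm_after_arabic_comma("x\u060c y")))
```

The negative lookahead keeps the function idempotent, so running it twice on the same text does not stack extra marks.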
Comment 9 Reza 2023-11-04 13:01:24 UTC
(In reply to ⁨خالد حسني⁩ from comment #8)

> You already have the solution in your document; using “Right-To-Left Mark”.

So, that's it :)
Comment 10 Reza 2023-11-04 19:10:09 UTC
Now, is it OK to change the status to "Resolved"?