Bug 129038 - Neutral characters intended to be LTR rendered as RTL in RTL paragraph
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: Inherited From OOo (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: needsDevAdvice
Depends on: 162502
Blocks: RTL
Reported: 2019-11-26 11:26 UTC by travis82
Modified: 2024-08-18 04:40 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:


Attachments
Microsoft Office rendering (35.66 KB, image/png)
2021-03-29 20:06 UTC, Eyal Rozenberg
Libreoffice rendering (47.98 KB, image/png)
2021-03-29 20:06 UTC, Eyal Rozenberg

Description travis82 2019-11-26 11:26:15 UTC
Description:
As I described here with pics: https://ask.libreoffice.org/en/question/216996/odd-behavior-in-typing-some-characters-in-bidi-texts/

The problem occurs when users put LTR phrases in RTL paragraphs or vice versa: if the boundary characters of the phrase are non-letter characters, they receive the direction of the underlying paragraph. This leads to misplaced characters in the phrase.
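To make "non-letter characters" concrete, here is a minimal sketch that prints the Unicode bidirectional class of a few characters using Python's standard `unicodedata` module. The sample characters are chosen for illustration only. Note that `$` is formally a "weak" character (class ET) rather than a neutral one, but when it is not next to a digit it ends up being treated like a neutral.

```python
import unicodedata

# Classes that carry their own direction ("strong" characters).
STRONG_CLASSES = {"L", "R", "AL"}

def bidi_class(ch):
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)

def is_strong(ch):
    """True if the character has an inherent direction of its own."""
    return bidi_class(ch) in STRONG_CLASSES

if __name__ == "__main__":
    # Latin letter, Hebrew letter, Arabic letter, dollar sign,
    # parentheses, space, and the left-to-right mark (LRM).
    for ch in ["a", "\u05d0", "\u0627", "$", "(", ")", " ", "\u200e"]:
        print(f"U+{ord(ch):04X}  class={bidi_class(ch):<3} strong={is_strong(ch)}")
```

Letters and the LRM come out as strong, while `$`, the parentheses and the space do not, which is why their placement depends on their surroundings.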

A temporary workaround is to create a strong directionality barrier between the language sequences using Insert → Formatting mark → Left-to-right mark (this won't work in the first line of a paragraph though; I don't know why).

Long-term solution: treat non-letter characters as letters in LO. This is what Microsoft Office does.



Steps to Reproduce:
1. Create an RTL paragraph and write something.
2. Change the keyboard layout and write an LTR word that begins with $ (for example).

Actual Results:
$ is misplaced and appears at the end of the LTR word.

Expected Results:
$ should appear at the beginning of the word.


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Xisco Faulí 2019-11-28 12:09:10 UTC
You can't confirm your own bugs. Moving it back to UNCONFIRMED until someone
else confirms it.
Comment 2 Xisco Faulí 2020-04-06 15:13:48 UTC
Thank you for reporting the bug.
Could you please try to reproduce it with the latest version of LibreOffice from https://www.libreoffice.org/download/libreoffice-fresh/ ?
I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' if the bug is still present in the latest version.
Comment 4 travis82 2020-10-04 05:56:48 UTC
The bug is reproducible in LO 7.0.1.2 on Manjaro Linux, so I changed the status back to UNCONFIRMED.
Comment 5 Riyadh 2021-01-16 19:39:50 UTC
OS: Win 10 (x64)
Language: Arabic (with English text for bidi test)
I tested this on: Release 7.0.4.2 (x64) and RC2 7.1.0.1 (x64).
In both I got the same result.
Comment 6 Riyadh 2021-01-23 05:53:16 UTC
Reproduced on:

Version: 7.0.4.2 (x64)
Build ID: dcf040e67528d9187c66b2379df5ea4407429775
CPU threads: 16; OS: Windows 10.0 Build 18363; UI render: Skia/Vulkan; VCL: win
Locale: ar-IQ (ar_IQ); UI: ar-SA
Calc: CL

Version: 7.1.0.1 (x64)
Build ID: b585d7d90ab863bf29b2d110c174c0c2a98f3ee4
CPU threads: 16; OS: Windows 10.0 Build 18363; UI render: Skia/Vulkan; VCL: win
Locale: ar-IQ (ar_IQ); UI: ar-SA
Calc: CL

Version: 7.2.0.0.alpha0+ (x64)
Build ID: 9f9798f07f0b56ae474f31ded671cc8da598d244
CPU threads: 16; OS: Windows 10.0 Build 18363; UI render: Skia/Vulkan; VCL: win
Locale: en-US (ar_IQ); UI: ar-SA
Calc: CL
Comment 7 Riyadh 2021-01-23 06:02:56 UTC
Maybe someone on qa team will change status. Sorry.
Comment 8 Dieter 2021-01-23 10:21:39 UTC
(In reply to Riyadh from comment #7)
> Maybe someone on qa team will change status. Sorry.

You don't have to say sorry. If you can confirm a bug, you are allowed to change its status to NEW.
Comment 9 Eyal Rozenberg 2021-03-29 20:03:24 UTC
travis82, what you are describing is not in itself a bug. Let me rephrase your reproduction instructions:

1. In an RTL paragraph, enter some RTL text.
2. Enter a directionality-neutral character.
3. Enter some LTR text.

Why should the neutral character be part of the LTR run, rather than the RTL run? It _should_ be the case that the paragraph direction decides here.

If I am not mistaken, what you are actually complaining about is how, when one changes the keyboard input language from an RTL to an LTR one, LO ignores the user's intent of inserting LTR text. I believe that is the difference between LO and MS Office your Ask.LO post describes.
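The "paragraph direction decides" behaviour can be sketched as a tiny function. This is a simplified reading of rules N1 and N2 of the Unicode bidirectional algorithm (UAX #9), not LibreOffice's actual code; the function name `resolve_neutral_run` is made up for illustration, and the sketch assumes no explicit embeddings, so the embedding direction equals the paragraph direction.

```python
def resolve_neutral_run(before, after, paragraph_dir):
    """Pick a direction ("L" or "R") for a run of neutral characters.

    before / after: direction of the nearest strong character on each side
                    (use paragraph_dir at the start or end of the paragraph).
    paragraph_dir:  direction of the enclosing paragraph.
    """
    if before == after:
        # N1: neutrals between two strong characters of the same
        # direction take that direction.
        return before
    # N2: otherwise the neutrals take the embedding direction,
    # which in this simplified sketch is the paragraph direction.
    return paragraph_dir


if __name__ == "__main__":
    # RTL text, then a neutral, then LTR text, inside an RTL paragraph:
    # the neutral sits between R and L, so the paragraph decides.
    print(resolve_neutral_run("R", "L", "R"))
    # The same neutral with strong LTR characters on both sides joins the LTR run.
    print(resolve_neutral_run("L", "L", "R"))
```

In the three-step reproduction above the neutral character has an RTL neighbour before it and an LTR neighbour after it, so the second branch applies and it joins the RTL run.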
Comment 10 Eyal Rozenberg 2021-03-29 20:06:07 UTC
Created attachment 170814
Microsoft Office rendering
Comment 11 Eyal Rozenberg 2021-03-29 20:06:34 UTC
Created attachment 170815
Libreoffice rendering
Comment 12 Eyal Rozenberg 2021-03-29 20:11:23 UTC
I'm not 100% sure this should block language-detection, since it's more of a "don't force language detection" issue, but still.
Comment 13 travis82 2022-02-20 11:45:25 UTC
Here is a good document about the issue.

https://docs.microsoft.com/en-us/dynamics365/fin-ops-core/dev-itpro/user-interface/bidirectional-support
Comment 14 ⁨خالد حسني⁩ 2023-06-26 14:03:40 UTC
So basically MS has its own non-standard bidirectional text algorithm. We can’t implement that, not by default anyway. If we want to follow MS here, it would be as some sort of compatibility mode for MS file formats, and maybe an ODF option as well.

The very first step would be to find exact, detailed documentation of the MS algorithm, or for someone to volunteer to reverse engineer it and provide such documentation.

My personal position is that the Unicode bidi algorithm has enough provisions to “capture user intent”; we need more ergonomic ways to utilize them.
Comment 15 ⁨خالد حسني⁩ 2023-06-26 14:19:11 UTC
MS behaviour depends heavily on keyboard input: if you copy the exact same text from a web page, MS Office will show it the way LO does. That is another advantage of Unicode’s implicit way of handling this using control characters; the “user intent” is captured in the text itself, not external to it.
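A small sketch of what "captured in the text itself" means in practice: the left-to-right mark (U+200E) is an ordinary, invisible character stored in the string, so it survives copying and encoding. The helper name `mark_ltr` is hypothetical, and wrapping a phrase in LRMs on both sides is just one possible convention, not something LO does automatically.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, an invisible strong-LTR character

def mark_ltr(phrase):
    """Wrap a phrase in LRMs so that neutral characters at its edges
    have a strong LTR neighbour on the outside (hypothetical helper)."""
    return LRM + phrase + LRM

if __name__ == "__main__":
    marked = mark_ltr("$abc")
    # The marks are real characters: the string is two code points longer.
    print(len("$abc"), len(marked))
    # They are format characters (category Cf) with bidi class L.
    print(unicodedata.name(marked[0]),
          unicodedata.category(marked[0]),
          unicodedata.bidirectional(marked[0]))
    # A round trip through UTF-8 (as in copy/paste or saving) keeps them.
    print(marked.encode("utf-8").decode("utf-8") == marked)
```

By contrast, direction inferred from the active keyboard layout is not stored anywhere in the string, which is consistent with pasted text rendering differently from typed text.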
Comment 16 Eyal Rozenberg 2024-08-16 19:04:46 UTC
How about the following solution?

1. Fix bug 148257, i.e. text can now have an explicit language set.
2. Characters which are neutral in themselves, but are marked as being in a language with only one directionality (e.g. English), will be considered strongly directional.

(why "only one directionality"? Because some languages can in principle be written in both LTR and RTL in some settings, e.g. Japanese, see:

https://www.sljfaq.org/afaq/right-to-left.html
)
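The two-step proposal above could look roughly like this. It is only a sketch of the idea, not a patch: the `SINGLE_DIRECTION` table, the set of "promotable" classes and the function name `effective_direction` are all assumptions made for illustration, and how the language tag would reach the layout code is left open.

```python
import unicodedata

# Hypothetical table: languages assumed to be written in one direction only.
# Japanese ("ja") is deliberately absent, per the caveat in the proposal.
SINGLE_DIRECTION = {"en": "L", "fr": "L", "ar": "R", "he": "R", "fa": "R"}

# Non-strong classes this sketch is willing to promote. Digits (EN, AN)
# are left alone because the bidi algorithm treats them specially.
PROMOTABLE = {"ON", "WS", "ET", "ES", "CS"}

def effective_direction(ch, lang):
    """Return "L", "R", or None (still neutral) for a character that is
    tagged with the language code `lang` (or None if untagged)."""
    cls = unicodedata.bidirectional(ch)
    if cls == "L":
        return "L"
    if cls in ("R", "AL"):
        return "R"
    if cls in PROMOTABLE:
        # Step 2 of the proposal: a neutral character tagged with a
        # single-direction language behaves as strongly directional.
        return SINGLE_DIRECTION.get(lang)
    return None

if __name__ == "__main__":
    print(effective_direction("$", "en"))   # tagged English
    print(effective_direction("$", "ja"))   # language without a single direction
    print(effective_direction("(", "fa"))   # tagged Persian
```

With something like this in place, a `$` typed while the text language is English would no longer fall back to the paragraph direction.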
Comment 17 Eyal Rozenberg 2024-08-16 19:11:05 UTC
... and I should mention that even right now, you can enter an LRM (left-to-right mark) after the neutral characters, and they will be rendered LTR, continuing the LTR run as the user (may have) intended.
Comment 18 travis82 2024-08-18 04:40:03 UTC
This solution doesn't work for all neutral characters. For example, if you want to begin an LTR word with the $ character in an RTL paragraph, even after inserting an LRM, LO places $ at the end of the word.
Also note that this problem is not limited to the beginning of words. If you write the name of a function in an RTL paragraph with two parentheses at the end of the word, LO shows those parentheses at the beginning. For instance, the lmtest() function is rendered as ()lmtest in an RTL paragraph.
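Both cases can be reproduced with a heavily simplified model of the Unicode bidi algorithm. This is a sketch under stated assumptions, not LO's implementation: it only handles strong letters, neutrals and the LRM, ignores digits, bracket pairing and explicit embeddings, and uses the convention from the UAX #9 examples where UPPERCASE ASCII letters stand in for RTL letters so the output is readable in any terminal. The function names are made up for illustration.

```python
LRM = "\u200e"
MIRROR = {"(": ")", ")": "("}

def strong_dir(ch):
    # Stand-in convention: lowercase = LTR letters, UPPERCASE = RTL letters.
    if ch == LRM or "a" <= ch <= "z":
        return "L"
    if "A" <= ch <= "Z":
        return "R"
    return None  # everything else is treated as neutral

def resolve(text, base):
    """Give every character a direction (simplified rules N1/N2)."""
    dirs = [strong_dir(c) for c in text]
    i, n = 0, len(dirs)
    while i < n:
        if dirs[i] is not None:
            i += 1
            continue
        j = i
        while j < n and dirs[j] is None:
            j += 1
        before = dirs[i - 1] if i > 0 else base
        after = dirs[j] if j < n else base
        fill = before if before == after else base
        dirs[i:j] = [fill] * (j - i)
        i = j
    return dirs

def visual_order(text, base="R"):
    """Return the left-to-right visual order of `text` in a paragraph
    whose direction is `base` (simplified rule L2 plus mirroring)."""
    if not text:
        return ""
    dirs = resolve(text, base)
    if base == "R":
        levels = [1 if d == "R" else 2 for d in dirs]
    else:
        levels = [0 if d == "L" else 1 for d in dirs]
    # Brackets on odd (RTL) levels are displayed mirrored.
    chars = [MIRROR.get(c, c) if lvl % 2 else c for c, lvl in zip(text, levels)]
    # Reverse runs from the highest level down to level 1.
    for level in range(max(levels), 0, -1):
        i = 0
        while i < len(chars):
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < len(chars) and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            levels[i:j] = levels[i:j][::-1]
            i = j
    return "".join(c for c in chars if c != LRM)

if __name__ == "__main__":
    print(visual_order("ABC $abc"))             # -> abc$ CBA
    print(visual_order("ABC $" + LRM + "abc"))  # -> abc$ CBA
    print(visual_order("ABC " + LRM + "$abc"))  # -> $abc CBA
    print(visual_order("ABC lmtest()"))         # -> ()lmtest CBA
    print(visual_order("ABC lmtest()" + LRM))   # -> lmtest() CBA
```

In this model an LRM placed after a leading `$` does not help, because the `$` still has an RTL neighbour on its other side; the mark has to sit on the outer side of the neutral characters (before a leading `$`, after trailing parentheses). Whether that fully accounts for the behaviour seen in LO is an assumption that would need checking against the real layout code.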