Created attachment 177315 [details]
Incorrectly positioned wrapped Hebrew text

Description:
--
When RTL text wraps across lines within an LTR paragraph, the first line of RTL text is not left aligned as expected. This issue is very noticeable when the RTL text follows an opening bracket but it seems to exist irrespective of the text that precedes the RTL string.

System:
--
Linux 5.15.8-arch1-1
LibreOffice 7.2.4.1
Fonts: Times New Roman and SBL Biblit

To Reproduce:
--
1. Create a new document
2. Start with Lorem Ipsum and add RTL text close to the end of the line so that it must wrap. For me, this was sufficient:

> This is just a test document. This is just a test document. This is just a test document. RTL: (לא־תהיה אחרי־רבים לרעת)

I would expect the leftmost portion of the Hebrew text to be adjacent to the opening bracket.
Is more information needed to confirm this bug?
(In reply to jcuenod from comment #0)
> I would expect the leftmost portion of the Hebrew text to be adjacent to the
> opening bracket.

I wouldn't expect this. Could you please explain?

I think the problem is described in bug 146572. What problem remains if bug 146572 is fixed?

=> NEEDINFO
I believe that this bug is distinct from bug 146710 (which is the reference I think you intended). 146710 is about the attachment of neutral characters in the wrapping algorithm. This bug can be produced without neutral characters. In this bug, I expect a wrapped RTL string in an LTR paragraph to be left aligned. In the attachment, we see the actual output: white-space between the opening bracket and the leftmost Hebrew character.
> the first line of RTL text is not left aligned as expected

Not sure what you mean exactly, going by this sentence:

> I would expect the leftmost portion of the Hebrew text to be adjacent to the opening bracket.

Why? Why is it "better" to have the space on the next line, or nowhere, rather than on the first line? It serves to indicate that the parenthesis doesn't come right after the תהיה.

But even ignoring the intuition above - what's the formal basis (here: http://www.unicode.org/reports/tr9/tr9-23.html I would think) for your expectation?
I think my last reply may add clarification on what I expect. I think you interpreted me correctly. But just to make another attempt at clarification in case it's needed:

I would expect that, when a string of RTL text wraps across a line of left-aligned text (because the rest of the paragraph is LTR), the span of RTL text that remains on the first line would left-align with the rest of the paragraph. I think that the same intuition is why I expect to see the span of RTL text that continues on the second line to be on the *left* of that line, with LTR text continuing on the right.

If the Unicode spec says this is not how it should be, I apologise. Anecdotally, however, this is how I've seen other editors lay out such text (e.g. Google Docs) and this is how I've observed it in journal articles. If you need a list of examples, I will try to dig some up.

I don't have a formal basis for this intuition, but I'm not convinced that the Unicode docs you are linking specify preserving whitespace in two directions at the end of a wrapped line either (which, I assume, is what is producing the [imo] buggy output).
Created attachment 177842 [details] This gif demonstrates the bidi layout bug
After displaying "Formatting Marks" (under View), I can confirm that the problem is that whitespace at the end of the line seems to be conditionally displayed. In a line of BIDI text, both the LTR and the RTL text display their whitespace which produces an unexpected amount of space between the two spans of text.

This can be reproduced by:

1. Display formatting marks
2. Type half a line of LTR text followed by a space and enough RTL text till it wraps to the next line (must contain multiple words).
3. Add LTR characters at the beginning of the first line until another RTL word wraps.

Two "space" formatting marks will appear between the LTR and RTL text. There should only be one.
(In reply to jcuenod from comment #5)

Actually, that doesn't make things clearer. Paragraph alignment is basically orthogonal to paragraph direction. And the span of RTL text does not "align" anywhere; the only questions are:

1. Should the space character(s) between the last word fitting on the first line and the first word on the next line be kept on the first line, assuming they fit, or moved to the next one?
2. If they are kept on the first line, should they appear on the left side, between the LTR span and the last letter of the last RTL word, or on the right side, to the right of the first RTL word?

LO Writer answers this with "kept on the first line" and "on the left side".

I am not sure what the Unicode standard says! It won't hurt to check using MS-Word and MS Notepad what Microsoft does.

Using the XFCE mousepad editor, I notice the answers are "kept on the first line" and "on the right side" - unlike the LO logic, and perhaps more in line with what you expect. So does GNOME's gedit. AbiWord acts like LO Writer though.

> Anecdotally, however, this is how I've seen other editors lay out such text
> (e.g. Google docs) and this is how I've observed it in journal articles.

Can you attach some screenshots?

> If you need a list of examples, I will try to dig some up.

Windows examples would be pertinent. If you can get them, please do.
NEEDINFO per last comment
This is a bug: the trailing space at a line break should be at the end of the line, even for embedded RTL text.
Rephrasing title as per Khaled's last comment. We should probably create a test document here.
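For whoever writes the test document, the expected layout can be sketched with a toy model of the Unicode Bidirectional Algorithm (UAX #9). Rule L1 resets whitespace at the end of a line to the paragraph embedding level before rule L2 reverses the RTL runs. The sketch below is only an illustration under stated assumptions: uppercase letters stand in for Hebrew (the convention used in the UAX #9 examples), level resolution is reduced to a simplified neutral rule, the paragraph is assumed to be LTR, and all function names are hypothetical. It does not use LibreOffice code, and whether skipping L1 is the actual cause in Writer is an assumption.

```python
def toy_levels(paragraph):
    """Toy level resolution for an LTR paragraph (not the full UAX #9 rules).

    Uppercase = RTL letter (level 1), other letters = LTR (level 0).
    A space gets level 1 only if the nearest letters on both sides are RTL.
    """
    levels = []
    for i, ch in enumerate(paragraph):
        if ch.isupper():
            levels.append(1)
        elif ch.isspace():
            before = paragraph[:i]
            after = paragraph[i + 1:]
            prev_rtl = next((c.isupper() for c in reversed(before) if not c.isspace()), False)
            next_rtl = next((c.isupper() for c in after if not c.isspace()), False)
            levels.append(1 if prev_rtl and next_rtl else 0)
        else:
            levels.append(0)
    return levels


def apply_rule_l1(line, levels, paragraph_level=0):
    """UAX #9 rule L1 (trailing-whitespace part only): reset whitespace at the
    end of the line to the paragraph embedding level."""
    levels = list(levels)
    i = len(line) - 1
    while i >= 0 and line[i].isspace():
        levels[i] = paragraph_level
        i -= 1
    return levels


def reorder(line, levels):
    """UAX #9 rule L2: from the highest level down to the lowest odd level,
    reverse every contiguous run of characters at that level or higher."""
    chars = list(line)
    if not chars:
        return ""
    highest = max(levels)
    lowest_odd = min((lv for lv in levels if lv % 2), default=highest + 1)
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(chars):
            if levels[i] >= level:
                j = i
                while j < len(chars) and levels[j] >= level:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
            else:
                i += 1
    return "".join(chars)


# Logical order; the line breaks after "GHI " (the space stays on line 1).
paragraph = "abc DEF GHI JKL"
first_line = paragraph[:12]
line_levels = toy_levels(paragraph)[:12]

without_l1 = reorder(first_line, line_levels)
with_l1 = reorder(first_line, apply_rule_l1(first_line, line_levels))
print(repr(without_l1))
print(repr(with_l1))
```

In this model, skipping L1 leaves the trailing space inside the RTL run, so after reordering it lands next to the LTR text and two spaces appear between the spans, which matches the two "space" formatting marks reported above. With L1 applied, the space moves to the visual end of the line and the RTL span sits directly after the single LTR space.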