Bug 162502 - Treat direction-neutral characters with language according to their role in that language
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: QA:needsComment
Keywords:
Depends on: 148257
Blocks: 129038 153378 RTL
Reported: 2024-08-17 16:48 UTC by Eyal Rozenberg
Modified: 2024-09-21 15:49 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:


Description Eyal Rozenberg 2024-08-17 16:48:50 UTC
This bug depends on bug 148257 being fixed, i.e. on our having stretches of text which are explicitly/definitely marked as being in a certain language. Once that is the case, we should qualify how the Unicode Bidirectional Algorithm is applied to neutral characters such as '-', '?', Western Arabic digits, etc.:

When a direction-neutral character is marked as being in a language in which it does not interrupt directional runs of text, e.g. '-' in English, where everything is LTR, we should treat it as a strongly-directional character with the directionality of that language.

Thus, for example, if I write "-fax" in an RTL paragraph, the visual layout will be:

fax-

i.e. two runs: a 1-char RTL run and a 3-char LTR run. But if we mark this text as being in English, we should see:

-fax

i.e. a single LTR run, even though '-' is a neutral character in general, because we know that the minus is part of a sequence of characters in English.


Caveat: Some languages, like Japanese, may not have a single directionality; in that case we should either treat the character as neutral or apply some other logic.
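
To make the proposal concrete, here is a rough Python sketch of the behaviour described above. It is not LibreOffice code and not a full implementation of the Unicode Bidirectional Algorithm: the LANGUAGE_DIRECTION table and both function names are made up for illustration, every non-strong character (including weak types such as digits and separators) is lumped together as "neutral", and leftover neutrals are resolved with a simplified neighbour rule in the spirit of the UBA's N1/N2. A language missing from the table, as Japanese would be under the caveat, simply gets no override.

```python
import unicodedata

# Hypothetical table: languages assumed to have one dominant direction.
# Languages absent from it (e.g. "ja", per the caveat) get no override.
LANGUAGE_DIRECTION = {"en": "L", "he": "R", "ar": "R"}

# Strong bidi classes mapped to a plain direction.
STRONG = {"L": "L", "R": "R", "AL": "R"}


def effective_direction(ch, lang):
    """Return 'L', 'R', or None (still neutral) for one character."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in STRONG:
        return STRONG[bidi_class]
    # Proposal: a non-strong character takes the direction of the
    # language it is marked with, if that language has a single one.
    return LANGUAGE_DIRECTION.get(lang)


def direction_runs(text, lang, paragraph_dir):
    """Split text into (direction, substring) runs, in logical order."""
    dirs = [effective_direction(ch, lang) for ch in text]
    resolved = []
    for i, d in enumerate(dirs):
        if d is None:
            # Simplified neutral resolution: take the direction shared by
            # the nearest strong neighbours, else the paragraph direction.
            before = next((x for x in reversed(dirs[:i]) if x), paragraph_dir)
            after = next((x for x in dirs[i + 1:] if x), paragraph_dir)
            d = before if before == after else paragraph_dir
        resolved.append(d)
    runs = []
    for ch, d in zip(text, resolved):
        if runs and runs[-1][0] == d:
            runs[-1] = (d, runs[-1][1] + ch)
        else:
            runs.append((d, ch))
    return runs


print(direction_runs("-fax", None, "R"))  # [('R', '-'), ('L', 'fax')]
print(direction_runs("-fax", "en", "R"))  # [('L', '-fax')]
```

With no language marking, the sketch reproduces the two runs of the "fax-" layout; with the text marked as English, the minus joins the single LTR run.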

------------------------------

Alternative, weaker option: Instead of treating the character as strongly-directional, "bias" its direction: a stretch of neutral chars takes its language's direction only if there is a stretch of chars in that same language either before or after it.
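
A rough Python sketch of one possible reading of this weaker option follows. It assumes each character carries its own language tag, and it interprets "before or after" as "the strong character immediately adjacent to the neutral stretch has the same language tag"; both of these, as well as the LANGUAGE_DIRECTION table and the function name, are assumptions made for illustration. Characters left as None would still go through the ordinary neutral resolution.

```python
import unicodedata

# Hypothetical table: languages assumed to have one dominant direction.
LANGUAGE_DIRECTION = {"en": "L", "he": "R"}

# Strong bidi classes mapped to a plain direction.
STRONG = {"L": "L", "R": "R", "AL": "R"}


def biased_directions(tagged):
    """tagged: list of (char, lang) pairs in logical order.

    Returns 'L', 'R', or None (left to the normal algorithm) per char.
    """
    dirs = [STRONG.get(unicodedata.bidirectional(ch)) for ch, _ in tagged]
    out = list(dirs)
    i = 0
    while i < len(tagged):
        if dirs[i] is not None:
            i += 1
            continue
        # Maximal stretch [i, j) of non-strong chars sharing one language.
        lang = tagged[i][1]
        j = i
        while j < len(tagged) and dirs[j] is None and tagged[j][1] == lang:
            j += 1
        # Bias only if a strong char of the same language touches the stretch.
        neighbours = [k for k in (i - 1, j) if 0 <= k < len(tagged)]
        touches_own_language = any(
            dirs[k] is not None and tagged[k][1] == lang for k in neighbours
        )
        if touches_own_language and lang in LANGUAGE_DIRECTION:
            out[i:j] = [LANGUAGE_DIRECTION[lang]] * (j - i)
        i = j
    return out


# "-fax", all marked English: the minus is biased to LTR.
print(biased_directions([("-", "en"), ("f", "en"), ("a", "en"), ("x", "en")]))
# ['L', 'L', 'L', 'L']

# An English-tagged minus between two Hebrew letters stays neutral.
print(biased_directions([("\u05d0", "he"), ("-", "en"), ("\u05d1", "he")]))
# ['R', None, 'R']
```

The second call shows where the two options differ: under the stronger proposal the isolated English-tagged minus would become LTR, while here it is left to the regular algorithm.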
Comment 1 Eyal Rozenberg 2024-09-21 15:49:24 UTC Comment hidden (invalid, obsolete)