Bug 70633 - Writer EDITING: Some IME unable to use AltGr to produce supplementary plane chars
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: Inherited From OOo (earliest affected)
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.3.0
Keywords: bibisected, regression
Depends on:
Blocks: Shortcuts-AltGR
Reported: 2013-10-18 21:23 UTC by nrs
Modified: 2021-07-04 23:27 UTC (History)
CC List: 10 users

See Also:
Crash report or crash signature:


Attachments
Ahmao Unicode keyboard + layout (335.01 KB, application/zip)
2013-10-18 21:27 UTC, nrs
Tengwar test font, keyboard, & layout (465.88 KB, application/zip)
2014-02-26 06:35 UTC, nrs
picking SMP glyphs via Special Character dialog (58.36 KB, image/png)
2019-05-14 23:18 UTC, V Stuart Foote

Description nrs 2013-10-18 21:23:47 UTC
It is not possible to type into Writer any *supplementary plane* chars mapped to the AltGr state.  These chars simply do not appear on screen, on export, or on printing.  By contrast, AltGr-mapped *BMP* chars are OK, as are *supplementary plane* chars mapped to normal or shift states.  There is also no problem typing AltGr-mapped *supplementary plane* chars into Calc & Impress.  These chars can also be readily produced with AltGr in other apps like Word & Excel.

This problem is attested in the following builds & OS:
- LibO 4.1.1.2 on 64-bit English Windows 7 Ultimate SP1
- LibO 4.1.2.3 on 64-bit English Windows 7 Ultimate SP1

Steps to reproduce bug:
1. download & install Miao Unicode font at https://github.com/phjamr/MiaoUnicode/blob/master/MiaoUnicode-Regular.ttf?raw=true
2. unzip attached archive & install Ahmao keyboard (run setup.exe)
3. run Writer & set all Basic Fonts (CTL) to Miao Unicode (in Tools|Options|LibreOffice Writer)
4. select Miao Unicode font
5. on Windows language bar, select MR
6. refer to the 3 .jpg files in attached archive for keys mapped in each state
7. type any mapped keys in normal state: Miao chars (mostly big letters) appear in document -- expected
8. type any mapped keys in shift state: Miao chars (mostly small letters) appear in document -- expected
9. type any mapped keys in AltGr state: *nothing appears* -- BUG!!!
10. select 'Export to pdf' or 'Print' from the File menu: no sign of any chars typed in #9 -- BUG!!!

Control experiments with other apps:
11. open Calc/Impress/Word/Excel/PowerPoint: blank document appears
12. repeat #4-8: expected results
13. type any mapped keys in AltGr state: Miao chars (big letters with wart) appear in document -- expected

Further observations: Writer may be cutting off the 2nd half of the surrogate pair that represents a supplementary char, as suggested by the following test:
14. in Writer, repeat #7-8 in any order multiple times, keeping all chars on 1 line
15. repeat #9 multiple times, continuing on the same line as in #14
16. copy the whole line & paste into Word: a line of blanks appears with trailing 'checked' boxes
17. in Word, place cursor after any 'checked' boxes
18. press Ctrl-X: the string D81B appears, which is the Unicode code point of the 1st half of any surrogate pair representing chars in this (Miao) block
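
As a minimal sketch of the arithmetic behind this observation: the standard UTF-16 rule splits a supplementary-plane code point into a high and a low surrogate. The block range U+16F00..U+16F9F used here is the Unicode Miao block (an assumption brought in from the Unicode charts, not stated in this report), and the function name is illustrative only.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the 20-bit offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the 20-bit offset
    return high, low


# Assumed range of the Unicode Miao block (not taken from the report itself).
MIAO_FIRST, MIAO_LAST = 0x16F00, 0x16F9F

# Collect the distinct high surrogates used by every code point in the block.
miao_high_surrogates = {
    to_surrogate_pair(cp)[0] for cp in range(MIAO_FIRST, MIAO_LAST + 1)
}
print(sorted(f"{h:04X}" for h in miao_high_surrogates))
```

Every code point in that range shares the single high surrogate D81B, which would fit the value seen in step 18 if only the first half of each pair survives.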

Further control experiment with BMP chars (Hebrew):
19. download & install SBL Hebrew font at http://www.sbl-site.org/Fonts/SBL_Hbrw.ttf
20. download & install SBL Hebrew Tiro keyboard at http://www.sbl-site.org/Fonts/BiblicalHebrewTiro.zip
21. download & refer to SBL Hebrew Tiro keyboard layout at http://www.sbl-site.org/Fonts/BiblicalHebrewTiroManual.pdf
22. in Writer, select SBL Hebrew font
23. on Windows language bar, select HE
24. type any mapped keys in any state: Hebrew chars appear as expected
Comment 1 nrs 2013-10-18 21:27:11 UTC
Created attachment 87837
Ahmao Unicode keyboard + layout
Comment 2 nrs 2013-10-18 21:32:05 UTC
BTW, the Miao script is used by multiple language groups with as many as 6 million people.  It would be most appreciated if you could fix this ASAP.  Thank you!
Comment 3 nrs 2013-12-19 21:54:54 UTC
This bug is also attested in the following LibO versions, all on 64-bit English Windows 7 Ultimate SP1:
- 3.3.1.2
- 3.5.2.2
- 4.1.3.2
- 4.1.4.2

Symptoms (in step 9) are not exactly the same but similar in nature:

LibO 3.3.1.2: a square appears & then changes to crossed boxes on typing further chars mapped to AltGr
LibO 3.5.2.2: nothing appears
LibO 4.1.3.2: similar to 3.3.1.2 plus random line splitting or glyph change on screen which will revert to original on further typing
LibO 4.1.4.2:
- if a char mapped to AltGr starts the line, a square appears & subsequent consecutive AltGr-mapped chars on the same line are not rendered at all
- otherwise, chars mapped to AltGr are rendered as crossed boxes
- these symbols (square/crossed box/nothing) will randomly switch among themselves so that sometimes part of the line just disappears & comes back again on typing further chars

Version field set to 3.3.1 release to reflect earliest occurrence attested.
Comment 4 nrs 2014-02-26 06:33:22 UTC
Another case of Writer 4.1.4.2 blocking output of SMP chars mapped to AltGr & shift-AltGr states:

Steps to reproduce bug with Tengwar (re-encoded at U+1CC00..1CC7F for testing):
1. unzip tengmod.zip attached
2. install Tengwar font (right-click tengmod.ttf & select Install)
3. install Tengwar keyboard (run setup.exe)
4. run Writer
5. select Tengwar Telcontar Mod font
6. on Windows language bar, select IS
7. refer to the 4 .jpg files in attached archive for keys mapped in each state
8. type any mapped keys in AltGr or shift-AltGr state: squares & nothing appear -- BUG!!!

According to dev comments re a parallel bug in AOO, this appears to be caused by processing Unicode codepoints via the operator ::rtl::OUString::operator sal_Unicode*, which is defined as only 16-bit unsigned int, effectively dropping all codepoints beyond BMP (thus the surrogate pair cut-off phenomenon).  Pls. kindly revise the code base in concern to use a 32-bit type & fix this defect ASAP.
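
The cut-off described above can be illustrated with a small sketch. It does not reproduce any LibreOffice or OOo code. `one_unit_per_char` is a hypothetical lossy path that assumes each character fits in one 16-bit unit, which is the kind of assumption the paragraph above attributes to the 16-bit type.

```python
from typing import List


def utf16_units(text: str) -> List[int]:
    """Return the UTF-16 code units of text as integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def one_unit_per_char(text: str) -> List[int]:
    """Hypothetical lossy path: keep only the first 16-bit unit of each character."""
    return [utf16_units(ch)[0] for ch in text]


tengwar_test_char = chr(0x1CC00)  # first code point of the test encoding above
hebrew_alef = chr(0x05D0)         # a BMP character for comparison

print([f"{u:04X}" for u in utf16_units(tengwar_test_char)])        # full pair
print([f"{u:04X}" for u in one_unit_per_char(tengwar_test_char)])  # lone high surrogate
print([f"{u:04X}" for u in one_unit_per_char(hebrew_alef)])        # BMP char is unaffected
```

Under that assumption a BMP character passes through unchanged, while a supplementary-plane character is reduced to a lone high surrogate, which matches the pattern reported for Hebrew versus Miao and Tengwar.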
Comment 5 nrs 2014-02-26 06:35:12 UTC
Created attachment 94751
Tengwar test font, keyboard, & layout
Comment 6 nrs 2014-02-26 07:24:19 UTC
Attested also in Writer 4.2.1.1 for both Miao & Tengwar.
Comment 7 Björn Michaelsen 2014-03-06 16:54:04 UTC
Enhancement request as per comment 4. Please also add the mentioned bug in 'see also:'
Comment 8 Urmas 2014-03-06 17:48:09 UTC
The parallel bug for OO has no comments.

Also, the proper keyboard handling is anything but enhancement.
Comment 9 nrs 2014-03-06 18:12:51 UTC
Added link to parallel OO bug #124312.  Note that OO dev added a link in the 'Blocks' field to the possible root cause (per comment #4).  That 2nd link is also included in the 'See Also' field here.  For the sake of the several million Miao script users, pls. kindly look into this issue at your earliest convenience.  Thanks!
Comment 10 QA Administrators 2014-10-05 23:05:36 UTC Comment hidden (obsolete)
Comment 11 QA Administrators 2014-11-02 16:43:03 UTC Comment hidden (obsolete)
Comment 12 nrs 2014-11-20 07:51:04 UTC
We already provided the requested info 8.5 months ago (see comment 9 added on 2014-03-06 18:12:51 UTC, 24 mins. after comment 8 & 78 mins after comment 7).  We don't know what more info you need.  Pls. specify.

We just tested with the latest release 4.3.4.1 on 32-bit English Windows Vista Business & the bug is still there:
- No chars in the AltGr & shift-AltGr states can be produced.
- For Ahmao, nothing appeared at all (rf. Description, step #9).
- For Tengwar, spaces appeared instead (cf. Comment 4, step #8).
- When the 'empty' string is copied into Word 2003 & Alt-X pressed,
  - D81B appeared for Ahmao (rf. Description, step #18; NB: 'Ctrl-X' should read 'Alt-X').
  - D833 appeared for Tengwar.

Apparently the surrogate cut-off problem is still not resolved.  Pls. kindly fix it asap.  Thanks!
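
The two values reported above can be cross-checked against the scripts involved. The sketch below inverts the standard UTF-16 rule: given a lone high surrogate, it returns the range of 1024 code points that could have produced it. The function name is illustrative only.

```python
from typing import Tuple


def code_points_for_high_surrogate(high: int) -> Tuple[int, int]:
    """Return the first and last code point whose UTF-16 form starts with `high`."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate")
    first = 0x10000 + ((high - 0xD800) << 10)
    return first, first + 0x3FF  # each high surrogate covers 1024 code points


for high in (0xD81B, 0xD833):
    first, last = code_points_for_high_surrogate(high)
    print(f"{high:04X} -> U+{first:05X}..U+{last:05X}")
```

D81B maps to a range that contains the Miao block, and D833 maps to a range that starts at U+1CC00, where the Tengwar test font was encoded in comment 4.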
Comment 13 nrs 2015-11-20 03:59:32 UTC
It's been a year already but nothing has happened apart from defect confirmation.  Pls. kindly follow up asap.  Thanks!
Comment 14 V Stuart Foote 2016-09-05 21:05:27 UTC
With a current build of LO master/5.3.0alpha0+ (Build ID: 696e83b663d4f3e00f23947613f9f3916a4dd14d) on install of Miao unicode font linked in comment 0 -- the Special Character dialog displays and selects glyphs and allows paste of SMP codepoints into the Writer canvas.  The <alt>+X unicode toggle also seems to work correctly for the SMP codepoints.

Could you check whether your preferred AltGr-based text IME remains balky in Writer for entering SMP codepoints for the font?
Comment 15 Xisco Faulí 2017-09-29 08:51:05 UTC Comment hidden (obsolete, spam)
Comment 16 nrs 2019-05-14 22:57:10 UTC
This issue was confirmed to be resolved in the following build & OS:
- Version 5.4.4.2 (x64) on 64-bit English Windows 7 Ultimate SP1

*HOWEVER*, it is still attested in the following NEWER builds & OS:
- Version 6.0.5.2 (x64) on 64-bit English Windows 7 Ultimate SP1
- Version 6.1.4.2 (x64) on 64-bit English Windows 10 Pro
- Version 6.2.3.2 (x64) on 64-bit English Windows 10 Pro

A minor difference is instead of squares, black diamonds with a white question mark inside are produced.  On pressing Alt-X, the same surrogate values show up as before.

This indicates other fixes in the code base for 6.x have broken the fix for this issue in 5.4 => regression!
Comment 17 V Stuart Foote 2019-05-14 23:18:51 UTC
Created attachment 151415
picking SMP glyphs via Special Character dialog

(In reply to nrs from comment #16)
On Windows 10 Ent 64-bit en-US (1807) with
Version: 6.2.4.1 (x64)
Build ID: 170a9c04e0ad25cd937fc7a913bb06bf8c75c11d
CPU threads: 8; OS: Windows 10.0; UI render: GL; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: CL

Cannot confirm mishandling of SMP Unicode glyphs. The IME may be at fault, but entry onto the document canvas with the Special Character dialog or the <Alt>+X toggle correctly assigns the SMP Unicode glyphs.
Comment 18 nrs 2019-05-15 01:32:34 UTC
Thank you for your quick response!  I can also get the desired char via the Special Char dialog, but pls. kindly reread comment #1: This issue is about *typing*, not about pointing & clicking.  And the various control experiments with other apps, including Calc, clearly confirm that the keyboards are working.

The only conclusion possible is that:
- either something in the Writer 6.x code base broke the fix that was in place in 5.4, or
- the fix available in 5.4 is not incorporated into Writer 6.x at all.

Pls. kindly double-check.  Thanks!
Comment 19 Buovjaga 2020-06-04 15:20:11 UTC
(In reply to nrs from comment #16)
> This issue was confirmed to be resolved in the following build & OS:
> - Version 5.4.4.2 (x64) on 64-bit English Windows 7 Ultimate SP1
> 
> *HOWEVER*, it is still attested in the following NEWER builds & OS:
> - Version 6.0.5.2 (x64) on 64-bit English Windows 7 Ultimate SP1
> - Version 6.1.4.2 (x64) on 64-bit English Windows 10 Pro
> - Version 6.2.3.2 (x64) on 64-bit English Windows 10 Pro

Let's raise the version, then.

It is possible for you to investigate this deeper:
https://wiki.documentfoundation.org/QA/Bibisect
https://wiki.documentfoundation.org/QA/Bibisect/Windows

It requires a bit of patience to set up.

Before bibisecting, do check with a fresh master build to confirm the issue is still present: https://dev-builds.libreoffice.org/daily/master/current.html Win-x86_64@tb77-TDF
Comment 20 Sahamokou 2021-06-04 02:52:37 UTC
> (In reply to nrs from comment #16)
> > This issue was confirmed to be resolved in the following build & OS:
> > - Version 5.4.4.2 (x64) on 64-bit English Windows 7 Ultimate SP1
> > 
> > *HOWEVER*, it is still attested in the following NEWER builds & OS:
> > - Version 6.0.5.2 (x64) on 64-bit English Windows 7 Ultimate SP1
> > - Version 6.1.4.2 (x64) on 64-bit English Windows 10 Pro
> > - Version 6.2.3.2 (x64) on 64-bit English Windows 10 Pro
> 
> Let's raise the version, then.
> 
> It is possible for you to investigate this deeper:
> https://wiki.documentfoundation.org/QA/Bibisect
> https://wiki.documentfoundation.org/QA/Bibisect/Windows
> 
> It requires a bit of patience to set up.
> 
> Before bibisecting, do check with a fresh master build to confirm the issue
> is still present:
> https://dev-builds.libreoffice.org/daily/master/current.html
> Win-x86_64@tb77-TDF

I confirm this issue is still present:
With AltGr, I still get only the 1st half of any surrogate pair, or only the 1st codepoint of any ligature.
A year has passed.
Comment 21 Sahamokou 2021-06-04 18:55:07 UTC
3326f9057b834e67857128c42ea2d0038eeaa374 is the first bad commit
commit 3326f9057b834e67857128c42ea2d0038eeaa374
Author: Norbert Thiebaud <nthiebaud@gmail.com>
Date:   Tue Dec 12 20:40:57 2017 -0800

    source sha:f53b3b547b04dc112076d8323b5b24ae178d6260

    source sha:f53b3b547b04dc112076d8323b5b24ae178d6260

 instdir/program/basctllo.dll | Bin 1362944 -> 1362944 bytes
 instdir/program/vcllo.dll    | Bin 7465984 -> 7465984 bytes
 instdir/program/version.ini  |   2 +-
 3 files changed, 1 insertion(+), 1 deletion(-)

# bad: [bc1845d882e52469a4583747881a465749177829] source sha:c30963b8b4bbbe42a24b97aafa161eff9d7ccdd4
# good: [cc5c4c7ed1d8d01b0063bcaaeb5f6d59282c8029] source sha:9feb7f7039a3b59974cbf266922177e961a52dd1
git bisect start 'master' 'oldest'
# good: [d7692604a13504d34902e44875ec2ce58fa03e6a] source sha:113ba194f8fd10baa8dd97dff8c2619d059ba99e
git bisect good d7692604a13504d34902e44875ec2ce58fa03e6a
# good: [df9716c59c68a66044573df9b691944245cc0624] source sha:4913a117f8be045b3b1e2f2ef09d7f6a85ff076b
git bisect good df9716c59c68a66044573df9b691944245cc0624
# bad: [ff80794b50a198d7d08dbd1dbabbbc0500c9cb53] source sha:bb0fdccaac9495628e67d1ad1812e95b1c9397ba
git bisect bad ff80794b50a198d7d08dbd1dbabbbc0500c9cb53
# good: [37a7c4316e0b826e2635da29d311e58c7a9cd44b] source sha:d945bc4598f75e4cb1a1bf9df8942f14ef065d74
git bisect good 37a7c4316e0b826e2635da29d311e58c7a9cd44b
# bad: [46bb39f994da658c9e7d2e2987718dd584d097a8] source sha:2865210607364feaff2c0275b7cd6c5439f5f070
git bisect bad 46bb39f994da658c9e7d2e2987718dd584d097a8
# good: [1926ce79de3d2284188db51241416434fc1eaf98] source sha:c74f6d3c64b943e26d5af1850bb55780b60602d6
git bisect good 1926ce79de3d2284188db51241416434fc1eaf98
# good: [fa46eb6b51799682682f51953aaf12fb656a7263] source sha:713f579283279aa1dfadf476d37b38753e5f398f
git bisect good fa46eb6b51799682682f51953aaf12fb656a7263
# bad: [c725b0ee6d9e64e1d8341ce55250ca3799980f13] source sha:1d097883541b9d244e50ced7fe49a4d7a0f65cfd
git bisect bad c725b0ee6d9e64e1d8341ce55250ca3799980f13
# bad: [5bfd9eeec85b4c1d75a4255e6789f06d12dab849] source sha:a7ec994689f8ea5985f6c8f94f17a4417978ff41
git bisect bad 5bfd9eeec85b4c1d75a4255e6789f06d12dab849
# good: [a01dc34f93927894c9c9c462a47388f651a2728a] source sha:328cdfd4a75f5e29c3a1b3ba4ee0ed9475603442
git bisect good a01dc34f93927894c9c9c462a47388f651a2728a
# good: [2ce8641c856d5ac623feca59de544efb8164c239] source sha:e8871a5ec9fc9b58ee688c9f1d9b22223769ea57
git bisect good 2ce8641c856d5ac623feca59de544efb8164c239
# bad: [40e63115b40655443fdaecf9948fb4899772b248] source sha:c1c868003e129ff286ccd787e22f1a64a75de58a
git bisect bad 40e63115b40655443fdaecf9948fb4899772b248
# bad: [3326f9057b834e67857128c42ea2d0038eeaa374] source sha:f53b3b547b04dc112076d8323b5b24ae178d6260
git bisect bad 3326f9057b834e67857128c42ea2d0038eeaa374
# first bad commit: [3326f9057b834e67857128c42ea2d0038eeaa374] source sha:f53b3b547b04dc112076d8323b5b24ae178d6260
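
The log above follows a binary search between a known-good and a known-bad build. As a simplified sketch only: real `git bisect` works on a commit graph, while this version assumes a linear, oldest-to-newest list and a hypothetical `is_bad` check standing in for the manual AltGr test.

```python
from typing import Callable, List, Tuple


def first_bad(commits: List[int], is_bad: Callable[[int], bool]) -> Tuple[int, int]:
    """Find the first bad commit in an ordered list.

    Assumes commits[0] is known good and commits[-1] is known bad.
    Returns the first bad commit and the number of builds that were tested.
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(commits[mid]):
            bad = mid    # the breakage happened at mid or earlier
        else:
            good = mid   # the breakage happened after mid
    return commits[bad], steps


# Hypothetical history: 10000 builds, broken from build 6789 onwards.
history = list(range(10000))
culprit, tested = first_bad(history, lambda build: build >= 6789)
print(culprit, tested)
```

Because the interval halves at every step, even a history of thousands of builds needs only a dozen or so manual tests, which is roughly the number of good/bad marks in the log.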
Comment 22 V Stuart Foote 2021-06-04 20:47:07 UTC
(In reply to Sahamokou from comment #21)
> 3326f9057b834e67857128c42ea2d0038eeaa374 is the first bad commit
> commit 3326f9057b834e67857128c42ea2d0038eeaa374
> Author: Norbert Thiebaud <nthiebaud@gmail.com>
> Date:   Tue Dec 12 20:40:57 2017 -0800
> 
>     source sha:f53b3b547b04dc112076d8323b5b24ae178d6260
> 
>     source sha:f53b3b547b04dc112076d8323b5b24ae178d6260
>...
> git bisect bad 3326f9057b834e67857128c42ea2d0038eeaa374
> # first bad commit: [3326f9057b834e67857128c42ea2d0038eeaa374] source
> sha:f53b3b547b04dc112076d8323b5b24ae178d6260

That was done [1] for other mishandling of the <AltGr> and bug 97908 reversing work on bug 95761--it was only temporarily functional. Not really a regression, setting back to inherited from OOo.

@Thurston, Jurgen: any thoughts on how to handle more involved IMEs for CTL scripts that use <AltGr> key entry more heavily than the typical dead-key handling for Western European languages, i.e. more heavily than those that initially gummed this up?

The OOo-era mishandling of Unicode SMP high-order surrogate pairs linked in See Also suggests the IME is not well handled, though our <Alt>+X and Special Character dialog have no issues with the Unicode conversions or font selection for getting them onto the VCL canvas.


=-ref-=
[1] https://gerrit.libreoffice.org/c/core/+/44824/
Comment 23 Sahamokou 2021-06-06 13:52:02 UTC
(In reply to V Stuart Foote from comment #22)
> (In reply to Sahamokou from comment #21)
> > 3326f9057b834e67857128c42ea2d0038eeaa374 is the first bad commit
> > commit 3326f9057b834e67857128c42ea2d0038eeaa374
> > Author: Norbert Thiebaud <nthiebaud@gmail.com>
> > Date:   Tue Dec 12 20:40:57 2017 -0800
> > 
> >     source sha:f53b3b547b04dc112076d8323b5b24ae178d6260
> > 
> >     source sha:f53b3b547b04dc112076d8323b5b24ae178d6260
> >...
> > git bisect bad 3326f9057b834e67857128c42ea2d0038eeaa374
> > # first bad commit: [3326f9057b834e67857128c42ea2d0038eeaa374] source
> > sha:f53b3b547b04dc112076d8323b5b24ae178d6260
> 
> That was done [1] for other mishandling of the <AltGr> and bug 97908
> reversing work on bug 95761--it was only temporarily functional. Not really
> a regression, setting back to inherited from OOo.
> 
> @Thurston, Jurgen any thoughts on how to handle more involved IME for CTL
> scripts that using <AtltGR> key entry more so than typical dead key handling
> for Western European languages? Than those that initially had gummed this
> up? 
> 
> The seealso linked OOo era mishandling of Unicode SMP highorder surrogate
> pairs   suggest the IME is not well handled. Though our <Alt>+X and Special
> Character Dialog have no issues with the Unicode conversions or font
> selection for getting them onto VCL canvas.
> 
> 
> =-ref-=
> [1] https://gerrit.libreoffice.org/c/core/+/44824/


Sorry, I don't quite understand the first sentence you wrote.
From the further testing I have done,
I believe it's the same bug as https://bugs.documentfoundation.org/show_bug.cgi?id=127072 .
The problem is not specific to surrogate pairs.
CTL, CJK, and even Basic Latin (precomposed characters) also have problems.
The second and any later codepoints are discarded (when typed via AltGr).

Regards.
Comment 24 Caolán McNamara 2021-06-30 15:04:10 UTC
My effort at https://gerrit.libreoffice.org/c/core/+/118170 appears to make the described initial case in comment #1 work. Whether there are unintended consequences, though, is currently unknown.
Comment 25 Commit Notification 2021-06-30 19:48:30 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/ce9e6972148c657994beb74f671e51bec5be6689

tdf#70633 unset Alt if detected as AltGr in both KeyInput branches

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
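
As a rough sketch of the general idea named in the commit title, and not the actual patch: on Windows, AltGr is commonly reported as Ctrl+Alt, so an input layer has to decide whether such a key press is text or a shortcut. The `KeyEvent` type, both function names, and the decision to clear both modifiers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    """Hypothetical key event: the text the layout produced plus modifier state."""
    char: str   # empty string if the layout maps nothing to this key state
    ctrl: bool
    alt: bool


def normalize_altgr(event: KeyEvent) -> KeyEvent:
    """Treat Ctrl+Alt with produced text as AltGr and clear the modifiers.

    Assumption for this sketch: if the layout produced text while Ctrl+Alt
    were held, the user pressed AltGr and wants the text, not a shortcut.
    """
    if event.ctrl and event.alt and event.char:
        return KeyEvent(char=event.char, ctrl=False, alt=False)
    return event


def is_text_input(event: KeyEvent) -> bool:
    """Text is inserted only when no shortcut modifier remains set."""
    return bool(event.char) and not (event.ctrl or event.alt)


miao_key = KeyEvent(char=chr(0x16F00), ctrl=True, alt=True)  # layout with AltGr mapping
us_key = KeyEvent(char="", ctrl=True, alt=True)              # layout without AltGr mapping

print(is_text_input(normalize_altgr(miao_key)))  # handled as text
print(is_text_input(normalize_altgr(us_key)))    # left to shortcut handling
```

In this sketch a layout that maps a character to the AltGr state gets text input, while a layout with no such mapping keeps Ctrl+Alt available for shortcuts.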
Comment 26 Sahamokou 2021-07-04 23:27:04 UTC
On a keyboard that has an AltGr key:
AltGr or Alt+Ctrl
can now send any surrogate pair or ligature codepoints
(e.g. AltGr+6 sends SiX).

On a normal US keyboard it sends a shortcut as usual.