In the attached document, the first line contains a combined emoji (👨🏼, U+1f468 "MAN" + U+1f3fc "EMOJI MODIFIER FITZPATRICK TYPE-3"). Putting the cursor immediately after the emoji and pressing Alt+X results not in the expected "U+1f468U+1f3fc" that would represent both elements of the emoji, but in "👨U+1f3fc", i.e. "MAN" is not converted into the text representing its code. For comparison, the second line has a combined character á, U+0061 "LATIN SMALL LETTER A" + U+0301 "COMBINING ACUTE ACCENT". Pressing Ctrl+End to move after the character and then pressing Alt+X converts both parts of the combined character: "U+0061U+0301". (There's a strange *different* issue: when the mouse is used to put the cursor after the character, the result is as if the cursor had been put between the two parts, but that is *unrelated* to the issue here.) Tested with Version: 7.0.3.1 (x64) Build ID: d7547858d014d4cf69878db179d326fc3483e082 CPU threads: 12; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win Locale: ru-RU (ru_RU); UI: en-US Calc: CL
Created attachment 167703 [details] Combined emoji and combined character
Interesting, it will convert both from HEX, i.e. U+1f468U+1f3fc to the combined glyph. But toggling the opposite way on the combined glyphs applies only to the trailing glyph. It does not seem to have anything to do with the combining nature of the Emoji Modifiers or with the SMP. Two BMP symbols ☣♻ U+2623U+267b will toggle from HEX to glyph, but only the trailing glyph is converted back. =-testing-= 2020-11-19 Version: 7.1.0.0.alpha1+ (x64) Build ID: ccd0e5f445d4a7d0e7aca6c23c02c61bf14510b2 CPU threads: 8; OS: Windows 10.0 Build 18363; UI render: Skia/Vulkan; VCL: win Locale: en-US (en_US); UI: en-US Calc: CL
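To make the difference between the observed and the expected toggle concrete, here is a minimal standalone sketch. It is not LibreOffice code: the helper names toUPlus, convertTrailingOnly and convertAll are made up for illustration, and the text is held as UTF-32 so that one element is one code point.

```cpp
#include <cstdio>
#include <string>

// Format one code point the way this report writes them, e.g. U+1f468.
// (Hypothetical helper, lowercase hex, at least four digits.)
std::u32string toUPlus(char32_t cp)
{
    char buf[16];
    const int n = std::snprintf(buf, sizeof buf, "U+%04x", static_cast<unsigned>(cp));
    return std::u32string(buf, buf + n); // ASCII chars widen to char32_t
}

// Observed behaviour: only the last code point before the cursor is converted.
std::u32string convertTrailingOnly(const std::u32string& text)
{
    if (text.empty())
        return text;
    return text.substr(0, text.size() - 1) + toUPlus(text.back());
}

// Behaviour the reporter expects for a combined sequence: every code point
// of the sequence is converted.
std::u32string convertAll(const std::u32string& text)
{
    std::u32string out;
    for (char32_t cp : text)
        out += toUPlus(cp);
    return out;
}
```

For the emoji from the first comment, convertTrailingOnly leaves U+1f468 as a glyph and appends "U+1f3fc", while convertAll yields "U+1f468U+1f3fc". Deciding how many code points belong to the sequence is the hard part and is not covered by this sketch.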
I believe this behaves as originally implemented in https://gerrit.libreoffice.org/17535; it is already present in Version: 5.1.6.2 (x64) Build ID: 07ac168c60a517dba0f0d7bc7540f5afa45f0909 CPU Threads: 8; OS Version: Windows 6.19; UI Render: GL; Locale: en-US (en_US); Calc: CL
This might be VERY complex to do completely correctly, since there does not seem to be a single standard way of marking combining combinations of emojis. The latest version of the "Unicode Emoji" spec can be found at http://www.unicode.org/reports/tr51/. Having glanced through the spec, I imagine adding some kind of logic like:

    if ( maInput.getLength() == 0 )
        bIsEmojiSequence = isEmoji();
    if ( isEmoji_modifier_base() )
        bHaveEmoji_modifier_base = true;
    const sal_uInt32 nZWJ = 0x200D; // Zero Width Joiner character
    if ( bIsEmojiSequence )
    {
        if ( next == nZWJ || (isEmoji(next) && !bHaveEmoji_modifier_base) )
            continue; // keep accepting new characters
    }

It looks like this will require some low-level identification of emoji, since there is no classification yet such as ::com::sun::star::i18n::UnicodeType::EMOJI
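As a rough illustration of that idea, the following standalone sketch scans backwards from the cursor and decides how far the sequence extends. Everything in it is an assumption made for illustration: the function names are hypothetical, looksLikeEmoji is a crude range check standing in for the real TR51 property data, and only combining marks from U+0300..U+036F, the Fitzpatrick modifiers U+1F3FB..U+1F3FF, U+200D (ZWJ) and U+FE0F (variation selector-16) are handled.

```cpp
#include <cstddef>
#include <string>

constexpr char32_t ZWJ = 0x200D;  // ZERO WIDTH JOINER
constexpr char32_t VS16 = 0xFE0F; // VARIATION SELECTOR-16

bool isCombiningMark(char32_t c) { return c >= 0x0300 && c <= 0x036F; }
bool isEmojiModifier(char32_t c) { return c >= 0x1F3FB && c <= 0x1F3FF; }

// ASSUMPTION: crude stand-in for a real emoji classification.
bool looksLikeEmoji(char32_t c)
{
    return (c >= 0x1F300 && c <= 0x1FAFF) || (c >= 0x2600 && c <= 0x27BF);
}

// Does `cur` belong to the same sequence as the code point before it?
bool attachesToPrevious(char32_t prev, char32_t cur)
{
    if (isCombiningMark(cur) || cur == VS16 || cur == ZWJ)
        return true;
    if (isEmojiModifier(cur))
        return looksLikeEmoji(prev) && !isEmojiModifier(prev);
    // An emoji directly after a ZWJ continues the joined sequence.
    return prev == ZWJ && looksLikeEmoji(cur);
}

// Index of the first code point of the sequence that ends just before
// `cursor`. Precondition: cursor <= text.size().
std::size_t sequenceStart(const std::u32string& text, std::size_t cursor)
{
    if (cursor == 0)
        return 0;
    std::size_t start = cursor - 1;
    while (start > 0 && attachesToPrevious(text[start - 1], text[start]))
        --start;
    return start;
}
```

With the cursor after "MAN" + modifier, sequenceStart returns the index of "MAN", so both code points would be converted; for two independent symbols such as U+2623 U+267b it returns the index of the last one only. A complete solution would still need proper emoji property data, which is the missing piece mentioned above.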
(In reply to Justin L from comment #4) Just a random idea: can't we use the same code that WrtShell uses when it does its "step left"/"step right" magic, to identify what constitutes a single "character cell"?
(In reply to Justin L from comment #4) We might want to create a text cursor for the current view cursor [1], and use the text cursor to iterate over the positions, instead of iterating over the code points. [1] https://wiki.openoffice.org/wiki/Writer/API/Text_cursor
https://gerrit.libreoffice.org/c/core/+/107187 is a proof of concept using XTextCursor for Writer. It doesn't work for Calc/Draw/Math in its current form, because the implementation needs XTextCursor for the edit engines there (which ought to be possible, but I can't work on that any longer). Whoever wants to try to implement this, feel free to jump in and use the code as you like.
Created attachment 167829 [details] EmojiTest.odt: various emoji combinations - tried to be relatively comprehensive
Or maybe it would be better to make a Writer-local change, using SwCursor::LeftRight and CRSR_SKIP_CELLS.