Description:
Double-clicking a text cell to edit it may put the editing caret in an invalid "mid-character" position, so that typing corrupts the text. See reproduction steps.

Steps to Reproduce:
1. Install LibreOffice 7.4.0.3 on Windows.
2. Open the attached CSV. It contains 1 cell with 6 consecutive musical note characters.
3. In the import dialog, ensure the character set is set to UTF-8.
4. Notice that the cell preview displays the content incorrectly at this step.
5. Click OK.
6. Double-click in cell A1 anywhere between two of the note characters.
7. Press the "a" key.

Actual Results:
The character preceding the editing caret is replaced with 2 damaged characters and an "a" character between them. Additionally, the cell edit box above the spreadsheet will no longer match the cell contents.

Expected Results:
The "a" character should be inserted between the note characters where the editing caret indicates.

Reproducible: Always
User Profile Reset: Yes
OpenGL enabled: Yes

Additional Info:
Version: 7.4.0.3 (x64) / LibreOffice Community
Build ID: f85e47c08ddd19c015c0114a68350214f7066f5a
CPU threads: 8; OS: Windows 10.0 Build 19044; UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded
Created attachment 181960 [details] Test document
Forgot to mention: This does not happen if I press the left or right arrow keys to move the editing caret before step 7.
The sample file you attached has only one single character inside of it. Shouldn't it contain 6 characters? I opened it in a text editor and it only contains one musical note character. Could you check if this is the correct file?
I checked the file and it is correct. It should be 26 bytes: it contains the byte sequence "F0 9D 85 A0 F0" (U+1D160) six times, followed by "0D 0A" (CR LF).

I downloaded the test document from Bugzilla and opened it on Windows, and it behaves as described in the bug report.
Er, sorry, the repeated byte sequence is "F0 9D 85 A0".
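For anyone who wants to double-check the numbers above, here is a minimal sketch that encodes U+1D160 as UTF-8 and rebuilds the expected file content. The helper names `encodeUtf8` and `expectedFileContent` are made up for illustration and are not LibreOffice code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode scalar value as UTF-8.
std::string encodeUtf8(char32_t cp)
{
    // Reject values outside Unicode and the surrogate range.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // Supplementary planes (such as U+1D160) need four bytes.
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: six note characters followed by CR LF,
// as described for the attached CSV.
std::string expectedFileContent()
{
    std::string content;
    for (int i = 0; i < 6; ++i)
        content += encodeUtf8(U'\U0001D160');
    content += "\r\n";
    return content;
}
```

Six four-byte sequences plus the two line-ending bytes give the 26 bytes mentioned in the comment.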
[Automated Action] NeedInfo-To-Unconfirmed
(In reply to Eric Lasota from comment #5)
> Er sorry the repeated byte sequence is "F0 9D 85 A0"

I opened the CSV file in Okteta hex editor and confirmed this.

It turns out that in Kate text editor as well as LibreOffice Calc, the six note characters are displayed on top of each other! So if you hit End key and start hitting backspace, you will remove the notes one by one. In nano editor and Notepad++ the characters are displayed in sequence.

Bibisected with Linux 5.2 repo to
https://git.libreoffice.org/core/commit/975c833943bab627eb461457ab1df35744b291cd
upgrade harfbuzz version from 0.9.40 to 1.2.6

Not adding regression keyword as this is coming from a dependency.
The CSV import is not the issue here. Rather, mouse pointer selection in a Calc cell or on the Input bar is able to split the high/low surrogate pair of SMP glyphs.
Paste this string into a Calc cell; unlike the musical notes, these glyphs are available in DejaVu Sans:

U+1F060U+1F0A1U+1F060U+1F0A1U+1F060U+1F0A1U+1F060U+1F0A1

Position the text cursor at the end of the pasted text. Enter <Alt>+X and the 🁠 and the 🂡 glyphs will be toggled.

Use the mouse cursor to point-select on any glyph, in the sheet or in the Input bar. Type any keyboard character, "a" as in the OP.

The mouse click selection has split the multi-byte codepoint, and entering the character "breaks" the glyph.

As noted, cursor (<L,R>) movement correctly recognizes the SMP glyphs; just the mouse pointer selection is wrong.
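To illustrate what "splitting" means here, the following is a small sketch under the assumption that the text is held as UTF-16 code units, where each SMP code point occupies two units (a surrogate pair). The helper names `toSurrogatePair` and `countLoneSurrogates` are hypothetical. It shows that inserting a character at an index between the two halves leaves two unpaired surrogates, which would match the "2 damaged characters with an 'a' between them" from the original report.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encode a supplementary-plane code point
// (U+10000..U+10FFFF) as a UTF-16 surrogate pair.
std::u16string toSurrogatePair(char32_t cp)
{
    assert(cp >= 0x10000 && cp <= 0x10FFFF);
    const char32_t v = cp - 0x10000;
    std::u16string pair;
    pair += static_cast<char16_t>(0xD800 + (v >> 10));   // high (lead) surrogate
    pair += static_cast<char16_t>(0xDC00 + (v & 0x3FF)); // low (trail) surrogate
    return pair;
}

// Hypothetical helper: count surrogate code units that are not part
// of a well-formed high+low pair.
std::size_t countLoneSurrogates(const std::u16string& text)
{
    std::size_t lone = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t c = text[i];
        const bool isHigh = c >= 0xD800 && c <= 0xDBFF;
        const bool isLow = c >= 0xDC00 && c <= 0xDFFF;
        if (isHigh && i + 1 < text.size()
            && text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF) {
            ++i; // well-formed pair: skip its low surrogate
        } else if (isHigh || isLow) {
            ++lone; // unpaired half, typically rendered as a damaged glyph
        }
    }
    return lone;
}
```

For example, building a string from two U+1F060 tiles and calling `text.insert(1, 1, u'a')` places the "a" at code-unit index 1, inside the first pair; `countLoneSurrogates` then reports two unpaired halves, while inserting at index 2 reports none.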
@Julien, when you fixed a similar issue for sm for bug 102625 [1], is the Calc instance of the SMP glyphs missed here because the SMP glyphs are treated as an i18n COMPLEX script and no rtl::isSurrogate() test gets performed?

=-ref-=
[1] https://gerrit.libreoffice.org/c/core/+/93544
(In reply to V Stuart Foote from comment #10)
> @Julien, when you fixed similar for sm for bug 102625 [1] is the Calc
> instance of the SMP glyphs missed here bcz the SMP glyphs are treated as an
> i18n COMPLEX script and no rtl::isSurrogate() test gets performed?
>
> =-ref-=
> [1] https://gerrit.libreoffice.org/c/core/+/93544

The original patch in master might be a better reference, where Stephan B. had comments about use of the isSurrogate() test.
https://gerrit.libreoffice.org/c/core/+/93684
(In reply to V Stuart Foote from comment #10)
> @Julien, when you fixed similar for sm for bug 102625 [1] is the Calc
> instance of the SMP glyphs missed here bcz the SMP glyphs are treated as an
> i18n COMPLEX script and no rtl::isSurrogate() test gets performed?
>
> =-ref-=
> [1] https://gerrit.libreoffice.org/c/core/+/93544

On pc Debian x86-64 with master sources updated today and gtk3 rendering, here is what I tested:
- enter cell A1, then Ctrl-shift U 1F060 + Ctrl-shift U 1F060 + Ctrl-shift U 1F060 + Ctrl-shift U 1F060 to have 4 glyphs.
- click on the end of the cell
- Alt X
=> the last glyph disappears and is replaced with "U+1f0a1"

I added some traces on the if else block:
diff --git a/editeng/source/editeng/impedit2.cxx b/editeng/source/editeng/impedit2.cxx
index 4e87e36af5d3..b00a6b8b8f46 100644
--- a/editeng/source/editeng/impedit2.cxx
+++ b/editeng/source/editeng/impedit2.cxx
@@ -4044,6 +4044,7 @@ sal_Int32 ImpEditEngine::GetChar(
     sal_uInt16 nScriptType = GetI18NScriptType( aPaM );
     if ( nScriptType == i18n::ScriptType::COMPLEX )
     {
+        fprintf(stderr, "COMPLEX\n");
         uno::Reference < i18n::XBreakIterator > _xBI( ImplGetBreakIterator() );
         sal_Int32 nCount = 1;
         lang::Locale aLocale = GetLocale( aPaM );
@@ -4058,6 +4059,7 @@ sal_Int32 ImpEditEngine::GetChar(
     }
     else
     {
+        fprintf(stderr, "NOT COMPLEX\n");
         OUString aStr(pParaPortion->GetNode()->GetString());
         // tdf#102625: don't select middle of a pair of surrogates with mouse cursor
         if (rtl::isSurrogate(aStr[nChar]))

I got only NOT COMPLEX appearing on console logs.
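As a standalone illustration of the idea behind the "don't select middle of a pair of surrogates" comment in the diff, here is a minimal sketch that moves a caret index off the middle of a surrogate pair. It is not the editeng code: it uses `std::u16string` instead of `OUString`, the function names are hypothetical, and snapping to the position before the pair (rather than after it) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers: classify UTF-16 code units.
bool isHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
bool isLowSurrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Hypothetical helper: if the caret index falls between the high and
// low half of a surrogate pair, move it to just before the pair;
// otherwise return it unchanged.
std::size_t snapCaretToCodePoint(const std::u16string& text, std::size_t index)
{
    assert(index <= text.size());
    if (index > 0 && index < text.size()
        && isHighSurrogate(text[index - 1]) && isLowSurrogate(text[index])) {
        return index - 1;
    }
    return index;
}
```

With a string of two U+1F060 tiles (code units D83C DC60 D83C DC60), indices 1 and 3 are the mid-pair positions; the sketch maps them to 0 and 2 and leaves 0, 2 and 4 alone.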
(In reply to Julien Nabet from comment #12)
>
> I added some traces on the if else block:
> diff --git a/editeng/source/editeng/impedit2.cxx b/editeng/source/editeng/impedit2.cxx
> index 4e87e36af5d3..b00a6b8b8f46 100644
> --- a/editeng/source/editeng/impedit2.cxx
> +++ b/editeng/source/editeng/impedit2.cxx
> @@ -4044,6 +4044,7 @@ sal_Int32 ImpEditEngine::GetChar(
>      sal_uInt16 nScriptType = GetI18NScriptType( aPaM );
>      if ( nScriptType == i18n::ScriptType::COMPLEX )
>      {
> +        fprintf(stderr, "COMPLEX\n");
>          uno::Reference < i18n::XBreakIterator > _xBI( ImplGetBreakIterator() );
>          sal_Int32 nCount = 1;
>          lang::Locale aLocale = GetLocale( aPaM );
> @@ -4058,6 +4059,7 @@ sal_Int32 ImpEditEngine::GetChar(
>      }
>      else
>      {
> +        fprintf(stderr, "NOT COMPLEX\n");
>          OUString aStr(pParaPortion->GetNode()->GetString());
>          // tdf#102625: don't select middle of a pair of surrogates with mouse cursor
>          if (rtl::isSurrogate(aStr[nChar]))
>
> I got only NOT COMPLEX appearing on console logs.

OK, thanks for the quick check, it was just a thought. I wasn't even sure if Calc's edit engine calls use that GetString(). Just that the same split on mouse cursor selection of a multi-byte glyph had been occurring in the sm Formula editor's input box.
(In reply to Julien Nabet from comment #12)
> - click on the end of the cell
> - Alt X
> => the last glyph disappears and is replaced with "U+1f0a1"

Yes, I think that is correct. Going from glyph to Unicode value converts just the last character.

As compared to going from Unicode notation (i.e. U+1F060U+1F0A1U+1F060U+1F0A1U+1F060U+1F0A1U+1F060U+1F0A1), which is "hungry" and converts the full run back to a white space. It needed to do that to handle combining diacritics.
Dear Eric Lasota,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT
- Update the version field
- Reply via email (please reply directly on the bug tracker)
- Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword

Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug
Re-tested with latest version:

Version: 24.8.4.2 (X86_64) / LibreOffice Community
Build ID: bb3cfa12c7b1bf994ecc5649a80400d06cd71002
CPU threads: 12; OS: Windows 10 X86_64 (10.0 build 19045); UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded

Issue still occurs.