Description:
LibreOffice Calc's 'AutoFilter' popup menu treats combining diacritic letters (like U+0364, U+0366) and modifier letters (like U+1D49) as identical to their plain equivalents in the value list. For instance, in a column that contains both 'tuon' and 'tuͦn' (variant spellings of Middle High German for 'to do'), only one variant will be listed (see screenshot). As expected, filtering for the listed variant shows only those cells in which that variant appears verbatim, but not the other variant.

Steps to Reproduce:
1. Create a list of words in a column, containing words with a plain, regular letter and words with the corresponding combining diacritic letter or modifier letter in its place (e.g. tuon, tuͦn; guet, guͤt; vröude, vröudͤ). Make the first cell in the column some field identifier like 'mylist', 'example', or '123'.
2. Data > AutoFilter
3. OK, use first line as header
4. Click the dropdown arrow in the header of the column that contains our word list

Actual Results:
Only one variant of each word is listed in the value list: either the one with the plain letter (tuon, guet, vröude) or the one with the combining/modifier letter (tuͦn, guͤt, vröudͤ).

Expected Results:
Both variants are listed, i.e. AutoFilter doesn't treat superscript/modifier letters the same as their plain equivalents: all of tuon, tuͦn, guet, guͤt, vröude, vröudͤ are listed as values occurring in the selected column.

Reproducible: Always

User Profile Reset: No

Additional Info:
Screenshot: https://i.imgur.com/DTZhV6e.png
Version: 6.0.7.3
Build-ID: 1:6.0.7-0ubuntu0.18.04.5
CPU-Threads: 4; OS: Linux 4.15; UI-Render: Standard; VCL: kde4;
Localization schema: de-DE (de_DE.UTF-8); Calc: group
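To make the expected behaviour concrete, here is a minimal Python sketch (not LibreOffice code) that checks that the variant spellings are different code point sequences and that exact-match deduplication keeps all of them. The helper names `describe` and `distinct_values` are made up for this illustration, and only two of the word pairs from the report are used.

```python
import unicodedata

# Variant pairs from the report, written with escapes so the
# combining marks are visible in the source.
PAIRS = [
    ("tuon", "tu\u0366n"),  # U+0366 stands in for the plain 'o'
    ("guet", "gu\u0364t"),  # U+0364 stands in for the plain 'e'
]


def describe(text):
    """Return the Unicode character name of every code point in `text`."""
    return [unicodedata.name(ch) for ch in text]


def distinct_values(cells):
    """Deduplicate by exact code point sequence, keeping first-seen order."""
    return list(dict.fromkeys(cells))


# Flatten the pairs into one "column" of cell values.
cells = [word for pair in PAIRS for word in pair]
```

Calling `distinct_values(cells)` on this column keeps all four spellings, which is what the value list is expected to show; `describe("tu\u0366n")` names the third code point as a combining letter rather than a plain 'o'.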
That should have been 'vröudᵉ', not *vröudͤ.
Thank you for reporting the bug. Please attach a sample document, as this makes it easier for us to verify the bug. (Please note that the attachment will be public, remove any sensitive information before attaching it. See https://wiki.documentfoundation.org/QA/FAQ#How_can_I_eliminate_confidential_data_from_a_sample_document.3F for help on how to do so.) I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the requested document is provided.
Created attachment 152087
Example as requested by comment #2

Added example file as requested by Xisco Faulí in comment #2
Confirmed with
Version: 6.4.0.0.alpha0+
Build ID: 2812610f4f39ed5892da08864893c758325d1d39
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3;

but not reproducible in LO Version 4.1.0.0.alpha0+ (Build ID: efca6f15609322f62a35619619a6d5fe5c9bd5a)
Created attachment 153646
bisect result

Bibisected with bibisect-41max. The result is a range of 70 commits.
crash did happen with official nightly https://git.libreoffice.org/core/+log/4c4b3218b8595a9809ffade0cfd064f3d9335dff but does not happen with my local build, which was rebased to commit 0310c591eb5f4010de8f6d12298491d9c13634a3 I will test if this bug happens with the latest nightly.
Oops, I posted on the wrong bug report; please ignore the previous post.
006F ; [.213C.0020.0002] # LATIN SMALL LETTER O
0366 ; [.213C.0020.0004] # COMBINING LATIN SMALL LETTER O
1D49 ; [.2007.0020.0014] # MODIFIER LETTER SMALL E
0065 ; [.2007.0020.0002] # LATIN SMALL LETTER E

https://dencode.com/en/string/unicode-normalization
https://unicode.org/reports/tr10/#Main_Algorithm
http://www.unicode.org/Public/UCA/13.0.0/allkeys.txt
https://opengrok.libreoffice.org/xref/core/sc/source/core/data/global.cxx?r=3ac9f491#1045
https://opengrok.libreoffice.org/xref/core/offapi/com/sun/star/i18n/CollatorOptions.idl?r=944eb990#31
https://opengrok.libreoffice.org/xref/core/i18npool/source/collator/collator_unicode.cxx?r=b122a39c#411

I guess that the cause of the bug, whatever the appropriate implementation is, is that these letters differ only in TERTIARY weight, while the option sets the collator's strength to SECONDARY.
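The effect of the strength setting can be sketched in a few lines of Python. This is a simplified illustration of level-by-level key comparison in the spirit of UTS #10, not the ICU or LibreOffice implementation. The weight table contains only the four collation elements quoted above, and the names `sort_key` and `equal_at` are invented for the sketch.

```python
# Collation elements copied from the allkeys.txt excerpt quoted above:
# code point -> (primary, secondary, tertiary) weight.
WEIGHTS = {
    "\u006F": (0x213C, 0x0020, 0x0002),  # LATIN SMALL LETTER O
    "\u0366": (0x213C, 0x0020, 0x0004),  # COMBINING LATIN SMALL LETTER O
    "\u0065": (0x2007, 0x0020, 0x0002),  # LATIN SMALL LETTER E
    "\u1D49": (0x2007, 0x0020, 0x0014),  # MODIFIER LETTER SMALL E
}

PRIMARY, SECONDARY, TERTIARY = 1, 2, 3


def sort_key(text, strength):
    """Build a simplified key: all level-1 weights, then level 2, and so on,
    stopping after `strength` levels. A 0 separates the levels."""
    key = []
    for level in range(strength):
        key.extend(WEIGHTS[ch][level] for ch in text)
        key.append(0)  # level separator
    return tuple(key)


def equal_at(a, b, strength):
    """True if `a` and `b` are indistinguishable at the given strength."""
    return sort_key(a, strength) == sort_key(b, strength)
```

With these weights, `equal_at("\u006F", "\u0366", SECONDARY)` is true while the same comparison at `TERTIARY` is false, which matches the guess that a SECONDARY-strength collator cannot tell the variants apart. Whole words such as 'tuon' would need weights for the remaining letters, which are not quoted in this thread.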
The attached bibisect range is meaningless, as the IDs appear to be the commit IDs of the binary repository, not the source hashes.

raal: If you still have the bibisect-41max repo, would you please identify the source hashes of those commits?

Seems to be a duplicate of bug 123095, although different characters are affected there.
0028 ; [*0328.0020.0002] # LEFT PARENTHESIS
FF08 ; [*0328.0020.0003] # FULLWIDTH LEFT PARENTHESIS
Reproduced in:
Version: 7.3.0.0.alpha0+ / LibreOffice Community
Build ID: 94d552f94b427f884c004dba5d4619ecf729d605
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2021-06-18_13:30:27
Calc: threaded

I think this is serious because, as mentioned in Bug 123095, whatever subset you select in the value list, some values will never show.

In the example document, there are 20 rows of data. In the AutoFilter list, there are three values to choose from: tuon, guet, vröude. If ticked, these options show 3, 4 and 4 rows respectively: a total of 11 out of 20.
Confirmed on Windows as well, with a slightly different result: the AutoFilter value list shows only the 3 values "guͤt, tuon, vröude", which would filter in 2, 4 and 4 rows respectively.

Version: 7.0.6.2 (x64)
Build ID: 144abb84a525d8e30c9dbbefa69cbbf2d8d4ae3b
CPU threads: 8; OS: Windows 10.0 Build 19042; UI render: default; VCL: win
Locale: en-AU (en_AU); UI: en-US
Calc: threaded

and:

Version: 7.2.0.0.alpha1+ (x64) / LibreOffice Community
Build ID: aa9cb8e14749e7fb7a83b55a2bb095501f731a18
CPU threads: 8; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: en-AU (en_AU); UI: en-US
Calc: threaded
Andreas Heinisch committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/2e887e04c0008a4eb6cbf34050b6fa463a33599f tdf#125363, tdf#123095 - Use CaseTransliteration for autofilter It will be available in 7.5.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Andreas Heinisch committed a patch related to this issue. It has been pushed to "libreoffice-7-4": https://git.libreoffice.org/core/commit/1b1ad0e3d5988c5e16dabfaa40252a22dab517b7 tdf#125363, tdf#123095 - Use CaseTransliteration for autofilter It will be available in 7.4.3. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
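For testers, the behaviour suggested by the commit title can be approximated as follows. This is only a sketch under an assumption: that the value list is now deduplicated case-insensitively but otherwise by exact text, rather than by collator equality. The actual patch is C++ and is not reproduced here. The function name `autofilter_values` and the extra 'Tuon' cell are hypothetical and not taken from the attached example.

```python
def autofilter_values(cells):
    """Group cells case-insensitively, but otherwise by exact code points.

    The first spelling seen for each group is the one that is listed.
    """
    seen = {}
    for cell in cells:
        # casefold() removes case differences only; combining letters
        # such as U+0366 are left untouched, so they stay distinct.
        seen.setdefault(cell.casefold(), cell)
    return list(seen.values())


# Hypothetical column: 'Tuon' is added to show that case is still ignored.
column = ["tuon", "Tuon", "tu\u0366n", "guet", "gu\u0364t"]
values = autofilter_values(column)
```

Under that assumption, 'tuon' and 'Tuon' collapse into one entry, while the variants written with U+0366 and U+0364 get their own entries in the list.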
*** Bug 105314 has been marked as a duplicate of this bug. ***