Bug 153387 - UX: Sorting string not following Unicode / ASCII order by default
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Sorting
Reported: 2023-02-05 07:25 UTC by Franklin Weng
Modified: 2023-04-06 16:26 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
Demo ods file for data sorting (8.59 KB, application/vnd.oasis.opendocument.spreadsheet)
2023-02-05 07:25 UTC, Franklin Weng
Description Franklin Weng 2023-02-05 07:25:36 UTC
Description:
By default, data sorting in Calc (and in Writer as well; see the reference questions) does not follow Unicode/ASCII code point order, which I think would be an intuitive and reasonable default. If it does not follow code point order, what does it follow? I could not find any documentation explaining the default sorting algorithm or behavior. Does it depend on language/locale settings? Opening the file under the en_US.UTF-8 locale gave the same result.

See reference 2 for details. I'll attach the file here later.

It may not be a bug, but a user experience issue.

Reference: 
1. https://ask.libreoffice.org/t/calc/86667
2. https://ask.libreoffice.org/t/calc-data-sorting-not-following-ascii-unicode-order/87503
3. https://ask.libreoffice.org/t/writer-not-sort-following-ascii-code/84334

Steps to Reproduce:
1. Open the attached file
2. Data -> Sort Column B as ascending
3. The result does not follow ASCII/Unicode order

Actual Results:
The result does not follow ASCII/Unicode order

Expected Results:
- Following ASCII/Unicode order is a reasonable default behavior IMO
- If it is not the default, is there any document describing the default sorting algorithm?
- If not, is there any option to force ASCII/Unicode order?  (Entering every character in Tools - Options - Calc - Sort Lists is NOT a user-friendly solution IMO)


Reproducible: Always


User Profile Reset: No

Additional Info:
Earliest version I tested: Version 3.6.7.2 (Build ID: e183d5b)

Should be inherited from OO.o I think.
Comment 1 Franklin Weng 2023-02-05 07:25:58 UTC
Created attachment 185127 [details]
Demo ods file for data sorting
Comment 2 Franklin Weng 2023-02-05 08:12:48 UTC
At first I thought following the ASCII/Unicode order was a quite reasonable default for sorting string data. But on second thought, our locale is zh_TW.UTF-8, and the Unicode order of Chinese numerals does not follow their numeric order, so maybe this really is a language issue that can be customized with the Sort Lists. I just need a reasonable explanation of the default sorting algorithm, and a *user-friendly* way to make sorting follow the Unicode/ASCII order.
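
To illustrate the point about Chinese numerals, here is a minimal Python sketch (not LibreOffice code) that sorts the numerals one to ten by code point and prints each code point. The `rank` lookup table is a hypothetical stand-in for what a custom sort list does: it restores numeric order by position in a user-supplied list.

```python
# Chinese numerals 1..10 in numeric order.
numerals = list("一二三四五六七八九十")

# Plain code point order, which is what Python's default string sort uses.
by_codepoint = sorted(numerals)
for ch in by_codepoint:
    print(ch, f"U+{ord(ch):04X}")

# Hypothetical "sort list": rank each character by its position in the
# user-supplied list, then sort by that rank instead of by code point.
rank = {ch: i for i, ch in enumerate(numerals)}
by_value = sorted(by_codepoint, key=rank.__getitem__)
print("".join(by_value))
```

The printed code points show that code point order and numeric order differ for these characters, so pure Unicode order would not be a good default for every language.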
Comment 3 Eike Rathke 2023-02-09 20:39:07 UTC
For completeness, just adding what I gathered at Ask
https://ask.libreoffice.org/t/calc-data-sorting-not-following-ascii-unicode-order/87503/12

Sorting order is defined by the Unicode Collation Algorithm (UCA) https://unicode.org/reports/tr10/ and LibreOffice uses the ICU https://icu.unicode.org/ implementation. Collation details may even depend on locale. See ICU Collation Demo https://icu4c-demos.unicode.org/icu-bin/collation.html .

For zh-TW there is some specific tailoring for the different algorithms, see the zh_TW_*.txt files in i18npool/source/collator/data/
https://opengrok.libreoffice.org/search?project=core&full=&defs=&refs=&path=i18npool%2Fsource%2Fcollator%2Fdata%2F&hist=&type=&xrd=&nn=1&si=path&si=path
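
As a rough illustration of why UCA-style collation differs from code point order, the following Python sketch compares the two on a small word list. The function `toy_collation_key` is a hypothetical, heavily simplified three-level key (base letters, then accents, then case) loosely modeled on the UCA's comparison levels. It is not the ICU implementation and applies no locale tailoring; the real algorithm uses collation element tables.

```python
import unicodedata

def toy_collation_key(s):
    """Toy three-level sort key; an assumption-laden sketch, NOT ICU/UCA."""
    decomposed = unicodedata.normalize("NFD", s)
    # Level 1: base letters only, ignoring case and accents.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    primary = base.casefold()
    # Level 2: accents (combining marks) break ties between equal base letters.
    secondary = [ch for ch in decomposed if unicodedata.combining(ch)]
    # Level 3: case breaks remaining ties, lowercase before uppercase.
    tertiary = [ch.isupper() for ch in base]
    return (primary, secondary, tertiary)

words = ["banana", "Zebra", "apple", "Apple", "éclair", "zebra"]

by_codepoint = sorted(words)                  # raw code point order
by_toy = sorted(words, key=toy_collation_key) # multi-level, dictionary-like order

print(by_codepoint)  # ['Apple', 'Zebra', 'apple', 'banana', 'zebra', 'éclair']
print(by_toy)        # ['apple', 'Apple', 'banana', 'éclair', 'zebra', 'Zebra']
```

In code point order all uppercase ASCII letters sort before all lowercase ones and accented letters sort after "z", whereas the multi-level key keeps "apple"/"Apple" and "zebra"/"Zebra" adjacent. For actual ICU behavior, the ICU Collation Demo linked above is the authoritative check.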