Bug 50786 - EDITING: sort or natural sort of text should ignore spaces
Summary: EDITING: sort or natural sort of text should ignore spaces
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
(earliest affected)
3.5.3 release
Hardware: All All
: low enhancement
Assignee: Not Assigned
Depends on:
Blocks: Sorting
  Show dependency treegraph
Reported: 2012-06-06 06:21 UTC by Bert Van Vreckem
Modified: 2017-08-13 09:35 UTC (History)
0 users

See Also:
Crash report or crash signature:


Note You need to log in before you can comment on or make changes to this bug.
Description Bert Van Vreckem 2012-06-06 06:21:44 UTC
The sort algorithm (apparently) sorts on ascii values of the characters in text strings. Consequently, a space character is also considered to be a letter. However, a space should not have an influence on the sorting order and should be ignored.

If it is not possible to implement this in the standard sort function (as it would change the behaviour of a feature that is used quite a lot), then I hope it can at least be implemented in "natural sort".

An example: Flemish surnames often start with "De" or "Van de" and these names can be written either with or without spaces. LibreOffice Spreadsheet sorts these names incorrectly:


1stname	Surname
Bart	Debacker
Peter	De Decker
Mark	Demol
Rudy	De Wilde
Wim	Dewilde


1stname	Surname
Peter	De Decker
Rudy	De Wilde
Bart	Debacker
Mark	Demol
Wim	Dewilde


As a workaround, you add a column calculated from the the one to be sorted with spaces removed (e.g. =SUBSTITUTE(B2," ","") ). However, I find this rather kludgy.
Comment 1 Heiko Tietze 2015-10-26 09:46:40 UTC
Some more additions:

In #95192 the sorting of street numbers is discussed. To generalize I'd ask how to sort with multiple numbers. My suggestion is to use all until the first whitespace (or rather the first stop character)

Text 1/15
Text 2/11
Text 3/9
Text 4/3

With respect to the name there might be also the need to sort independent from the name affix. The German rules define that "von der Lippe" has to be collated at L - affixes "von" plus "der" are not part of the sort order unlike the Flemish "de". So why don't we trust the system settings of collation here? Guess there is something similar to Linux' LC_COLLATE under Windows and Mac Os. On the other hand, how should LibO Calc knows what kind of text is in the cells? Adding more switches makes the ugly dialog even worse. So we should rather let the user calculate per SUBSITUTE() or RIGHT() in order to sort properly.