A serious limitation of alphabetical indexes is that there is no option to ignore diacritics. This means that words beginning with A, Ā, À, Á etc. are indexed under DIFFERENT alphabetical delimiters! They should all be treated as starting with the letter "A" and placed under the same delimiter. See also https://en.wikipedia.org/wiki/Diacritic#Alphabetization_or_collation (Wikipedia). (This issue was raised in Bug 131315 but, on reflection, needs to be treated as a separate issue to prevent it getting "lost in the post".)
Created attachment 169913: Writer file showing indexing of diacriticals

The opening post was a bit misleading. Common diacriticals, such as acute and grave, seem to be indexed correctly. Other diacriticals may cause the word to be indexed under a new alphabetical heading. This is shown in the attached example Writer file.
not so sure...but possibly locale-dependent? https://opengrok.libreoffice.org/xref/core/i18npool/source/indexentry/indexentrysupplier.cxx?r=042033f1#109 https://opengrok.libreoffice.org/search?full=getIndexKey&path=%22%2Fcore%2Fi18npool%2Fsource%2Findexentry%2F%22&project=core
Perhaps a tick/check box could be added to "Edit Index > Type" with the following wording: "Ignore diacritic at start of entry". This would allow an entry that starts with a diacritic letter to be sequenced under the same alphabetical delimiter as if its first letter were non-diacritical.
Version: 7.1.5.2 / LibreOffice Community
Build ID: 85f04e9f809797b8199d13c421bd8a2b025d52b5
CPU threads: 2; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-GB (en_GB.UTF-8); UI: en-GB
Calc: threaded

So, to summarize the problem: in English-language indexes at least, common diacriticals, such as acute, grave and circumflex, ARE correctly indexed under the same alphabetical delimiter. Others, such as the macron (a bar above the letter), are not.

SUGGESTED BEHAVIOUR
Alphabetization needs to be corrected so that ALL diacritical variations of a letter are filed under the SAME ALPHABETICAL DELIMITER.
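To make the suggested behaviour concrete, here is a minimal sketch in Python. It is not how LibreOffice computes index keys; it only illustrates one possible approach, assuming that "ignoring diacritics" means decomposing the first character with Unicode NFD normalization and dropping the combining marks. The function names `index_delimiter` and `group_entries` are made up for this illustration.

```python
import unicodedata
from collections import defaultdict


def index_delimiter(entry: str) -> str:
    """Return the heading letter for an index entry, ignoring diacritics.

    Hypothetical helper for illustration only, not a LibreOffice API.
    """
    if not entry:
        return ""
    # NFD splits a precomposed letter such as "Ā" into "A" + combining macron.
    decomposed = unicodedata.normalize("NFD", entry[0])
    # Drop the combining marks so that only the base letter remains.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.upper()


def group_entries(entries):
    """Group entries under their diacritic-insensitive delimiter."""
    groups = defaultdict(list)
    for entry in entries:
        groups[index_delimiter(entry)].append(entry)
    return dict(groups)


if __name__ == "__main__":
    for letter, words in group_entries(["Apple", "Āśrama", "Ángel", "Bee"]).items():
        print(letter, words)
```

With this approach the macron, acute and grave variants all land under "A". Letters that have no canonical decomposition, such as "Ø", would still get their own heading, and whether that is desirable probably depends on the index language.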
And should there be an option (in the Edit Index dialog) to ignore diacritics altogether when alphabetizing?
Reproduced by inserting a new alphabetical index in the document with English as the index language (as opposed to, say, Hebrew). Already seen at the last36onmaster commit tag in the linux-43all repo. The oldest commit crashes when opening the index dialog, so I can't test it there.

Arch Linux 64-bit, X11
Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: e32dfaf15563372ffae6e0da53998e20068ebf81
CPU threads: 8; OS: Linux 6.2; UI render: default; VCL: kf5 (cairo+xcb)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: threaded
Built on 1 March 2023