Bug 158127 - INDEX should use en dash (not hyphen) for number ranges
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: difficultyMedium, easyHack, skillCpp
Depends on:
Blocks: TableofContents-Indexes Authors
Reported: 2023-11-09 10:16 UTC by R. Green
Modified: 2024-01-31 09:09 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments

Description R. Green 2023-11-09 10:16:02 UTC
Version: 7.5.4.2 (X86_64) / LibreOffice Community
Build ID: 36ccfdc35048b057fd9854c757a8b67ec53977b6
CPU threads: 2; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-GB (en_GB.UTF-8); UI: en-GB
Calc: threaded

Number ranges are ALWAYS written using dashes, e.g. 23–29, NOT hyphens (i.e. NOT 23-29).

Unfortunately, indexes are generated in LO Writer using hyphens rather than the correct em dashes.

So, hyphens in index number ranges need to be replaced with em dashes.
Comment 1 R. Green 2023-11-24 09:51:52 UTC
Big oops! That should have been EN (repeat EN) dashes NOT em dashes.
Comment 2 Dieter 2023-11-26 12:32:52 UTC
As far as I can see, "ALWAYS" is not true. Wikipedia says, for example, that APA style uses an en dash, while AMA style uses a hyphen: https://en.wikipedia.org/wiki/Dash

So perhaps there should be an option in the index dialog. The current option "Combine with -" is too vague. Offering the options "Combine with hyphen" and "Combine with en dash" would be an enhancement.

cc: Design-Team
Comment 3 Heiko Tietze 2023-11-27 11:08:57 UTC
A quick and dirty solution would be to change "aNumStr += "-";" in sw/source/core/doc/doctxm.cxx. But I like the idea of an option, which should be offered in the ToC dialog as a dropdown list (I cannot think of a list separator other than dashes) instead of "Combine with -".

I wonder whether the file format imposes any restriction and what MSO makes of those documents. If I manually replace the dash, it is read correctly in both Writer and MSO (though of course it is replaced again on update).
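
To illustrate the idea of a selectable separator, here is a minimal standalone sketch. It is not the actual doctxm.cxx code: the function name combinePages, its parameters, and the constants kHyphen and kEnDash are hypothetical, and it assumes the page numbers arrive sorted and de-duplicated. The point is only that the range separator becomes a parameter instead of a hard-coded "-".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical constants for illustration; UTF-8 encoded.
const std::string kHyphen = "-";             // U+002D HYPHEN-MINUS
const std::string kEnDash = "\xE2\x80\x93";  // U+2013 EN DASH

// Hypothetical helper, not LibreOffice code: joins sorted, de-duplicated page
// numbers and collapses each consecutive run into "first<rangeSep>last".
std::string combinePages(const std::vector<int>& pages,
                         const std::string& rangeSep,
                         const std::string& listSep = ", ")
{
    std::string out;
    std::size_t i = 0;
    while (i < pages.size())
    {
        // Advance j to the last page of the consecutive run starting at i.
        std::size_t j = i;
        while (j + 1 < pages.size() && pages[j + 1] == pages[j] + 1)
            ++j;

        if (!out.empty())
            out += listSep;
        out += std::to_string(pages[i]);
        if (j > i)
        {
            // Plays the role of the hard-coded separator append quoted above.
            out += rangeSep;
            out += std::to_string(pages[j]);
        }
        i = j + 1;
    }
    return out;
}
```

Calling combinePages with kEnDash instead of kHyphen would be the only difference between the two dropdown choices in this sketch; how the choice is stored in the file format is left open here, as it is in the comment above.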
Comment 4 V Stuart Foote 2023-11-27 13:16:01 UTC
Another facet is localization. The TOC/Index generator (core/tox and header) seems to have additional TOC/Index structure for CJK and CTL nodes.

Rather than just the appended U+002D HYPHEN-MINUS as U+2013 EN DASH what could other locales require?
Comment 5 Heiko Tietze 2023-11-27 13:22:15 UTC
(In reply to V Stuart Foote from comment #4)
> Rather than just the appended U+002D HYPHEN-MINUS as U+2013 EN DASH what
> could other locales require?

Wikipedia lists four types: en dash, em dash, horizontal bar, and figure dash; adding the U+002D hyphen makes five. I can also imagine running text: <1> "to" <2> (localized, of course).
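
As a rough sketch of what such a list of choices could look like in code, the five candidates can be modelled as an enum mapped to Unicode code points. Everything named here (RangeSeparator, codePoint, toUtf8, formatRange, formatRangeWord) is hypothetical and not part of LibreOffice; the code points themselves are the standard Unicode values.

```cpp
#include <string>

// Hypothetical enum covering the five separators listed above.
enum class RangeSeparator { HyphenMinus, EnDash, EmDash, HorizontalBar, FigureDash };

// Maps each choice to its Unicode code point.
inline char32_t codePoint(RangeSeparator sep)
{
    switch (sep)
    {
        case RangeSeparator::HyphenMinus:   return 0x002D;
        case RangeSeparator::EnDash:        return 0x2013;
        case RangeSeparator::EmDash:        return 0x2014;
        case RangeSeparator::HorizontalBar: return 0x2015;
        case RangeSeparator::FigureDash:    return 0x2012;
    }
    return 0x002D; // fallback: plain hyphen-minus
}

// Encodes one code point from the Basic Multilingual Plane as UTF-8.
inline std::string toUtf8(char32_t cp)
{
    std::string s;
    if (cp < 0x80)
    {
        s += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
        s += static_cast<char>(0xC0 | (cp >> 6));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        s += static_cast<char>(0xE0 | (cp >> 12));
        s += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return s;
}

// Formats a page range with one of the dash-like separators.
inline std::string formatRange(int first, int last, RangeSeparator sep)
{
    return std::to_string(first) + toUtf8(codePoint(sep)) + std::to_string(last);
}

// Running-text variant: the joining word would come from localized resources.
inline std::string formatRangeWord(int first, int last, const std::string& word)
{
    return std::to_string(first) + " " + word + " " + std::to_string(last);
}
```

For example, formatRange(23, 29, RangeSeparator::HyphenMinus) yields "23-29", while formatRangeWord(23, 29, "to") yields "23 to 29"; in a real implementation the word would have to be looked up per locale rather than passed as a literal.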
Comment 6 Heiko Tietze 2024-01-31 09:09:20 UTC
No further input, let's implement.