Version: 7.5.4.2 (X86_64) / LibreOffice Community
Build ID: 36ccfdc35048b057fd9854c757a8b67ec53977b6
CPU threads: 2; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-GB (en_GB.UTF-8); UI: en-GB
Calc: threaded

Number ranges are ALWAYS written by using dashes, e.g. 23–29, NOT hyphens (i.e. NOT 23-29). Unfortunately, indexes are generated in LO Writer using hyphens rather than the correct em dashes. So, hyphens in index number ranges need to be replaced with em dashes.
Big oops! That should have been EN (repeat EN) dashes NOT em dashes.
As far as I can see, "ALWAYS" is not true. Wikipedia says, for example, that APA style uses the en dash while AMA style uses the hyphen: https://en.wikipedia.org/wiki/Dash So perhaps there should be an option in the index dialog. The option "Combine with -" is too vague. Having the two options "Combine with hyphen" and "Combine with en-dash" would be an enhancement. cc: Design-Team
A quick and dirty solution would be to change "aNumStr += "-";" in sw/source/core/doc/doctxm.cxx. But I like the idea of an option, which should be offered in the ToC dialog as a dropdown list (I cannot think of any list separators other than dashes) instead of "Combine with -". I wonder whether the file format imposes any restriction, and what MSO makes of such documents. If I replace the dash manually, the document is read correctly in both Writer and MSO (though the replacement is of course overwritten on update).
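To illustrate what a configurable separator would mean, here is a minimal standalone sketch. It is not the code in doctxm.cxx: the function name formatPageNumbers, its parameters, and the rule that any run of consecutive pages is collapsed are all assumptions made for illustration. The only point is that the hard-coded "-" becomes a parameter.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; not LibreOffice code.
// Takes a sorted list of page numbers and joins each run of consecutive
// pages with rangeSep, separating the resulting entries with ", ".
std::string formatPageNumbers(const std::vector<int>& pages,
                              const std::string& rangeSep)
{
    std::string out;
    std::size_t i = 0;
    while (i < pages.size())
    {
        // Find the last index j of the consecutive run starting at i.
        std::size_t j = i;
        while (j + 1 < pages.size() && pages[j + 1] == pages[j] + 1)
            ++j;

        if (!out.empty())
            out += ", ";
        out += std::to_string(pages[i]);
        if (j > i)
        {
            // This is where a fixed "-" would otherwise be appended.
            out += rangeSep;
            out += std::to_string(pages[j]);
        }
        i = j + 1;
    }
    return out;
}
```

Called with the pages 3, 23 to 29 and 41, passing "-" as rangeSep gives the current style of output, while passing the UTF-8 bytes of U+2013 ("\xE2\x80\x93") gives the en dash variant requested in this report.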
Another facet is localization. The TOC/Index generator (core/tox and header) seems to have additional TOC/Index structure for CJK and CTL nodes. Rather than just replacing the appended U+002D HYPHEN-MINUS with U+2013 EN DASH, what could other locales require?
(In reply to V Stuart Foote from comment #4)
> Rather than just the appended U+002D HYPHEN-MINUS as U+2013 EN DASH what
> could other locales require?

Wikipedia lists four types: En dash, Em dash, Horizontal bar, Figure dash; plus the U+002D hyphen makes it five. I can also imagine running text such as <1> "to" <2> (localized, of course).
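For reference, the five candidates can be written down as a small lookup. This is only a sketch: the enum RangeSeparator and the function toUtf8 are made-up names, not anything in the LibreOffice code base. The code points are the standard Unicode assignments for these characters.

```cpp
#include <string>

// Hypothetical enumeration of the five candidate separators listed above.
enum class RangeSeparator
{
    HyphenMinus,   // U+002D
    FigureDash,    // U+2012
    EnDash,        // U+2013
    EmDash,        // U+2014
    HorizontalBar  // U+2015
};

// Returns the UTF-8 byte sequence for the chosen separator.
std::string toUtf8(RangeSeparator sep)
{
    switch (sep)
    {
        case RangeSeparator::HyphenMinus:   return "-";
        case RangeSeparator::FigureDash:    return "\xE2\x80\x92";
        case RangeSeparator::EnDash:        return "\xE2\x80\x93";
        case RangeSeparator::EmDash:        return "\xE2\x80\x94";
        case RangeSeparator::HorizontalBar: return "\xE2\x80\x95";
    }
    return "-"; // Fallback for values outside the enumeration.
}
```

A localized word such as "to" would not fit into a list like this, so a dropdown would probably need a free-text entry as well if that case is to be supported.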
No further input, let's implement.
Working on implementing this.
Awesome, thanks Benjamin. Can't wait to see this added. :)

- - -

On Comment #3 and Comment #5:

In this case, between number ranges, all we need is the:

- - = U+002D = HYPHEN-MINUS
--- This is the one on your keyboard.
- – = U+2013 = EN DASH
--- This is the typographically correct choice.

- - -

Technical Note on Dashes:

There are quite a few other "dash-like" characters in Unicode. But all of those aren't used in this specific case... and/or come with serious side effects, like:

- Missing in many fonts.
- Broken Text-to-Speech.

I've written about this extensively over the years. For example:

- https://www.reddit.com/r/libreoffice/comments/wxp7ps/make_it_look_beautiful/ilwpn34/
--- See my "Tip #5: Use the Proper Dashes".
--- This covers the most common 3/4 types.

- https://www.reddit.com/r/PubTips/comments/lvfad3/pubq_quick_question_about_the_em_dash/gpqmen9/
--- "Dash/Hyphen Basics"
--- More proper use-cases.

- https://www.reddit.com/r/writing/comments/9q1jzi/punctuation_is_important_too/e88105a/?context=3
--- Covering some Text-to-Speech issues.

- https://www.mobileread.com/forums/showthread.php?p=3952918#post3952918
--- Covering HORIZONTAL BAR / "quotation dash" / U+2015.
--- Some languages use this in the beginning of dialogue/QUOTATIONS, not number ranges.

- https://www.reddit.com/r/libreoffice/comments/1jk0qa5/how_to_replace_with_em_dash/mk15yqw/
--- Covering dozens of extremely obscure "symbols that look like lines".