Bug 136182 - LibreOffice does not properly convert -- to em-dash by default
Status: RESOLVED WONTFIX
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 6.4.0.3 release (earliest affected)
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: needsUXEval
Depends on:
Blocks: AutoCorrect-Complete
Reported: 2020-08-27 14:39 UTC by brightwanderer
Modified: 2020-12-11 13:51 UTC
Description brightwanderer 2020-08-27 14:39:15 UTC
Description:
When reporting interrupted speech (either in fictional dialogue or transcribed conversations), it is standard to use an em-dash as follows:

"You can't—"

"I assure you I can."

In this context, by default LibreOffice doesn't convert -- to em-dash, because the default replacement table entry expects a space before the dashes. To get it to work, the entry .*-- or .*--.* has to be added manually to the replacement table.
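
The effect of the two wildcard entries mentioned above can be illustrated with a small sketch. This is a toy model only, not LibreOffice's actual matching code: it assumes a leading or trailing `.*` means "any characters within the same word", and the helper names `wildcard_to_regex` and `apply_entry` are hypothetical.

```python
import re

EM_DASH = "\u2014"


def wildcard_to_regex(entry: str) -> re.Pattern:
    """Translate a toy '.*'-style table entry into a regex over one word.

    Assumption: a leading/trailing '.*' allows any non-space characters
    before/after the literal part of the entry.
    """
    leading = entry.startswith(".*")
    trailing = entry.endswith(".*") and len(entry) > 2
    core = entry[2:] if leading else entry
    core = core[:-2] if trailing else core
    prefix = r"(\S*)" if leading else r"()"
    suffix = r"(\S*)" if trailing else r"()"
    return re.compile("^" + prefix + re.escape(core) + suffix + "$")


def apply_entry(word: str, entry: str, replacement: str) -> str:
    """Replace only the literal part of the entry, keeping the rest of the word."""
    m = wildcard_to_regex(entry).match(word)
    if m is None:
        return word
    return m.group(1) + replacement + m.group(2)


# ".*--" only fires when the word ends in "--".
print(apply_entry("So--", ".*--", EM_DASH))       # So + U+2014
print(apply_entry("What--?", ".*--", EM_DASH))    # unchanged: What--?
# ".*--.*" also allows trailing characters such as "?".
print(apply_entry("What--?", ".*--.*", EM_DASH))  # What + U+2014 + ?
```

Under these assumptions, the `.*--.*` form is the broader of the two, since it still matches when punctuation follows the dashes.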

Steps to Reproduce:
1. Type any line of dialogue, with or without quotation marks, that ends with a trailing em-dash to indicate interruption, e.g. "What--?" or What--? or "So--" or So--


Actual Results:
-- did not convert to em-dash

Expected Results:
-- should convert to em-dash. The user should not have to manually add the entry with wildcards to the replacement table.


Reproducible: Always


User Profile Reset: Yes


OpenGL enabled: Yes

Additional Info:
Version: 7.0.0.3 (x64)
Build ID: 8061b3e9204bef6b321a21033174034a5e2ea88e
CPU threads: 12; OS: Windows 10.0 Build 18363; UI render: Skia/Vulkan; VCL: win
Locale: en-GB (en_GB); UI: en-GB
Calc: CL
Comment 1 BogdanB 2020-09-13 16:47:37 UTC
It is working like this:
https://help.libreoffice.org/7.0/ro/text/shared/01/06040100.html?&DbPAR=WRITER&System=UNIX

You need letter, minus, minus, letter, and then it will work.

For example, type this:
You can't--I assure you -> it will work
Comment 2 brightwanderer 2020-09-14 11:27:37 UTC
Yes, it is working for the use-case where the em-dash is between two letters with no spacing (which is the standard for AP style).

However, that's not the only use-case for em-dash. The one I've mentioned in my original bug report (indicating an interruption of dialogue or narrative) is also a standard use of em-dash, predominantly in fiction or transcribed speech (e.g. Chicago Manual of Style). Other word processors will always automatically convert double hyphens to em-dash, regardless of whether they are preceded/followed by a letter or numeral, an end quotation mark or other punctuation, a space, or a new line.

So what I'm saying is not that it's a bug in the sense of the code performing incorrectly, it's a bug in that the defaults are only set up to account for one way of using em-dash in English, and this is unusual enough compared to e.g. Google Docs or Scrivener to stand out, particularly to anyone who writes fiction or does transcription work.
Comment 3 BogdanB 2020-09-14 11:51:48 UTC
OK, the UX team will analyze your bug very soon.
Comment 4 V Stuart Foote 2020-09-17 20:32:55 UTC
No, we have to watch out for excessive changes to the auto-correct table, especially regex globals like this.

Adding a default auto-correction of ".*--.*" to U+2013 (–) would disrupt the far more common "-->" correction to U+2192 (→)

Users are free to make their own adjustments to the auto-correct table.

IMHO => WF
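
The overlap described in this comment can be sketched as follows. This is a toy model under stated assumptions, not the real auto-correct engine: the wildcard entry is approximated by a Python regex, and the `wildcard_first` flag is a hypothetical stand-in for whatever priority the real table would apply.

```python
import re

ARROW = "\u2192"    # RIGHTWARDS ARROW, target of the existing "-->" entry
EN_DASH = "\u2013"  # EN DASH, the target named in the comment

# Exact entry: the whole typed token must equal "-->".
exact_entries = {"-->": ARROW}

# Proposed wildcard entry ".*--.*", modelled as: anything, "--", anything.
wildcard = re.compile(r"^(\S*?)--(\S*)$")


def correct(token: str, wildcard_first: bool) -> str:
    """Apply the toy table; the winner depends on the assumed rule order."""

    def try_wildcard(t):
        m = wildcard.match(t)
        return m.group(1) + EN_DASH + m.group(2) if m else None

    def try_exact(t):
        return exact_entries.get(t)

    rules = (try_wildcard, try_exact) if wildcard_first else (try_exact, try_wildcard)
    for rule in rules:
        result = rule(token)
        if result is not None:
            return result
    return token


# Both rules match "-->", so the result depends on which one is tried first.
print(correct("-->", wildcard_first=False))  # U+2192
print(correct("-->", wildcard_first=True))   # U+2013 followed by ">"
```

The point of the sketch is only that the two entries compete for the same input. How LibreOffice would actually resolve that competition is not modelled here.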
Comment 5 lomacar 2020-09-18 19:41:37 UTC
What you are showing is an en-dash, normally em-dash is used for such things. I think a triple dash --- could be automatically converted to em-dash, but as already stated, you can easily set that up yourself. The current behaviour in Writer is to convert -- to em-dash between letters and to en-dash between numbers.
Comment 6 V Stuart Foote 2020-09-18 21:02:20 UTC
(In reply to lomacar from comment #5)
> What you are showing is an en-dash, normally em-dash is used for such
> things. I think a triple dash --- could be automatically converted to
> em-dash, but as already stated, you can easily set that up yourself. The
> current behaviour in Writer is to convert -- to em-dash between letters and
> to en-dash between numbers.

Not sure that is correct, rather there is no difference in the "--" replacement as 'en-dash' (U+2013) when used between characters or numbers. I just verified that is true with current master in Writer, Calc and Impress.

Looking in source, can't find a use of the U+2014 'em-dash' for auto-correction of two dashes. We do pick up em-dash in a few locales for edit-shell. 

And otherwise provide an 'emoji' style "::" auto-correction, i.e. entry of ":---:" becomes '—' (U+2014)
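
Since the thread mixes up the en-dash and the em-dash, the code points named in the comments can be checked with Python's standard `unicodedata` module. The helper name `describe` is only illustrative.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in ("\u2013", "\u2014", "\u2192"):
    print(describe(ch))
# U+2013 EN DASH
# U+2014 EM DASH
# U+2192 RIGHTWARDS ARROW
```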
Comment 7 Heiko Tietze 2020-12-11 13:51:16 UTC
(In reply to V Stuart Foote from comment #4)
> IMHO => WF

Agreed. Have also seen many reports where auto replacement was considered a bug, so we should be careful with .* in general.