Bug 154766 - management of ellipsis variants
Summary: management of ellipsis variants
Status: RESOLVED WONTFIX
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): unspecified
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: needsUXEval
Depends on:
Blocks:
 
Reported: 2023-04-11 22:01 UTC by toddwarner
Modified: 2023-04-17 18:19 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Description toddwarner 2023-04-11 22:01:34 UTC
An ellipsis appears in two forms in papers, publications, and manuscripts.

1. …
2. . . .

Each is treated as a single word and is not breakable. Or, at least, they should be treated as such. Alas, I don't know of a single wordprocessor on the planet that manages the second form well, if at all.

The first form is embraced by the APA Style, for example, because that is the style guide used to develop news copy (where space is at a premium). The three dots spaced out is embraced by everyone else: MLA Style Guide, Chicago Manual of Style, etc.

The second form is challenging. To duplicate it, writers often just add spaces between the periods. But then, if the ellipsis needs to be wordwrapped, the writer may find ellipses split onto two lines. It's a single unbreakable word, remember. And so, what many writers do instead is to insert non-breaking spaces in between.

This is not ideal. A typical novel-length manuscript to be submitted for publication is expected to be formatted following the Chicago Manual of Style. Now, one could do a document-wide find-and-replace with the APA-styled ellipsis, but again . . . not ideal. And that combination of characters, again, is not treated as a single word (word counts, etc.).

I have also reached out to the Unicode community. They said they _will not_ add the spaced-out ellipsis into the character set because they pick one variant of anything, and that is that.

And so here we are. For most publications and manuscripts, we are forced to use an old-school work-around.

- - -

So, why this bugzilla? Because a LibreOffice community member suggested that I submit the request after I griped about it on Mastodon. LibreOffice could lead the charge on plugging this little hole in usability. I mean, MS hasn't tackled it. The difficult bit is determining the correct behavior for supporting the most common form of an ellipsis. I couldn't really tell you. I look to you gurus of wordprocessing usability. It would also have to be somehow portable when exported.

I don't know what the answer is, but maybe this will strike up a discussion. Better yet, maybe someone can leverage the Unicode community to see the light and add the character. :)
Comment 1 V Stuart Foote 2023-04-14 15:19:09 UTC
LibreOffice provides an Autocorrect replacement for entry of "..." with the Unicode U+2026 HORIZONTAL ELLIPSIS glyph that will follow the font handling.

You can delete that and assign a 'New' replacement for ":...:" to enter any ellipsis format of your choice, perhaps using NBSP Unicode. 
E.g. "U+002eU+00a0U+002eU+00a0U+002eU+00a0" (convert with <Alt>+x and paste)
or "U+002dU+00a0U+002dU+00a0U+002dU+00a0"
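As a rough illustration of the two sequences above (plain Python, not LibreOffice code; the variable names are made up for this sketch), the same strings can be assembled from their code points:

```python
# Build the replacement strings from the code points quoted above.
# All names here are illustrative; nothing is a LibreOffice API.
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# U+002E U+00A0 three times, exactly as quoted (note the trailing NBSP).
spaced_dots = (chr(0x002E) + NBSP) * 3
# U+002D U+00A0 three times.
spaced_dashes = (chr(0x002D) + NBSP) * 3

# A variant without the trailing NBSP, i.e. .[NBSP].[NBSP].
spaced_ellipsis = NBSP.join("...")

# List the code points to check what will be pasted into Autocorrect.
print([f"U+{ord(c):04X}" for c in spaced_dots])
```

Whether the trailing NBSP is wanted is a matter of preference; the variant without it leaves the following space breakable.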

Your alternative to use a single glyph would be in finding an OpenType font offering the U+2026 horizontal ellipsis with stylistic alternate. LibreOffice supports use of stylistic alternates as well from the Character dialog.


IMHO => WFM
Comment 2 V Stuart Foote 2023-04-14 15:27:41 UTC
Oh, and in addition to the ":...:" emoji notation, there is also the ".*..." entry that also replaces the periods as single U+2026 glyph. That might be the better one to tweak.
Comment 3 V Stuart Foote 2023-04-14 15:32:46 UTC
(In reply to V Stuart Foote from comment #2)
> Oh, and in addition to the ":...:" emoji notation, there is also the ".*..."
> entry that also replaces the periods as single U+2026 glyph. That might be
> the better one to tweak.

So the emoji notation is actually ":.:", not ":...:", and the general replacement is "..." and either can be modified to render the horizontal ellipsis needed.
Comment 4 toddwarner 2023-04-14 16:19:40 UTC
Those workarounds are certainly helpful and seem (without me testing them) more robust than the usual workarounds everyone already does. But they are still workarounds. They are techniques that today represent what we can all agree to call the _Thoughts and Prayers_ approach. I.e., an approach that is useful only because a real solution doesn't yet exist. I kid . . . a little. :)

But this Bugzilla is not an RFE for improved workaround documentation (though that would be nice as well). It's an RFE to explore a solution for normal, non-technical people. Something more seamless. I.e., hide all that behind some better UI? Ensure the grammar checker sees an ellipsis and not three dots? Etc. etc.

An aside, because it is only worth an aside: The glyph. Yes, I have contacted Monotype to add this widely used variant of the ellipsis to the Times New Roman typeface (a manuscript submission requirement for many/most publishing houses). Still, I am not holding my breath there, either. It should be noted that publishing houses are well aware of this technical wart, so often, they are forgiving if you slap in APA ellipses throughout your document instead.

So far, I have reached out to you guys, the Unicode community (hard "No" from them: "Against Policy" . . . Wuh?, I mean we have different dashes. Grr.), and Monotype ("Sorry, we are too busy cashing OS vendor checks to actually work on anything."). I haven't reached out to Adobe (Courier font—still expected for screenplays, etc.)

Look, I know this is not a clear, simple ask, but I figured some clever people are hacking away on LibreOffice who may see this as an interesting challenge. But, yeah, a great workaround doc would be awesome as well. ;) But what an interesting feature that would make LibreOffice that much more interesting to the writing community. Plus, some wordprocessor vendor has to be first. Right?

Cheers. -t

For further reading, I just stumbled on an article on this topic:
https://cmosshoptalk.com/2019/07/30/dot-dot-dot-a-closer-look-at-the-ellipsis/
Comment 5 V Stuart Foote 2023-04-14 16:55:43 UTC
Not seeing an issue. U+2026 with OpenType stylistic alternative for the opened spacing of the ellipsis is already supported--UI is the Character dialog. 

Likewise supported are compound character runs with NBSP (U+00a0) or NNBSP (U+202f)--UI there is via the Autocorrection dialog (the emoji :: notation, or the keyboard sequence .*...).

If the Unicode folks don't want to assign an SMP code point for an alternate glyph to U+2026, keep after them.

Not "workarounds", we are already doing enough.

Sorry.
Comment 6 toddwarner 2023-04-14 18:54:39 UTC
Note: This bugzilla is not asking how to insert a U+2026. That's not the issue. It's that the U+2026 is an APA-styled ellipsis (three dots, no spacing between). Most other styles require an ellipsis formatted as three dots, spaced between. I hope that wasn't confused. Anyway . . .

Can someone comment who has written for traditional publication, please? Maybe that is no one. Dunno.

What you shared is 100% a workaround. Well, except for this, which doesn't even rise to that level: "Not seeing an issue. U+2026 with OpenType stylistic alternative for the opened spacing of the ellipsis is already supported--UI is the Character dialog." That only works if you use a typeface that will support it (none do that I know of, and not Times New Roman) and only if you know how to trigger it. 99.999% of the users do not.

I have long had LibreOffice autocorrect to .[NBSP].[NBSP].. That is definitely a workaround and not an end solution. That is also what everyone largely does, or they just use three spaced-out dots and individually deal with the resulting word wrap and formatting issues. This is fine-ish. But it is a brute-force, non-ideal solution. Plus, it breaks the grammar-checker and other things since LibreOffice doesn't see an ellipsis.

Whatever. My RFE was a stab in the dark. I expected this response pretty much since no one from the word-processing world has resolved it. It'll take the Unicode community to make it a reality, I am sure.

In the meantime, I hope I triggered someone to think about it.

Cheers. -t

P.S. The two dashes '--' that you used above instead of an emdash '—' is also no longer a "workaround" if you are publishing professionally. An actual emdash is expected. Thank goodness that's available. Maybe someday the Chicago/MLA/etc-style ellipsis will be as well.
Comment 7 V Stuart Foote 2023-04-14 19:05:51 UTC
(In reply to toddwarner from comment #6)
> Note: This bugzilla is not asking how to insert a U+2026. That's not the
> issue. It's that the U+2026 is an APA-styled ellipsis (three dots, no
> spacing between). Most other styles require an ellipsis formatted as three
> dots, spaced between. I hope that wasn't confused. Anyway . . .
> 

Hmm, APA calls for 3 dots, or 4 dots when ending a sentence. So technically the Unicode U+2026 HORIZONTAL ELLIPSIS should not be used at all ;-) Just remove, or modify, the autocorrect substitution to meet your preference.

Other CSL styles that call for spaces allow either NBSP or NNBSP between the dots.

Point is LibreOffice already supports all those variations, and you've made no case for, nor justified, a workflow that is not supported.

IMHO => NAB
Comment 8 toddwarner 2023-04-14 20:02:48 UTC
(Note, I didn't file a bug. I filed an RFE.) Anyway . . .

I am just adding this bit to clear things up though I see this as being dismissed, but someone may find it educational someday.

1. I keep typoing. Grr. AP-style (Associated Press) is what I meant. That's the style guide that drove the 3-dots, no-spacing-between formatting, even though the other is more common. Coincidentally, APA-style (largely only used by the shrink community) also adopted that style. Regardless . . .

2. I see how you're fundamentally not understanding this now. You see an ellipsis as three dots in a row. Which it isn't. For example, at the end of a sentence your eyes may see 4 dots, but that's not what is happening. It's an ellipsis and a period.

An ellipsis is treated as a word. An unbreakable, single-character word. Grammar and spell checkers and word counters see it as a word. A word that has a meaning: an ellipsis.
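To make the word-count point concrete, here is a minimal Python sketch. It is not how LibreOffice counts words; the pattern and the `count_words` helper are assumptions made up for this illustration. It compares a naive whitespace split with one that first collapses a spaced ellipsis into a single token:

```python
import re

# Hypothetical pattern: U+2026, or three dots separated by a space,
# NBSP (U+00A0), or NNBSP (U+202F).
ELLIPSIS = re.compile(r"\u2026|\.(?:[ \u00a0\u202f]\.){2}")

def count_words(text: str) -> int:
    """Count whitespace-separated words, treating an ellipsis as one word."""
    # Collapse each ellipsis variant to a single U+2026 token first.
    collapsed = ELLIPSIS.sub(" \u2026 ", text)
    return len(collapsed.split())

sample = "I kid . . . a little"
naive = len(sample.split())   # each dot is counted as its own "word"
aware = count_words(sample)   # the ellipsis is counted once
print(naive, aware)
```

Note that Python's `str.split()` also splits on NBSP, so the naive count comes out the same for the .[NBSP].[NBSP]. form. The sketch does not try to handle an ellipsis that directly follows a sentence-ending period.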

As I said, I was just hoping there was a better workaround than changing up the auto-correct to dish out a .[NBSP].[NBSP]. but perhaps I was hoping for too much. Maybe the Unicode community will solve this for us.

Have a great weekend, folks. -t
Comment 9 Heiko Tietze 2023-04-17 08:34:29 UTC
Request is twofold, a) ellipsis as one character, b) ellipsis with larger spacing.

Depending on the journal, you want to change the type; it must not break across lines and should be easy to apply. For a) it was suggested to use the auto replacement function (could be the replacement table or even auto text) but I can also imagine using the special character and/or search/replace function.
In case of b) I'd suggest to use a character style with larger spacing. For example 5pt, and what you type as three consecutive dots will look like dots with spaces.

Does this work for you? Please consider that we have to find general solutions, and while it might seem like a workaround to you, this should be the approach to make everyone happy.
Comment 10 toddwarner 2023-04-17 12:50:42 UTC
(In reply to Heiko Tietze from comment #9)
> Request is twofold, a) ellipsis as one character, b) ellipsis with larger
> spacing.

Threefold: (c) LibreOffice knows that it is an ellipsis (for the sake of the grammar-checker, word counter, and the like) and not some typo or gibberish.

That's the *actual* solution. But it probably requires action by the Unicode Consortium (by the way, they seem to not have an issue tracker or any easy means of contact — boggle!)

The meta-request is that in lieu of The Correct Solution[tm], the request is to make the most useful and typical workaround (.NBSP.NBSP.) some sort of first-order workflow where changing the autocorrect to .NBSP.NBSP. is the suggested route (I'm not a UI guy, so I am not sure how that would work), and LibreOffice would recognize that pattern for its intent: an ellipsis.

Most folks just find and replace … with .NBSP.NBSP. at the tail end and run with it. This is an okay solution. A "good enough" solution. But man, if LibreOffice *knew* it was an ellipsis . . . that would be a lovely thing.
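Outside of LibreOffice, that tail-end find-and-replace amounts to a single string substitution. A minimal sketch in Python, assuming the manuscript text is already available as a plain string (the function name is hypothetical):

```python
NBSP = "\u00a0"              # U+00A0 NO-BREAK SPACE
SPACED = NBSP.join("...")    # .[NBSP].[NBSP].

def to_spaced_ellipsis(text: str) -> str:
    """Replace every U+2026 with three dots joined by no-break spaces."""
    return text.replace("\u2026", SPACED)

print(to_spaced_ellipsis("But again \u2026 not ideal."))
```

The result renders as spaced dots that cannot be split across lines, but to any software reading the text it is still five separate characters rather than an ellipsis.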

Thank you for your attention. Cheers. -t
Comment 11 Heiko Tietze 2023-04-17 12:58:03 UTC
(In reply to toddwarner from comment #10)
> Threefold: (c) LibreOffice knows that it is an ellipsis (for the sake of the
> grammar-checker, word counter, and the like) and not some typo or gibberish.
How could a grammar checker know when it... a real ellipsis and when not...?

> The meta-request is that in lieu of The Correct Solution[tm], the request is
> to make the most useful and typical workaround (.NBSP.NBSP.)...
Please read my comment again, no non-breaking space needed for b) and I suggest to use the special characters dialog for a). Ultimately this is, if you agree, neither a bug nor something that requires any enhancement.
Comment 12 toddwarner 2023-04-17 13:48:14 UTC
(In reply to Heiko Tietze from comment #11)
> (In reply to toddwarner from comment #10)
> > Threefold: (c) LibreOffice knows that it is an ellipsis (for the sake of the
> > grammar-checker, word counter, and the like) and not some typo or gibberish.
> How could a grammar checker know when it... a real ellipsis and when not...?

I'm not a developer. I can't answer that question. I take it from your response that LibreOffice does not also control the grammar checker? I have no idea.

> > The meta-request is that in lieu of The Correct Solution[tm], the request is
> > to make the most useful and typical workaround (.NBSP.NBSP.)...
> Please read my comment again, no non-breaking space needed for b) and I
> suggest to use the special characters dialog for a). Ultimately this is, if
> you agree, neither a bug nor something that requires any enhancement.

"I'd suggest to use a character style with larger spacing. For example, 5pt"

Manuscript formatting requires 12pt font—for everything. Sometimes publishers allow chapter headers and things to be slightly larger, but that's not typical. So, I am unsure what you mean by 5pt? Shrink the font? Just for that one character? Does LibreOffice support individual-character custom styling? Also, the dots have to match the design of the typeface anyway. Oddly large or small is not acceptable. The standard for manuscript formatting for narrative text is 12pt Times New Roman (sadly, yes, they require proprietary typefaces). For screenplays, it's 12pt Courier. For non-narrative non-fiction manuscripts, they often allow Arial/Helvetica. Some publishers are more flexible, but writers work with multiple publishers, and so their manuscripts skew to the norm.
Comment 13 Heiko Tietze 2023-04-17 15:09:48 UTC
(In reply to toddwarner from comment #12)
> So, I am unsure what you mean by 5pt?

Character spacing is zero by default - no additional space added. If you increase the number, the characters are spaced out by the exact value but the text remains the same. You find the option under Format > Character... > Position.
Comment 14 V Stuart Foote 2023-04-17 17:18:45 UTC
OK, this is stupid.

We provide a single glyph substitution via Autocorrect mechanism (":.:" or ".*...") to replace the Unicode U+2026 as a single glyph. And would support an OpenType Stylistic alternative for any selected font that provides the wider spaced glyph.

And, our autocorrect sequences can be customized to user's preferences to not assign the U+2026 glyph.

The "discussion" 3 dots, or 3 dots separated by spaces as *depending* on CSL or *prescribed* in some style guide/book is tangential (i.e. enhancement of bug 121945).

Point is we already support any reasonable use case for handling the horizontal ellipsis.

Beyond that yes we could do an edit engine filter to assign some flag to the "..." sequence (when autocorrect is suppressed) to pass to a grammar checking extension, e.g. LightProof. But don't see that as pressing--and an easy => WF.
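For readers wondering what such a flagging step might look like in principle, here is a small Python sketch. It is not LibreOffice's edit engine or any grammar-checker API; the pattern, the `EllipsisSpan` type, and the `flag_ellipses` function are all assumptions for illustration. It reports the character ranges that a downstream checker could treat as a single ellipsis token:

```python
import re
from typing import Iterator, NamedTuple

class EllipsisSpan(NamedTuple):
    start: int  # index of the first character of the ellipsis
    end: int    # index one past the last character

# Hypothetical pattern: U+2026, three consecutive dots, or three dots
# separated by a space, NBSP (U+00A0), or NNBSP (U+202F).
PATTERN = re.compile(r"\u2026|\.{3}|\.(?:[ \u00a0\u202f]\.){2}")

def flag_ellipses(text: str) -> Iterator[EllipsisSpan]:
    """Yield the span of every ellipsis-like sequence found in text."""
    for match in PATTERN.finditer(text):
        yield EllipsisSpan(match.start(), match.end())

for span in flag_ellipses("Wait... I kid . . . a little"):
    print(span)
```

Deciding whether a flagged run is really meant as an ellipsis, the question raised in comment 11, is left open here; the sketch only marks candidates.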
Comment 15 toddwarner 2023-04-17 17:53:51 UTC
(In reply to V Stuart Foote from comment #14)
> OK, this is stupid.
> 
> We provide a single glyph substitution via Autocorrect mechanism (":.:" or
> ".*...") to replace the Unicode U+2026 as a single glyph.  And would support
> an OpenType Stylistic alternative for any selected font that provides the
> wider spaced glyph.

Many folks already do something like this (well, those who aren't afraid to dive into those menus), minus the stylistic alternative since that's very typeface specific.

> And, our autocorrect sequences can be customized to user's preferences to not
> assign the U+2026 glyph.

Yup. I do this myself. As do so many others. Alas, it is a workaround.

> The "discussion" 3 dots, or 3 dots separated by spaces as *depending* on CSL
> or *prescribed* in some style guide/book is tangential (i.e. enhancement of
> bug 121945).

That RFE is citation specific which is related, I suppose. And a great ask. But style adherence covers all of the text.

> Point is we already support any reasonable use case for handling the
> horizontal ellipsis.

LibreOffice does not. And I know of no office suite that does. LaTeX does, because, of course. It's a typesetting package. I like LyX, but geez. I'm sure I could produce a .docx (required by publishers, though I do push them to accept .odt) via LyX, but it is LyX.

Fun reading: https://tug.ctan.org/macros/latex/contrib/ellipsis/ellipsis.pdf
 
> Beyond that yes we could do an edit engine filter to assign some flag to the
> "..." sequence (when autocorrect is suppressed) to pass to a grammar
> checking extension, e.g. LightProof. But don't see that as pressing--and an
> easy => WF.

Awesome!
Comment 16 toddwarner 2023-04-17 18:19:18 UTC
(In reply to Heiko Tietze from comment #13)
> (In reply to toddwarner from comment #12)
> > So, I am unsure what you mean by 5pt?
> 
> Character spacing is zero by default - no additional space added. If you
> increase the number, the characters are spaced out by the exact value but
> the text remains the same. You find the option under Format > Character... >
> Position.

Ah. I see what you mean. I should have guessed that's what you meant.

Anyway. Sure. This would be another viable workaround if autocorrect could also apply the character style. This may be a decent option for the find-and-replace folks.

I did some testing, and this spacing idea actually leads to improved word-wrap behavior—closer to what should be done. It's still not an ellipsis, which is the core problem, and when you pass the document to an editor, I suspect this twitchy styling will lead to patchwork formatting. I'm not sure. But it is another tool in the toolbag.

Thank you for your input. I think you grasped the ask. I presume LibreOffice will eventually address this if Unicode adds a character (and typefaces support it) or if someone like Microsoft shows leadership on this topic. They have not yet. I presume this is a gnarly problem because Unicode doesn't supply the character. TeX supports the ellipsis because it is a typesetting package, but that's not a route for the Everyman.

Thanks again. Mr. Foote killed this, so I suppose everything has been said that has been said.

Cheers. -t