Bug 127319 - AutoCheck/Spellcheck treats superscripted portion as part of the non-superscripted word (but MS Word doesn't)
Status: RESOLVED WONTFIX
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: needsUXEval
Depends on:
Blocks: Spell-Checking
Reported: 2019-09-03 21:48 UTC by Paul
Modified: 2019-10-01 13:14 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
superscript letters spellcheck problem (19.46 KB, application/vnd.oasis.opendocument.text)
2019-09-24 12:58 UTC, Paul

Description Paul 2019-09-03 21:48:15 UTC
Description:
Currently LO Writer's spellcheck flags words with adjacent superscript footnote anchors, creating a mess of wavy orange lines. What is desired is for superscript to be ignored by spellcheck.

Steps to Reproduce:
1. Type word. 
2. Add footnote adjacent to word.


Actual Results:
Spellcheck throws spelling error.

Expected Results:
No error.


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Dieter 2019-09-23 06:12:53 UTC
I can't confirm this with

Version: 6.3.2.2 (x64)
Build-ID: 98b30e735bda24bc04ab42594c85f7fd8be07b9c
CPU-Threads: 4; OS: Windows 10.0; UI-Render: GL; VCL: win; 
Locale: de-DE (de_DE); UI-Language: de-DE
Calc: threaded

Paul, to be certain the reported issue is not related to corruption in the user profile, could you please reset your LibreOffice profile (https://wiki.documentfoundation.org/UserProfile) and re-test?

I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' if the issue is still present
Comment 2 Paul 2019-09-24 00:36:53 UTC
Thank you, Dieter. The profile reset did not help, but in the process I found I had misrepresented the problem. The superscript letters in question are not actual footnote anchors, but mere plain text used as footnote markers. So I've retitled this report accordingly. I don't know how to classify this, enhancement or bug, but I think that superscript letters should be excluded from spellcheck, at least optionally.

Best wishes,
Comment 3 Dieter 2019-09-24 06:02:53 UTC
(In reply to Paul from comment #2)
> The superscript letters in question
> are not actual footnote anchors, but mere plain text used as footnote
> markers.

Paul, please attach a sample document, as this makes it easier to understand what you mean by "plain text used as footnote markers". And why don't you use footnote anchors? You can easily customize them in Tools > Footnotes

I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the requested document is provided.
(Please note that the attachment will be public, remove any sensitive information before attaching it)
Comment 4 Paul 2019-09-24 12:58:24 UTC
Created attachment 154435 [details]
superscript letters spellcheck problem

Ok Dieter, I'm attaching a small .odt file showing the problem.

I usually use the actual footnote function, but this is a file made by others, I think it was in .doc format originally, and for some reason they went with plain text throughout. And it's a large, 1.9MB, file.

Thanks again.
Comment 5 Dieter 2019-09-24 16:40:43 UTC
I opened the text in LO 6.3.2.2 and I confirm that spellcheck doesn't ignore superscript characters. But I also don't think that this is a bug, because you should use footnote anchors for that. Perhaps the author wasn't aware of this possibility.

If you set Autonumbering to a, b, c in Tools => Footnotes/Endnote Settings => Footnotes, you get the desired result.

If you agree, I will close it as RESOLVED NOTABUG. If you don't agree, please give a short reasoning.
Comment 6 Paul 2019-09-24 16:47:27 UTC
I don't agree, because many documents are created in plain text with manual footnotes. This is especially true of books that have been scanned, as often happens with older books off copyright. Most of the time the footnote "anchors" are encased in brackets, so spellcheck is not disturbed by them, but sometimes, as here, that is not the case.

In my view this is a valid, but not the most important concern for LO, so do with it what you will.
Comment 7 Dieter 2019-09-24 16:50:55 UTC
O. K. Let's keep it open for further input from other people.
Comment 8 Franklin Weng 2019-09-25 12:36:05 UTC
Add needsUXEval keyword
Comment 9 Heiko Tietze 2019-09-27 06:54:53 UTC
You want to disable spellchecking on some special character properties, here superscript. I disagree, with 1st, 2nd, 3rd, 4th etc. as a counterexample. There might be many more reasons for super-/subscript. 
You may have in mind an option "[ ] Don't check superscript" like we have for uppercase words. But then you still run into issues with words that use superscript for only part of the word, admittedly corner cases. And if you disable the whole word with "[ ] Don't check words with superscript" you lose notes, references, or whatever else is formatted that way. Users can have a special character style for it.
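
To make the trade-off between these options concrete, here is a rough Python sketch. It is not LibreOffice code: the `Policy` names, the `words_to_check` function, and the representation of a paragraph as `(text, is_superscript)` runs are all assumptions made for illustration, and words are split on whitespace only.

```python
from enum import Enum


class Policy(Enum):
    # Hypothetical policy names, not actual LibreOffice options.
    CHECK_ALL = 1         # superscript counts as part of the word (behaviour reported here)
    SKIP_SUPERSCRIPT = 2  # like "[ ] Don't check superscript"
    SKIP_WORD = 3         # like "[ ] Don't check words with superscript"


def words_to_check(runs, policy):
    """Return the strings a spell checker would be handed.

    runs: list of (text, is_superscript) pairs forming one paragraph.
    """
    # Flatten the runs into (character, is_superscript) pairs.
    chars = [(ch, sup) for text, sup in runs for ch in text]

    # Split into words on whitespace; the trailing space flushes the last word.
    words, current = [], []
    for ch, sup in chars + [(" ", False)]:
        if ch.isspace():
            if current:
                words.append(current)
            current = []
        else:
            current.append((ch, sup))

    result = []
    for word in words:
        has_superscript = any(sup for _, sup in word)
        if policy is Policy.SKIP_WORD and has_superscript:
            continue  # drop the whole word, including its normal part
        if policy is Policy.SKIP_SUPERSCRIPT:
            word = [(ch, sup) for ch, sup in word if not sup]
        if word:
            result.append("".join(ch for ch, _ in word))
    return result


if __name__ == "__main__":
    paragraph = [("A word", False), ("a", True), (" and the 1", False),
                 ("st", True), (" item", False)]
    for policy in Policy:
        print(policy.name, words_to_check(paragraph, policy))
```

With the sample paragraph, skipping only the superscript characters leaves "word" but also reduces "1st" to "1", while skipping whole words drops both "worda" and "1st" from checking, which is the loss described above.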

Footnotes need to be separated, "plain text" documents are not suited for office suites.
Comment 10 Justin L 2019-09-27 08:44:40 UTC
I notice that auto-spellcheck ignores 6But - so that gives some kind of precedent for ignoring certain parts of a "word". And Word 2003 already treats superscripted text as a separate word.

But superscripting (from a computer perspective) is no different than setting some characters in a word to be a different colour, or bold, or italics etc. I imagine that treating superscripting separately would be tricky and CPU intensive. So I would tend to put the burden on the user to create a well-formed document, and make a separation between what they consider to be non-word-forming characters and the word itself.

For this particular document, one user workaround could be to search for superscripted characters, and insert a ZERO WIDTH NO-BREAK SPACE character in front of it (which can be created by typing U+FEFF and then hitting Alt-x).

Finding these is possible in "Find and Replace". Press the Format button and select Format tab, superscript, and relative font size 80.

Since this is a huge document, here is an advanced find/replace (it needs to be copy/pasted, since it contains the invisible zero-width no-break space (ZWNBS)). You need to enable the "regular expressions" option to use this search.
Find:(?<=[^ ])([^ ])
Replace:$1
[Hint: to ensure that you have the correct REPLACE contents (including the ZWNBS), use the keyboard to select. Start from the end (after $1), hold shift and use the left arrow key to select everything including the colon, and then press the right arrow key once to deselect the colon.]

Spelled out, this finds *something* that is not a space or a ZWNBS that follows something else that is not a space or a ZWNBS.
Replace *something* with a ZWNBS and *something*.
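
Because the ZWNBS in the Find and Replace strings above is invisible and easy to lose when copying, the same logic can be sketched in Python with the character written as the escape `\ufeff`. This only illustrates the spelled-out rule, it is not something to paste into LibreOffice. The `insert_zwnbs` function and the `(text, is_superscript)` run model are assumptions for illustration; the per-character superscript flag stands in for the attribute filter set with the Format button.

```python
import re

ZWNBS = "\ufeff"  # ZERO WIDTH NO-BREAK SPACE (U+FEFF)

# One character that is neither a space nor a ZWNBS, preceded by a
# character that is also neither a space nor a ZWNBS.
PATTERN = re.compile("(?<=[^ \ufeff])([^ \ufeff])")


def insert_zwnbs(runs):
    """Insert a ZWNBS before each matching superscript character.

    runs: list of (text, is_superscript) pairs forming one paragraph.
    """
    text = "".join(t for t, _ in runs)
    # One flag per character, so a match can be checked against the formatting.
    is_super = [flag for t, flag in runs for _ in t]
    # Keep only matches that fall on superscript characters.
    hits = {m.start() for m in PATTERN.finditer(text) if is_super[m.start()]}
    return "".join((ZWNBS if i in hits else "") + ch
                   for i, ch in enumerate(text))


if __name__ == "__main__":
    fixed = insert_zwnbs([("word", False), ("a", True), (" next", False)])
    print(ascii(fixed))  # ascii() makes the invisible character visible
```

A marker that already follows a space or a ZWNBS is left alone, so running the replacement a second time does not add more characters.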
Comment 11 Paul 2019-09-27 12:51:49 UTC
>"plain text" documents are not suited for office suites.

Of course, on new documents the user should use the excellent footnote/endnote function. I have it hotkeyed here and use it often. But working with existing older documents should be a valid consideration for modern programs. I use LO as much for that as for up-to-date needs.

>insert a ZERO WIDTH NO-BREAK SPACE character in front of it 

If a formal superscript consideration has too many drawbacks, this is an excellent way to otherwise deal with the problem. I'll save this and use it unless and until something might be done in LO on this issue.

Thanks for the input, guys!
Comment 12 Heiko Tietze 2019-09-27 17:56:31 UTC
(In reply to Paul from comment #11)
> >insert a ZERO WIDTH NO-BREAK SPACE character in front of it 
> Thanks for the input, guys!

So let's close the request as WONTFIX.
Comment 13 Paul 2019-09-30 17:18:28 UTC
Actually I'm finding it nigh unto impossible to do an effective search for superscript, which breaks the whole concept that was proposed. If I spec superscript in Find, it also applies a vast array of other, unwanted qualifiers. Alt-Search crashed LO, perhaps because of file size, so no success there either.
Comment 14 Heiko Tietze 2019-10-01 06:47:30 UTC
(In reply to Paul from comment #13)
> Alt-Search crashed LO, perhaps because of file size, so no
> success there either.

This kind of bug is a honey pot for developers. You should report it in a way that lets the crash be reproduced.
Comment 15 Justin L 2019-10-01 08:17:07 UTC
(In reply to Justin L from comment #10)
> I imagine that treating superscripting separately would be
> tricky and CPU intensive.
Perhaps not super CPU intensive, since it seems like there already is a loop looking for hints. For a mildly similar search adjustment, see fixed bug 101936 sw: ignore comment anchors during search
Comment 16 Paul 2019-10-01 13:14:58 UTC
Thanks guys. I tried again later. Previously I had another, small, document open in LO, perhaps that fact contributed to the crash. This time, with only the large document open, I was able to execute Alt-Search successfully. I settled for inserting a regular space rather than a ZWNB one, but I'll work on that next time. Alt-Search has the drawback of taking a long time to execute, but it works well.