Bug 143161 - EDITING: Spurious spell-checking red wavy line for Chinese text if the cell content contains manual line-break
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 7.1.4.2 release
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Depends on:
Blocks: Spell-Checking
Reported: 2021-07-02 15:15 UTC by Ming Hua
Modified: 2022-08-19 18:04 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Sample file with multiple lines of Chinese text in one cell (9.28 KB, application/vnd.oasis.opendocument.spreadsheet)
2021-07-02 15:15 UTC, Ming Hua

Description Ming Hua 2021-07-02 15:15:25 UTC
Created attachment 173316
Sample file with multiple lines of Chinese text in one cell

In version 7.1 and later, when a spreadsheet cell contains multiple lines of Chinese text and the auto-spellchecking option is on, all the text in the multi-line cell is marked as misspelled, i.e., it gets the red wavy underline.

Steps to Reproduce:
1. Open attached sample file;
2. Make sure auto-spellchecking is enabled, for example via the Tools > Automatic Spell Checking... menu entry;
3. Observe that all text in cell B2 is marked as misspelled, even though it is all Chinese and LibreOffice doesn't have any spellchecking dictionary for Chinese (zh-CN or zh-TW).

Expected Result:
No red wavy line for Chinese text, just like the single line text in cell B1.
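
The expected result can be stated as a small model. This is only a sketch in Python, not LibreOffice code: `DICTIONARIES` and `misspelled_words` are hypothetical names, and the word list is a toy stand-in for a real dictionary. It assumes that a manual line break splits the cell into paragraphs and that every paragraph is checked with the cell's language, so a language without a dictionary flags nothing whether the cell has one line or several.

```python
from typing import List

# Hypothetical model of the expected result; none of these names exist in LibreOffice.
DICTIONARIES = {
    "en-US": {"hello", "world"},  # toy word list standing in for a real dictionary
}


def misspelled_words(cell_text: str, language: str) -> List[str]:
    """Return the words that would get a red wavy underline."""
    known_words = DICTIONARIES.get(language)
    if known_words is None:
        # No dictionary for this language (e.g. zh-CN): nothing can be flagged.
        return []
    flagged = []
    # A manual line break splits the cell into paragraphs; each paragraph
    # is checked with the same cell language.
    for paragraph in cell_text.split("\n"):
        for word in paragraph.split():
            if word.lower() not in known_words:
                flagged.append(word)
    return flagged


single_line = "中文文本"        # like cell B1
multi_line = "第一行\n第二行"    # like cell B2, with a manual line break
```

Under these assumptions both `misspelled_words(single_line, "zh-CN")` and `misspelled_words(multi_line, "zh-CN")` return an empty list, which is the behaviour seen in 7.0.6.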

Additional Information:

I.
Reproduced with both 7.1.4 and 7.2 Beta1:
Version: 7.1.4.2 (x64) / LibreOffice Community
Build ID: a529a4fab45b75fefc5b6226684193eb000654f6
CPU threads: 2; OS: Windows 10.0 Build 19041; UI render: Skia/Raster; VCL: win
Locale: zh-CN (zh_CN); UI: en-US
Calc: threaded
and
Version: 7.2.0.0.beta1 (x64) / LibreOffice Community
Build ID: c6974f7afec4cd5195617ae48c6ef9aacfe85ddd
CPU threads: 2; OS: Windows 10.0 Build 19041; UI render: Skia/Raster; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: threaded

But not reproduced (neither B1 nor B2 has a red wavy underline) with 7.0.6:
Version: 7.0.6.2 (x64)
Build ID: 144abb84a525d8e30c9dbbefa69cbbf2d8d4ae3b
CPU threads: 2; OS: Windows 10.0 Build 19041; UI render: default; VCL: win
Locale: zh-CN (zh_CN); UI: en-US
Calc: threaded

Therefore tagged as regression.

II.
Also note that when cell B1 or B2 is focused, the status bar shows their language as "English (USA)", which as I understand it is related to spellchecking.  This is even though I have explicitly set their language to "Chinese (simplified)" (zh-CN) in the Format > Cells... dialog.  The bug also existed when the cells had the "Default" language setting (which is just zh-CN on my system), before I set the language explicitly.

However, in 7.0.6 both cells also show the "English (USA)" language in the status bar, yet it doesn't seem to affect the spellchecking result.

III.
Another curiosity is that when the multi-line cell B2 is selected, the font selection box on the toolbar is empty, as if the selected cell contained both English and Chinese text.  For comparison, when cell B1 is selected, the font selection box shows the CJK font, "思源黑体" (the Chinese name for Noto Sans CJK/Source Han Sans) in my case.

This is also true for 7.0.6, though.
Comment 1 Ming Hua 2021-07-24 04:19:00 UTC
Hi Kevin, I think you may be interested.  Would you please have a look and help reproduce and bibisect this one?
Comment 2 Kevin Suo 2021-07-24 08:03:12 UTC
Yes, I can reproduce on master.

Currently, Simplified Chinese, Traditional Chinese and Japanese do not have spell-check dictionaries (and I do not expect them to get any in the future because, in my opinion, Hunspell as currently used by LibreOffice does not support languages that do not put a "space" between words).

The code in 
https://opengrok.libreoffice.org/xref/core/sc/source/ui/view/spellcheckcontext.cxx?r=eb6819e7#298
states that "For spell-checking, we currently only use the primary language; not CJK nor CTL.", but I do not see any code below it that excludes CJK text from the spell check.
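
To illustrate what "excluding CJK from the spell check" could mean, here is a rough Python sketch. It is not the code from spellcheckcontext.cxx, and `CJK_RANGES`, `is_cjk` and `words_to_check` are hypothetical names. It assumes a simple script-based rule: any word containing a character from a few well-known Unicode CJK blocks is skipped, and only the remaining words are handed to the spell checker.

```python
from typing import List

# Hypothetical sketch; the code point ranges follow the Unicode block definitions.
CJK_RANGES = (
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
)


def is_cjk(ch: str) -> bool:
    """True if the character falls in one of the CJK blocks listed above."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in CJK_RANGES)


def words_to_check(text: str) -> List[str]:
    """Return the whitespace-separated words that contain no CJK character."""
    # str.split() with no argument also splits on manual line breaks.
    return [w for w in text.split() if not any(is_cjk(c) for c in w)]
```

With this rule, `words_to_check("第一行\n第二行")` is empty, so a multi-line Chinese cell would never reach the dictionary lookup, while `words_to_check("Hello 世界 wrold")` still yields the two Latin words. The range list is deliberately incomplete (for example, it omits the supplementary-plane extensions).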

I am bibisecting now.
Comment 3 Kevin Suo 2021-07-24 14:35:10 UTC
Bibisected to range: 40fa3a61ac7dbe2ba73b5ee71bb85cc3bb4a27af..8dcbbea3802670004c3e78a1ff1ec56b23df674c

Within that range, bdd149b1ff3d43b94cadc0d43365100c287c7639 is the only commit related to this issue:

'''
author	Dennis Francis	2020-10-04 12:47:46 +0530
committer	Noel Grandin 2020-10-28 08:39:25 +0100
commit bdd149b1ff3d43b94cadc0d43365100c287c7639

Improve spell checking performance and impl. in several ways:
* do synchronous spell checking, avoiding an idle handler
* avoid continuous invalidations caused per-cell by spell-checking
* cache spell-checking information for a given SharedString to
avoid repeated checking of frequently recurring strings.
'''
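
The caching idea in the last bullet of the quoted commit message can be sketched as follows. This is an illustration in Python, not the LibreOffice implementation: `SpellCache` and `toy_checker` are hypothetical, and keying the cache on the language as well as the string is an assumption made here to show how a cached result stays valid when the same string is checked under different languages. Nothing in this report says that the cache key caused the regression.

```python
from typing import Callable, Dict, List, Tuple


class SpellCache:
    """Toy cache of spell-check results; hypothetical, not a LibreOffice class."""

    def __init__(self, checker: Callable[[str, str], List[str]]) -> None:
        self._checker = checker
        self._results: Dict[Tuple[str, str], List[str]] = {}
        self.misses = 0  # how often the underlying checker was actually called

    def check(self, text: str, language: str) -> List[str]:
        key = (text, language)  # assumption: language is part of the key
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._checker(text, language)
        return self._results[key]


def toy_checker(text: str, language: str) -> List[str]:
    """Stand-in checker: flags words missing from a tiny per-language word list."""
    known = {"en-US": {"house"}, "de-DE": {"haus"}}.get(language)
    if known is None:
        return []  # no dictionary for this language, so nothing is flagged
    return [w for w in text.split() if w.lower() not in known]


cache = SpellCache(toy_checker)
```

Checking "haus" as en-US, then as de-DE, then as en-US again calls `toy_checker` only twice: the third lookup is served from the cache, and the two languages keep separate results.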

Adding Dennis Francis to cc list, would you please take a look? Thanks.
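
For readers unfamiliar with bibisecting: the range at the top of this comment comes from a binary search over prebuilt LibreOffice builds. The search itself can be sketched in Python as below. `first_bad` and the commit labels are hypothetical, and the sketch assumes the history is ordered oldest to newest, the newest build shows the bug, and every build after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first commit that shows the bug.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and every commit after the first bad one is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid       # the first bad commit is at mid or earlier
        else:
            lo = mid + 1   # the first bad commit is after mid
    return commits[lo]


# Toy history: "c5" is the first build in which the red wavy line appears.
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
culprit = first_bad(history, lambda c: int(c[1:]) >= 5)
```

In a real bibisect, `is_bad` is the manual step of opening the sample file in each build and looking at cell B2.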
Comment 4 Kevin Suo 2021-07-24 14:45:33 UTC
The email does not seem to have been sent to Dennis Francis. Could someone remind him on IRC?
Comment 5 Ming Hua 2021-08-02 16:34:13 UTC
Thanks for the bibisection work, Kevin.

(In reply to Kevin Suo from comment #4)
> The email does not seem to have been sent to Dennis Francis. Could someone remind him on IRC?
I've added a comment on Gerrit at https://gerrit.libreoffice.org/c/core/+/104705, and if that still doesn't work, I'll try raising the issue on IRC.
Comment 6 Michael Meeks 2022-07-13 16:19:20 UTC
Ah - this patch has known problems which are fixed by:

commit dd25fd6bf9b9637d4f1efcfcc642efa4be7f62b1
Author: Szymon Kłos <szymon.klos@collabora.com>
Date:   Wed Mar 23 13:02:29 2022 +0100

    Use correct language for spellchecking in calc
    
    Fixes the problem of not applied spellchecking language
    change in calc.
    1. Open spreadsheet with German text but with English UI language
    2. Change spellchecking language to German
    result: no difference
    expected: spellchecking should be performed and mark words correctly
    
    Visible in both LOK and desktop.
    
    Regression introduced in:
    commit bdd149b1ff3d43b94cadc0d43365100c287c7639
    Author: Dennis Francis <dennis.francis@collabora.com>
    Date:   Sun Oct 4 12:47:46 2020 +0530
    
    Improve spell checking performance and impl. in several ways:

I wonder if you bisected past this somehow and the problem is elsewhere? and/or perhaps this patch is not included in your test set ?

Would be worth eliminating that first.

Thanks !
Comment 7 Ming Hua 2022-08-19 18:04:05 UTC
(In reply to Michael Meeks from comment #6)
> Ah - this patch has known problems which are fixed by:
> 
> commit dd25fd6bf9b9637d4f1efcfcc642efa4be7f62b1
> Author: Szymon Kłos <szymon.klos@collabora.com>
> Date:   Wed Mar 23 13:02:29 2022 +0100
I cannot bibisect, but I tested with version 7.4.0 and can confirm that the originally reported problem is no longer reproducible with:

Version: 7.4.0.3 (x64) / LibreOffice Community
Build ID: f85e47c08ddd19c015c0114a68350214f7066f5a
CPU threads: 12; OS: Windows 10.0 Build 22000; UI render: Skia/Raster; VCL: win
Locale: en-US (zh_CN); UI: zh-CN
Calc: CL

Resolved as FIXED.  Not sure if I should change more fields, or whether this should be WORKSFORME instead.

> I wonder if you bisected past this somehow and the problem is elsewhere?
> and/or perhaps this patch is not included in your test set ?
BTW the bibisection obviously didn't include this patch, as it's from March 2022 and the bug report and bibisection were done in July 2021.