Bug 117037 - localc does not understand unicode minus
Summary: localc does not understand unicode minus
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 5.4.5.1 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Andreas Heinisch
URL:
Whiteboard: target:7.6.0 target:7.5.3
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-16 12:22 UTC by mwelinder
Modified: 2023-04-03 19:04 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments

Description mwelinder 2018-04-16 12:22:08 UTC
Description:




Steps to Reproduce:
1. In lowriter, enter "−1000".  (That is not a hyphen, but a unicode minus,
0x2212).
2. Select, Copy, and then paste into A1 in localc.
3. In A2, enter "=A1+1".
4. In A3, enter "=ISNUMBER(A1)"
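
The two characters in step 1 look alike but are distinct code points. The following is a minimal sketch in Python, used here purely for illustration (it is not LibreOffice code, and the helper names are made up), showing that a typical number parser accepts only the ASCII hyphen-minus as a sign:

```python
import unicodedata

HYPHEN_MINUS = "-"        # U+002D, the ASCII character most parsers expect
UNICODE_MINUS = "\u2212"  # U+2212, the character used in step 1


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def parses_as_number(text: str) -> bool:
    """Report whether Python's built-in float() accepts the text."""
    try:
        float(text)
        return True
    except ValueError:
        return False


print(describe(HYPHEN_MINUS))
print(describe(UNICODE_MINUS))
print(parses_as_number("-1000"), parses_as_number("\u22121000"))
```

Python's float() behaves like the affected Calc versions here: the string with U+2212 is treated as text rather than as a number.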


Actual Results:  
Observed: A1 shows "−1000", right-justified as if it were recognized
as a number.  A2 shows an error.  A3 shows FALSE.


Expected Results:
Expected: A2 shows "-999" or "−999".  A3 shows TRUE.

Further expected: anything not recognized as a number should be shown
left-justified after paste.



Reproducible: Always


User Profile Reset: No



Additional Info:
Stock OpenSuSE 43.3



User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
Comment 1 Mike Kaganski 2018-04-16 13:39:18 UTC
(In reply to mwelinder from comment #0)
> Observed: A1 shows "−1000", right-justified as-if it was recognized
> as a number.

Cannot reproduce *right-justified* part with Version: 6.1.0.0.alpha0+ (x64)
Build ID: ba69036c8e889237da4bb312d7c5c94066abbfd3
CPU threads: 12; OS: Windows 10.0; UI render: GL; 
Locale: ru-RU (ru_RU); Calc: CL

nor with Version: 5.4.4.2
Build ID: 2524958677847fb3bb44820e40380acbe820f960
CPU threads: 12; OS: Windows 6.2; UI render: default; 
Locale: ru-RU (ru_RU); Calc: group;

It does not reproduce when pasting into a cell, when pasting into the formula bar, or when typing 2212 followed by Alt+X to enter the character via "normal" keyboard input. I suppose the OP had manually right-justified the cells, in which case the "automatic" left justification of text naturally does not apply.
Comment 2 mwelinder 2018-04-16 14:26:49 UTC
> I suppose that OP used manual right justification

No.  This is into a fresh, empty workbook.

However, I lied when I said "enter" in lowriter.  I don't know how to do
that, so I actually pasted into lowriter (from Gnumeric) which seems to
have attached some kind of formatting to it.  Mea culpa.

FYI, localc is currently requesting text/html format from Gnumeric and
that's really not a good idea.  I will work on that on my end.
Comment 3 Buovjaga 2018-04-23 09:40:33 UTC
Well the point is to understand Unicode minus so let's set to NEW :)
Comment 4 intmianol 2018-11-05 07:36:26 UTC Comment hidden (me-too)
Comment 5 QA Administrators 2021-02-04 04:25:51 UTC Comment hidden (obsolete)
Comment 6 b. 2021-05-07 10:45:15 UTC Comment hidden (obsolete)
Comment 7 Andreas Heinisch 2023-03-05 10:15:19 UTC
Tested it in Microsoft Excel as well, and it shows the same behaviour as LO.

Should we support the Unicode minus and treat it like a normal hyphen, or should we just autocorrect it to a hyphen when it is entered?

git grep \'-\' in the sc folder shows about 50 places where we would have to change the behaviour of LO, without knowing whether that would cause many side effects. Opinions?
Comment 8 Mike Kaganski 2023-03-05 10:27:15 UTC
IMO, treating Unicode minus as a minus is better (we could replace it with a hyphen *internally* for recognition, which would likely make things easier). Autocorrection at entry would be wrong, especially when the direction is from the more "logically correct" character to the poor generic replacement from pre-Unicode times.
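
The internal-replacement idea can be sketched roughly as follows. This is an illustration in Python only, not the LibreOffice number scanner; scan_number is a hypothetical name, and Python's float() stands in for the real recognition step. The input string itself is never modified, only the copy used for recognition:

```python
from typing import Optional

UNICODE_MINUS = "\u2212"  # U+2212 MINUS SIGN


def scan_number(text: str) -> Optional[float]:
    """Hypothetical scanner: accept a leading U+2212 as a minus sign.

    Returns the numeric value, or None if the text is not a number.
    """
    candidate = text.strip()
    # Swap only a leading Unicode minus for the ASCII hyphen-minus;
    # anything else is left for the ordinary parser to accept or reject.
    if candidate.startswith(UNICODE_MINUS):
        candidate = "-" + candidate[1:]
    try:
        return float(candidate)
    except ValueError:
        return None


value = scan_number("\u22121000")
print(value)      # the pasted text is now recognized as a number
print(value + 1)  # so arithmetic such as =A1+1 can work
```

Because the replacement happens only inside the recognition step, nothing is autocorrected at entry; text that is not a number is returned as None and can remain text.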
Comment 9 Buovjaga 2023-03-05 11:34:00 UTC
A note on why I proposed this task to Andreas:
MediaWiki has their magic word {{formatnum:}} render Unicode minuses, so formulas with negative values copied from Calc function wiki articles will be broken when pasted into Calc.

https://phabricator.wikimedia.org/rMW7f61804bf5c717c5059f92fcccf95b470321c295
Comment 10 Buovjaga 2023-03-05 12:19:24 UTC
(In reply to Buovjaga from comment #9)
> A note on why I proposed this task to Andreas:
> MediaWiki has their magic word {{formatnum:}} render Unicode minuses, so
> formulas with negative values copied from Calc function wiki articles will
> be broken when pasted into Calc.
> 
> https://phabricator.wikimedia.org/rMW7f61804bf5c717c5059f92fcccf95b470321c295

Another note: my idea is useless because as Mike pointed out in the chat today, formulas are a kind of programming language and should thus be more strict. So I should just adjust the wiki articles in this case.
Comment 11 Mike Kaganski 2023-03-05 12:27:49 UTC
(In reply to Buovjaga from comment #9)

We had some short discussion about this on IRC; my assumption is that this bug is *not* related to the formulas, which have a strict syntax, and thus should continue to *not* accept the Unicode minus.

Eike, could you advise here?
Comment 12 Eike Rathke 2023-03-05 15:05:23 UTC
I think we could accept U+2212 − MINUS SIGN for _number input_ in the number scanner, but not preserve it. One would have to apply a number format like
General;"−"General
if that is wanted.
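
As a rough illustration of what such a two-section format code does (the first section applies to non-negative values, the second to negative ones, whose absolute value is shown after the literal "−"), here is a small Python emulation. It is a sketch only: the function name is made up, and the ".15g" formatting merely approximates General.

```python
UNICODE_MINUS = "\u2212"  # U+2212 MINUS SIGN


def format_general_with_unicode_minus(value: float) -> str:
    """Emulate the two-section format code General;"−"General."""
    if value < 0:
        # Second section: literal U+2212 followed by the absolute value.
        return UNICODE_MINUS + format(-value, ".15g")
    # First section: plain General formatting for zero and positives.
    return format(value, ".15g")


print(format_general_with_unicode_minus(-1000))
print(format_general_with_unicode_minus(999))
```

The stored value stays an ordinary negative number; only its display uses U+2212.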

I'm reluctant to support it in formula expressions as well, but so far don't see a compelling reason to not do it (for pasted numbers for example); again, of course not preserving the character.

I'd refrain from adding it to the dreaded lazy data typist mode that can start formulas with + or - characters instead of =; that is already confusing enough.
Anything showing up with
git grep \'-\' sc
probably should not be touched at all.

Btw, I do not see that such a pasted string would result in right-justified cell content as mentioned in comment 0. For me it's left-justified.
Comment 13 Commit Notification 2023-03-14 20:40:01 UTC
Andreas Heinisch committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/34510e6e57e58fb27071564f546bbd420404e66d

tdf#117037 - Support Unicode minus (0x2212) in the number scanner

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 14 Commit Notification 2023-03-15 15:48:36 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/2386279c4f0d91dc0758157578ecf62ae9e1ceaa

tdf#117037: svl_qa_cppunit: Add unittest

It will be available in 7.6.0.
Comment 15 Andreas Heinisch 2023-03-15 15:52:39 UTC
Thank you Xisco for the unit test!
Comment 16 Buovjaga 2023-03-15 16:50:43 UTC
I confirm it now works for values, but is the final word that we won't accept unicode minuses in function arguments?

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 69b0fa8a6267a1fa77e77405000f42e8aeba5fa0
CPU threads: 8; OS: Linux 6.2; UI render: default; VCL: kf5 (cairo+xcb)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: threaded
Comment 17 Eike Rathke 2023-03-16 10:36:10 UTC
If you read comment 12 you'll see that there is no such final word. However, I don't see much importance, how many formula expressions are created with pasted values containing Unicode minus? Furthermore the compiled expression yields a #NAME! error so there's no doubt it's wrong. Anyway, iff such thing should be supported in the formula compiler then it should _only_ be done for UI input, not any other formula language we support (ODFF, OOXML, API, ...).

If the problem is that our Wiki renders Calc formulas with Unicode minus (where even?) as mentioned in comment 9 then change the Wiki formatting.
Comment 18 Buovjaga 2023-03-16 11:16:28 UTC
(In reply to Eike Rathke from comment #17)
> If you read comment 12 you'll see that there is no such final word. However,
> I don't see much importance, how many formula expressions are created with
> pasted values containing Unicode minus? Furthermore the compiled expression
> yields a #NAME! error so there's no doubt it's wrong. Anyway, iff such thing
> should be supported in the formula compiler then it should _only_ be done
> for UI input, not any other formula language we support (ODFF, OOXML, API,
> ...).
> 
> If the problem is that our Wiki renders Calc formulas with Unicode minus
> (where even?) as mentioned in comment 9 then change the Wiki formatting.

Hmm, now that I think about it, I *could* add a JavaScript widget that provides a copying helper like we have in Help. So I could sanitise the minus before it hits the clipboard. Thanks for pushing back and making me have more ideas, I guess :)
Comment 19 Buovjaga 2023-03-16 17:13:03 UTC
(In reply to Buovjaga from comment #18)
> Hmm, now that I think about it, I *could* add a JavaScript widget that
> provides a copying helper like we have in Help. So I could sanitise the
> minus before it hits the clipboard. Thanks for pushing back and making me
> have more ideas, I guess :)

Went with an even simpler option, just replace minuses with hyphens:
https://wiki.documentfoundation.org/WikiAction/edit/Widget:MinusToHyphen

Convenient to target as we use the <bdi> element with negative numbers due to RTL compatibility.
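
The widget's source is not quoted in this thread, and it runs as JavaScript on the wiki. The idea of restricting the replacement to <bdi> elements can still be sketched, here in Python with a hypothetical minus_to_hyphen function and a simple regular expression that assumes <bdi> elements are not nested:

```python
import re

UNICODE_MINUS = "\u2212"  # U+2212 MINUS SIGN

# Match an opening <bdi ...> tag, its content, and the closing tag.
BDI_PATTERN = re.compile(r"(<bdi\b[^>]*>)(.*?)(</bdi>)", re.DOTALL)


def minus_to_hyphen(html: str) -> str:
    """Replace U+2212 with ASCII hyphen-minus inside <bdi> elements only."""

    def fix(match: re.Match) -> str:
        opening, body, closing = match.groups()
        return opening + body.replace(UNICODE_MINUS, "-") + closing

    return BDI_PATTERN.sub(fix, html)


sample = "<p>=ABS(<bdi>\u22125</bdi>) \u2212 text</p>"
print(minus_to_hyphen(sample))
```

Minus signs outside <bdi> elements are left alone, so ordinary prose on the page is not affected.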
Comment 20 Commit Notification 2023-04-03 19:04:45 UTC
Andreas Heinisch committed a patch related to this issue.
It has been pushed to "libreoffice-7-5":

https://git.libreoffice.org/core/commit/738eed58c12e74b1dd0d1d8f8d741448bde17c2c

tdf#117037 - Support Unicode minus (0x2212) in the number scanner

It will be available in 7.5.3.