Bug 152723 - NatNum12 modifier, in the Spanish locales that have comma as decimal separator, the format spell out point not coma.
Summary: NatNum12 modifier, in the Spanish locales that have comma as decimal separato...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Localization (show other bugs)
Version:
(earliest affected)
7.6.0.0 alpha0+
Hardware: All All
: medium normal
Assignee: Julien Nabet
URL:
Whiteboard: target:24.2.0 target:7.6.0.0.beta2
Keywords:
Depends on:
Blocks: Decimal-Separator-Key
  Show dependency treegraph
 
Reported: 2022-12-29 20:28 UTC by m_a_riosv
Modified: 2023-06-29 14:38 UTC (History)
6 users (show)

See Also:
Crash report or crash signature:


Attachments
Sample file showing the issue. (13.79 KB, application/vnd.oasis.opendocument.spreadsheet)
2022-12-29 20:30 UTC, m_a_riosv
Details
Format cell vs NUMBERTEXT function (15.07 KB, application/vnd.oasis.opendocument.spreadsheet)
2023-01-11 22:00 UTC, m_a_riosv
Details

Note You need to log in before you can comment on or make changes to this bug.
Description m_a_riosv 2022-12-29 20:28:37 UTC
With NatNum12 modifier, in the Spanish locales, that have comma as decimal separator, the format spell out point not coma.

12,55 'Spanish (Spain)' '[NatNum12 capitalize]0,00'
'Doce punto cincuenta y cinco'
should be 'Doce coma cincuenta y cinco'
Comment 1 m_a_riosv 2022-12-29 20:30:06 UTC
Created attachment 184390 [details]
Sample file showing the issue.
Comment 2 raal 2023-01-11 17:46:32 UTC
as I know, it uses library libnumbertext https://github.com/Numbertext/libnumbertext and it's probably bug in this library.
Comment 3 m_a_riosv 2023-01-11 22:00:48 UTC
Created attachment 184600 [details]
Format cell vs NUMBERTEXT function

Hi @raal, thanks.
But the issue doesn't happen with the extension and NUMBERTEXT() function.
To see the file, the extension is needed. The last 1.0.11 2022-11-13.
https://extensions.libreoffice.org/en/extensions/show/numbertext-1

So not clear for me where the issue happens. The UI English/Spanish does not affect.
Comment 4 ady 2023-03-16 14:36:37 UTC
Using attachment 184390 [details] I can confirm that English, German and French all seem correct for USA, Germany and France.

Both Spanish cells for the two respective countries shown in attachment 184390 [details] are incorrect and should be as described in comment 0.

My doubt: is this really a LO bug? I don't have the extension mentioned in comment 3 ATM.

I am inclined to set this to NEW for now, until proven differently. Doing so.
Comment 5 ady 2023-03-17 11:23:35 UTC
I should add that some Spanish-speaking countries use point as decimal separator ([1]) whereas others use comma.

Is "[NatNum12]" able to discern between countries according to cell's locale? Or is it adjusting results depending on language of locale alone?

So, _if_ "[NatNum12]" number format has no way of distinguishing between countries (only by language of locale), then there would be no effective way to satisfy all users simultaneously by using the "[NatNum12]" format alone.

[1]:
https://en.wikipedia.org/wiki/Decimal_separator#Countries_using_decimal_point
Comment 6 Julien Nabet 2023-06-28 09:25:35 UTC
László/Mike: it works if in instdir/share/numbertext/es.sor I replace:
([-−]?\d+)[.] $1| punto
([-−]?\d+)[,] $1| coma
by:
"([-−]?\d+)[.,]" "$1| coma"

I took example on French file:
"([-−]?\d+)[.,]" "$1| virgule"

Now I don't know how to deal the fact that some expect "punto" others expect "coma".
If I put:
"([-−]?\d+)[.]" "$1| punto"
"([-−]?\d+)[,]" "$1| coma"
it still displays "punto" whereas the number has been typed with ",".

I suppose there's an internal conversion into English format which uses "."
Comment 7 Mike Kaganski 2023-06-28 09:42:20 UTC
The issue is inside LibreOffice, because:

1. libnumbertext language data [1] has a distinctive feature for Spanish (choose "es" in "Load" selector on that page, and examine the "regex" data in '# decimals' section). For 'de', for example, both '.' and ',' are handled together, showing 'Komma'; but for 'es', they are handled separately, showing 'punto' and 'coma', respectively.
2. NUMBERTEXT function from extension takes the number formatted using the cell locale, which gives the correct separator, resulting in correct output.
3. In internal case, it seems, the number is formatted using the default formatter, and relies on the libnumbertext library to process decimal point. But here, as we see, it fails.

Two options:

1. Split 'es' into 'es-Country1', 'es-Country2', etc., to allow each variant have own unified handling of both dot and comma. This should be in libnumbertext, but is not desirable IMO, because it would produce mostly duplicating data, put additional load on that project, which doesn't enjoy much volunteer attention.

2. In LibreOffice, change the code that passes the number string to libnumbertext, and at least replace the dot with the current locale decimal separator. This is much easier, and IMO is superior solution.

[1] https://numbertext.github.io/Soros.html
Comment 8 Julien Nabet 2023-06-28 13:25:39 UTC
Mike: I noticed this:
commit 7b5f5d77d56ee494647d9e7868546b3f2140896e
Author: Mike Kaganski <mike.kaganski@collabora.com>
Date:   Tue May 15 14:46:18 2018 +0100

    NatNum spelling: also spell decimals

we got this in i18npool/source/nativenumber/nativenumbersupplier.cxx:
    594         else if (ch == aSeparators.DecimalSeparator)
    595             // Convert any decimal separator to point - in case libnumbertext has a different one
    596             // for this locale (it seems that point is supported for all locales in libnumbertext)
    597             sBuf.append('.');


see https://opengrok.libreoffice.org/xref/core/i18npool/source/nativenumber/nativenumbersupplier.cxx?r=2a1d2d42#594

So it seems we replaced locale decimal separator by "." on purpose.
Comment 9 Mike Kaganski 2023-06-28 13:34:32 UTC
(In reply to Julien Nabet from comment #8)

:)

https://gerrit.libreoffice.org/c/core/+/54375/ starts with:

> Eike: this is addressing your comment wrt decimals

... and I definitely am unable to recall what was that about. Maybe Eike has an idea?
Comment 10 Julien Nabet 2023-06-28 19:53:48 UTC
Just for the record, with this patch, it works:
diff --git a/i18npool/source/nativenumber/nativenumbersupplier.cxx b/i18npool/source/nativenumber/nativenumbersupplier.cxx
index 756866ad846e..62a7e75278bf 100644
--- a/i18npool/source/nativenumber/nativenumbersupplier.cxx
+++ b/i18npool/source/nativenumber/nativenumbersupplier.cxx
@@ -586,15 +586,11 @@ OUString getNumberText(const Locale& rLocale, const OUString& rNumberString,
     for (i = 0; i < len; i++)
     {
         sal_Unicode ch = src[i];
-        if (isNumber(ch))
+        if (isNumber(ch) || ch == aSeparators.DecimalSeparator)
         {
             ++count;
             sBuf.append(ch);
         }
-        else if (ch == aSeparators.DecimalSeparator)
-            // Convert any decimal separator to point - in case libnumbertext has a different one
-            // for this locale (it seems that point is supported for all locales in libnumbertext)
-            sBuf.append('.');
         else if (ch == aSeparators.ThousandSeparator && count > 0)
             continue;
         else if (isMinus(ch) && count == 0)
Comment 11 Mike Kaganski 2023-06-29 05:16:47 UTC
(In reply to Julien Nabet from comment #10)

Please proceed with this - it is most likely the correct thing to do, and my code back then was overly defensive thinking, not knowing such facts.
Comment 12 Julien Nabet 2023-06-29 06:37:06 UTC
(In reply to Mike Kaganski from comment #11)
> (In reply to Julien Nabet from comment #10)
> 
> Please proceed with this - it is most likely the correct thing to do, and my
> code back then was overly defensive thinking, not knowing such facts.

Thank you for the feedback, I've submitted this on gerrit:
https://gerrit.libreoffice.org/c/core/+/153732
Comment 13 Commit Notification 2023-06-29 08:27:17 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/db591df99424dd7e1837946b36ca0658a02d520c

tdf#152723: NatNum12 modifier, keep decimal separator from local settings

It will be available in 24.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 14 Commit Notification 2023-06-29 11:36:02 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "libreoffice-7-6":

https://git.libreoffice.org/core/commit/1cf6762b595f8efeb1f84b10a712a4613f6aaffb

tdf#152723: NatNum12 modifier, keep decimal separator from local settings

It will be available in 7.6.0.0.beta2.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 m_a_riosv 2023-06-29 14:38:14 UTC
Thanks @Julien. I'll verify when available.