Bug Hunting Session
Bug 33089 - Calc export of non-gregorian date to Excel loses format
Summary: Calc export of non-gregorian date to Excel loses format
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.3.0 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Kohei Yoshida
URL:
Whiteboard: target:3.4
Keywords:
Depends on:
Blocks: 33891
Reported: 2011-01-13 23:37 UTC by Tantai
Modified: 2011-04-06 13:47 UTC

See Also:
Crash report or crash signature:


Attachments
The bug document (73.50 KB, application/vnd.ms-excel)
2011-01-13 23:37 UTC, Tantai
fix date format export to excel and import to calc (3.82 KB, patch)
2011-02-20 18:33 UTC, Tantai
Excel file with a cell with date with locale id 0 ([$-1070000]) (13.50 KB, application/vnd.ms-excel)
2011-02-23 22:16 UTC, Tantai
Better patch which doesn't allow LCID 0 (6.16 KB, patch)
2011-03-09 23:39 UTC, Samphan Raruenrom

Description Tantai 2011-01-13 23:37:31 UTC
Created attachment 42001 [details]
The bug document

Steps to reproduce:

1. Start OpenOffice.org Calc
2. Tools -> Options -> Language Settings -> Languages and set Locale setting to Thai
3. Type a date (e.g. 15/05/2008) in a cell
4. Click the cell and choose Format -> Cells -> Numbers, then change the following:
  4.1) Change Category to Date
  4.2) Change Language to Thai
  4.3) Change Format to 31 ธันวาคม 2542
5. Click OK
6. Save as Microsoft Excel 97/2000/XP (.xls)
7. Exit OpenOffice.org Calc
8. Open the xls file in Microsoft Excel
  Excel will show the error dialog box "Some number formats may have been lost."
9. Observe that the date is formatted as an integer number
Comment 1 Tantai 2011-02-20 18:33:00 UTC
Created attachment 43582 [details]
fix date format export to excel and import to calc
Comment 2 Kohei Yoshida 2011-02-21 13:15:25 UTC
Tantai,

It would help if you could explain your code change a bit: why it was not working, and how you made it work.  Otherwise I would have a hard time understanding the logic behind your patch in order to give it a fair and proper review.

Thanks a lot!
Comment 3 Kohei Yoshida 2011-02-21 13:19:24 UTC
Making myself the assignee for now.
Comment 4 Samphan Raruenrom 2011-02-21 21:19:35 UTC
Problem definition:
Currently, Calc is unable to export to or import from xls those cells that contain a date and use a number format code with
- from Calc, "[~buddhist]" or another calendar specifier, or [NatNumx]
- from Excel, a non-western calendar type and numeral shape code in the MS LCID.

For export, when the converted xls file is opened in Excel :-
- for a number format containing a calendar specifier, the date will be shown as an integer number
- for a number format containing a natnum specifier, the specifier will be ignored

For import, when the xls is opened in Calc :-
- the calendar type and numeral shape code are ignored. Only the locale code is converted.

Cause:
- The Calc export filter doesn’t handle calendar and native number specifiers. It seems to export the strings as-is.
- The Calc import filter doesn’t handle conversion of the MS LCID correctly.

References about MS LCID:
http://office.microsoft.com/en-us/excel-help/creating-international-number-formats-HA001034635.aspx#BMcalendartype

--------------------------
Patch #1:-

diff -r 2c70ae736e88 svl/source/numbers/zformat.cxx
--- a/svl/source/numbers/zformat.cxx    Fri Nov 12 10:40:36 2010 +0100
+++ b/svl/source/numbers/zformat.cxx    Thu Dec 02 10:24:04 2010 +0700
@@ -1134,7 +1134,7 @@
            return LANGUAGE_DONTKNOW;
        ++nPos;
    }
-    return (nNum && (cToken == ']' || nPos == nLen)) ? (LanguageType)nNum :
+    return ((cToken == ']' || nPos == nLen)) ? (LanguageType)nNum :
        LANGUAGE_DONTKNOW;
}

The complete code:-

LanguageType SvNumberformat::ImpGetLanguageType( const String& rString,
        xub_StrLen& nPos )
{
    sal_Int32 nNum = 0;
    sal_Unicode cToken = 0;
    xub_StrLen nLen = rString.Len();
    while ( nPos < nLen && ((cToken = rString.GetChar(nPos)) != ']') )
    {
        if ( '0' <= cToken && cToken <= '9' )
        {
            nNum *= 16;
            nNum += cToken - '0';
        }
        else if ( 'a' <= cToken && cToken <= 'f' )
        {
            nNum *= 16;
            nNum += cToken - 'a' + 10;
        }
        else if ( 'A' <= cToken && cToken <= 'F' )
        {
            nNum *= 16;
            nNum += cToken - 'A' + 10;
        }
        else
            return LANGUAGE_DONTKNOW;
        ++nPos;
    }
-    return (nNum && (cToken == ']' || nPos == nLen)) ? (LanguageType)nNum :
+    return ((cToken == ']' || nPos == nLen)) ? (LanguageType)nNum :
        LANGUAGE_DONTKNOW;
}

Description:
It is possible to have an LCID that uses the system locale (0), such as [$-1070000] (the last 4 digits are the locale code). This fix enables the interpretation of locale code 0 as LANGUAGE_SYSTEM (instead of LANGUAGE_DONTKNOW).
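
For illustration only, here is a minimal standalone sketch of the check that Patch #1 changes, written with std::string rather than the LibreOffice String class. The names parseHexCode, kUnknown and allowZero are hypothetical; kUnknown merely stands in for LANGUAGE_DONTKNOW and is not the real constant.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for LANGUAGE_DONTKNOW (not the real LibreOffice value).
constexpr std::int64_t kUnknown = -1;

// Reads hex digits from s, starting at pos, until ']' or the end of the string.
// allowZero == false mimics the behaviour before the patch (a value of 0 is rejected);
// allowZero == true mimics the behaviour with the patch (0 is returned as-is).
std::int64_t parseHexCode(const std::string& s, std::size_t& pos, bool allowZero)
{
    std::int64_t num = 0;
    char c = 0;
    while (pos < s.size() && (c = s[pos]) != ']')
    {
        int digit;
        if ('0' <= c && c <= '9')
            digit = c - '0';
        else if ('a' <= c && c <= 'f')
            digit = c - 'a' + 10;
        else if ('A' <= c && c <= 'F')
            digit = c - 'A' + 10;
        else
            return kUnknown;          // not a hex digit
        num = num * 16 + digit;
        ++pos;
    }
    if (num == 0 && !allowZero)
        return kUnknown;              // pre-patch: zero is treated as unknown
    return num;
}
```

With this sketch, the input "0]" yields kUnknown when allowZero is false and 0 when it is true, which is the only behavioural difference the patch introduces.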


--------------------------
Patch #2 :-

In "short SvNumberformat::ImpNextSymbol(String& rString,  xub_StrLen& nPos, String& sSymbol)"
This is part of the import filter. 


@@ -1220,6 +1220,16 @@
                    {
                        if ( rString.GetChar(nPos) == '-' )
                        {   // [$-xxx] locale
+                            if ( rString.GetChar(nPos+2) == '0' && rString.GetChar(nPos+3) == '7' ) // calendar type code "07" = Thai
+                            {
+                              rString.InsertAscii( "[~buddhist]", nPos+9 );
+                              nLen += 11;
+                            }
+                            if ( rString.GetChar(nPos+1) == 'D' ) // numeral shape code "D" = Thai digits
+                            {
+                              rString.InsertAscii( "[NatNum1]", nPos+9 );
+                              nLen += 9;
+                            }
                            sSymbol.EraseAllChars('[');
                            eSymbolType = BRACKET_SYMBOLTYPE_LOCALE;
                            eState = SsGetPrefix;

Description:
The fix enables the translation of calendar type code "07" to [~buddhist] and numeral shape code "D" to [NatNum1].

--------------------------
Patch #3 :-

In "String SvNumberformat::GetMappedFormatstring(const NfKeywordTable& rKeywords, const LocaleDataWrapper& rLocWrp, BOOL bDontQuote ) const"
This is part of the export filter. 

@@ -4211,6 +4221,7 @@
            nSem++;

        String aPrefix;
+        bool LCIDInserted = FALSE;

        if ( !bDefaults )
        {
@@ -4244,14 +4255,6 @@
        }

        const SvNumberNatNum& rNum = NumFor[n].GetNatNum();
-        // The Thai T NatNum modifier during Xcl export.
-        if (rNum.IsSet() && rNum.GetNatNum() == 1 &&
-                rKeywords[NF_KEY_THAI_T].EqualsAscii( "T") &&
-                MsLangId::getRealLanguage( rNum.GetLang()) ==
-                LANGUAGE_THAI)
-        {
-            aPrefix += 't';     // must be lowercase, otherwise taken as literal
-        }

***** the above code is moved toward the end of the function

        USHORT nAnz = NumFor[n].GetnAnz();
        if ( nSem && (nAnz || aPrefix.Len()) )
@@ -4311,6 +4314,24 @@
                                aStr += '"';
                            }
                            break;
+                        case NF_SYMBOLTYPE_CALDEL :
+                            if ( pStr[j+1].EqualsAscii("buddhist") )
+                            {
+                                aStr.InsertAscii( "[$-", aStr.Len() );
+                                if ( rNum.IsSet() && rNum.GetNatNum() == 1 &&
+                                        MsLangId::getRealLanguage( rNum.GetLang() ) ==
+                                        LANGUAGE_THAI )
+                                {
+                                    aStr.InsertAscii( "D07041E]", aStr.Len() ); // date in Thai digit, Buddhist era
+                                }
+                                else
+                                {
+                                    aStr.InsertAscii( "107041E]", aStr.Len() ); // date in Arabic digit, Buddhist era
+                                }
+                                j = j+2;
+                            }
+                            LCIDInserted = TRUE;
+                        break;
                        default:
                            aStr += pStr[j];
                    }

Description:
The fix enables the translation of the two cases of Thai dates in the Buddhist era: using Thai digits and using Arabic digits.

   
@@ -4318,6 +4339,15 @@
                }
            }
        }
+        // The Thai T NatNum modifier during Xcl export.
+        if (rNum.IsSet() && rNum.GetNatNum() == 1 &&
+                rKeywords[NF_KEY_THAI_T].EqualsAscii( "T") &&
+                MsLangId::getRealLanguage( rNum.GetLang()) ==
+                LANGUAGE_THAI && !LCIDInserted )
+        {
+            
+            aStr.InsertAscii( "[$-D00041E]", 0 ); // number in Thai digit
+        }
    }
    for ( ; nSub<4 && bDefault[nSub]; ++nSub )
    {   // append empty subformats

Description:
This fix handles the last case, numbers using Thai digits. The code is moved from its original place so that we check for the calendar specifier first. The code is slightly modified to always generate an LCID instead of the 't' prefix, because the LCID works for all versions of MS Office but the 't' prefix only works for the Thai version of MS Office.
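
The three Thai cases that Patch #3 covers can be summarised as a small decision function. This is only an illustrative sketch: thaiExcelLocaleToken and its two boolean parameters are hypothetical names, and the actual patch builds the string inside GetMappedFormatstring rather than in a helper like this.

```cpp
#include <string>

// Returns the Excel locale token for the Thai cases listed in Patch #3.
// hasBuddhistCalendar: the Calc format contains [~buddhist]
// hasThaiNatNum1:      the Calc format uses NatNum1 with the Thai language
std::string thaiExcelLocaleToken(bool hasBuddhistCalendar, bool hasThaiNatNum1)
{
    if (hasBuddhistCalendar)
    {
        // Date in the Buddhist era, with Thai or Arabic digits.
        return hasThaiNatNum1 ? "[$-D07041E]" : "[$-107041E]";
    }
    if (hasThaiNatNum1)
        return "[$-D00041E]";   // plain number in Thai digits
    return "";                   // nothing Thai-specific to emit
}
```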
Comment 5 Kohei Yoshida 2011-02-23 08:41:02 UTC
I need some help understanding this better.

So, these [~buddhist] and [NatNumX] designations are something unique to OOo/LibO?  IOW, Excel doesn't use these labels in its number format system, correct?
Comment 6 Kohei Yoshida 2011-02-23 08:45:30 UTC
Also, the patch is at best a hack to get it to work for Thai locale, but disregards all the other locales mentioned in 

http://office.microsoft.com/en-us/excel-help/creating-international-number-formats-HA001034635.aspx

To fix this the right way, we need to process it in a more generic way which would also make things better for other locales as well (not just Thai).
Comment 7 Kohei Yoshida 2011-02-23 10:25:38 UTC
Review of attachment 43582 [details]:

Ok.  I'm playing with this patch review thingie in this bugzilla for the first time. ;-)

I'm afraid we can't apply the patch as-is & the patch needs some refinement.

SvNumberformat::ImpGetLanguageType() parses the content of [$-xxxxxxx], and if the number is zero, then there is indeed a problem before the code path even gets here.  So, handling that situation as if nothing wrong has happened is not correct.  If you are encountering that problem, then the problem is actually elsewhere, not in this method.  You need to find out what the actual problem is.

In the change that begins at line 1223, the patch only handles the Thai locale, but we need to generalize it so that it picks up other locales.

The same with the rest of the change; we should generalize that a bit to make it re-usable for other affected locales.
Comment 8 Kohei Yoshida 2011-02-23 14:22:55 UTC
But in all fairness, this number format code is so full of hacks that it would be hard to put the right fix in it.  Still looking around to see how we could put a fix in the right way...
Comment 9 Kohei Yoshida 2011-02-23 19:10:22 UTC
Well, actually we *do* put lots of special cases just for Thai locale.  Thai Excel must do lots of things differently it seems.
Comment 10 Samphan Raruenrom 2011-02-23 19:53:48 UTC
FYI: 
To see how the patch was developed, see comments in OpenOffice.org Issue Tracker
http://www.openoffice.org/issues/show_bug.cgi?id=93503

Info about various natnums
http://api.openoffice.org/docs/common/ref/com/sun/star/i18n/NativeNumberMode.html

Can't find the definitive reference for various calendars in LibO/OOo but here's the rough list, AFAIK
gengou, ROC, hanja, hijri, buddhist, gregorian

I agree that the patch should work for all locales and we've tried our best to do so. However, we have too little info on conversion of dates between LO calendars and MSO calendars (which have a different list of calendars, to make things harder). Now that we have more people working on this patch, we should be able to convert more calendar/natnum specifiers.
Comment 11 Kohei Yoshida 2011-02-23 21:27:00 UTC
I have a request.  Could you guys provide an Excel file generated by Thai version of Excel that contains the locale ID of 0 (i.e. [$-1070000]) ?  When I set Thai number format using English version of Excel, the LCID is always non-zero.  I need one with an LCID of 0.  Thanks.
Comment 12 Tantai 2011-02-23 22:16:12 UTC
Created attachment 43742 [details]
Excel file with a cell with date with locale id 0 ([$-1070000])

Steps to create this file:-
In MS Office 2003 Excel
1. Type 24/02/2011 in cell A1
2. Choose Format > Cell from menu bar
3. In the Number tab, choose the following options
  3.1 Category: Date
  3.2 Locale (location): Thai
  3.3 Type: 14/03/2544 (the fourth item)
4. Choose Category: Custom, you will see locale id 0 ([$-1070000]) in the Type code box
5. Click OK
Comment 13 Kohei Yoshida 2011-02-23 22:51:14 UTC
Ah.  Even the English version of Excel can generate that.  Interesting.
Comment 14 Kohei Yoshida 2011-02-23 22:52:51 UTC
I wonder if that's a bug in Excel, because I tried with the Japanese version of Excel, in the Japanese version of Windows to select gengou calendar type, and it still generates non-zero LCID.
Comment 15 Tantai 2011-02-23 22:57:05 UTC
Locale id 0 is a defined value, defined as LANGUAGE_SYSTEM in lang.h

http://opengrok.libreoffice.org/xref/libs-gui/i18npool/inc/i18npool/lang.h#103
Comment 16 Kohei Yoshida 2011-02-23 23:02:15 UTC
In OOo, yes, but we are talking about Excel here, and the *English* version of Excel even produces the same code. If this LCID of 0 meant the system locale, it would be English, right?

So, that value of 0 shouldn't be interpreted as the system locale since my system locale is US English.
Comment 17 Kohei Yoshida 2011-02-23 23:19:18 UTC
Well, the code still specifies the Thai calendar type, so we could infer the Thai locale in such cases even when LCID is zero.

Excel seems to allow similar treatment for Japanese gengou calendar type.  As long as the calendar type is set to gengou, the LCID can be left as zero, and it still displays value in gengou format, in *US English* system locale, i.e. the code for that is [$-030411] (03 for gengou calendar and 0411 for the Japanese locale), but [$-030000] also generates absolutely the same format.
Comment 18 Samphan Raruenrom 2011-02-24 01:22:59 UTC
I'll try to explain the need for the first hack, then we can discuss better hack for it.

The function SvNumberformat::ImpGetLanguageType() is called from only a single place in 
http://opengrok.libreoffice.org/xref/libs-gui/svl/source/numbers/zformat.cxx#801

And it is used for both import and export from/to xls. For import, nNum (which will be returned) will never be 0 (Excel LCID 0 is impossible). However, for export, 0 is possible because Calc locale id [$-0] is possible if the cell was converted from Excel and the cell had locale code 0 (for whatever reason).

Without my hack, the function will return LANGUAGE_NONE, which stops the current conversion and results in the date being converted as an integer in Excel. So we need to handle nNum==0. However, the ideal solution is to return the locale of the document. Is it possible?
Comment 19 Kohei Yoshida 2011-02-24 05:47:19 UTC
(In reply to comment #18)
> I'll try to explain the need for the first hack, then we can discuss better
> hack for it.
> 
> The function SvNumberformat::ImpGetLanguageType() is called from only a single
> place in 
> http://opengrok.libreoffice.org/xref/libs-gui/svl/source/numbers/zformat.cxx#801
> 
> And it is used for both import and export from/to xls. For import, nNum (which
> will be returned) will never be 0 (Excel LCID 0 is impossible). However, for
> export, 0 is possible because Calc locale id [$-0] is possible if the cell was
> converted from Excel and the cell had locale code 0 (for whatever reason).

And my preference is to find out why the cell has a locale code of 0, and see if that's really intentional, before putting this hack in.  A part of me is not very comfortable with this change, and from what I can see Eike (er) sees it the same way.

I'm not saying your hack is wrong, but we need to do a little more due diligence before deciding that's truly the best solution here.

And my preferred approach is, instead of converting the code from excel (i.e. [$-xxxxxxxx]) on import, we should retain it and use it again on export.  That's probably the cleanest and best approach IMO.
Comment 20 Kohei Yoshida 2011-02-24 08:47:23 UTC
(In reply to comment #18)

> Without my hack, the function will return LANGUAGE_NONE, which stops the
> current conversion and results in the date being converted as an integer in
> Excel. So we need to handle nNum==0. However, the ideal solution is to return
> the locale of the document. Is it possible?

And I'll dig a bit deeper to find out more about this.
Comment 21 Kohei Yoshida 2011-02-24 10:04:12 UTC
The first issue is that we are stripping the higher 3-4 digits from the code on import, i.e. when we are given [$-107041E], we are only keeping the 041E part while throwing away the rest.  That's the first thing that needs to be fixed so that we can keep the whole code.
Comment 22 Kohei Yoshida 2011-02-24 10:10:58 UTC
And that's ultimately causing the LCID for the cell to be set to 0!  That's unintentional, and we need to fix that.
Comment 23 Samphan Raruenrom 2011-02-24 18:58:17 UTC
I don't totally agree with that.
We've just done some experiments with the Excel international number format code. Let's start with the definitions.

Excel international number format code = [$-<1-2 hex-digit numeric-shape code><2 hex-digit calendar-type code><required 4 hex-digit LCID>]
e.g. [$-D07041E] means Thai digits, Buddhist calendar, LCID=Thai
Leading zeros can be suppressed.

Calc international number format code = <optional natnum specifier><optional calendar specifier><optional locale-id>
e.g. [NatNum1][~buddhist][$-41E]

They seem to map 1-1 but actually they don't. Excel LCID is used only to "choose the translation" of the month name. That's why Excel LCID can be 0, which means "doesn't matter". For example, LCID 0 will happen when the month is shown as a number (1-12). Contrary to our previous belief, Excel LCID doesn't have to agree with its numeric-shape code and calendar-type code, so in Excel we can have a latin-digit date with the Buddhist calendar and English-US month names. This is impossible in Calc.

In Calc, the locale applies to every component. It limits the use of calendar types and defines the real transliteration of natnum specifiers.
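
As an illustration of the Excel layout described in this comment, the sketch below splits a numeric code into its three fields. ExcelLocaleCode and splitExcelLocaleCode are hypothetical names, and the bit positions assume the code is read as one hexadecimal number, so suppressed leading zeros need no special handling.

```cpp
#include <cstdint>

// Hypothetical container for the three fields of an Excel [$-xxxxxxxx] code.
struct ExcelLocaleCode
{
    std::uint32_t numeralShape;  // 1-2 hex digits, e.g. 0xD = Thai digits
    std::uint32_t calendarType;  // 2 hex digits,  e.g. 0x07 = Buddhist
    std::uint32_t lcid;          // 4 hex digits,  e.g. 0x041E = Thai
};

// Splits the numeric value of the code (the hex digits after "[$-").
ExcelLocaleCode splitExcelLocaleCode(std::uint32_t code)
{
    ExcelLocaleCode result;
    result.lcid         = code & 0xFFFFu;          // lowest 4 hex digits
    result.calendarType = (code >> 16) & 0xFFu;    // next 2 hex digits
    result.numeralShape = (code >> 24) & 0xFFu;    // remaining high digits
    return result;
}
```

For example, 0xD07041E splits into numeral shape 0xD, calendar type 0x07 and LCID 0x041E, while 0x1070000 gives calendar type 0x07 with an LCID of 0.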
Comment 24 Kohei Yoshida 2011-02-24 21:44:12 UTC
So, what do you propose, if you are saying that it's impossible to map Excel's number formats to Calc's ?

BTW I'm not expecting perfect mapping, but more like best effort.  Still, what's more important is to preserve the format code for round-tripping, so that when the file is saved back to Excel, the format code is still valid.  I thought that was the original bug reported here, no?
Comment 25 Samphan Raruenrom 2011-02-24 23:20:36 UTC
> if you are saying that it's impossible to map Excel's number formats to Calc's ?

That's not what I mean. My English is killing me :P
It's just to show that Excel and Calc international number formats are somewhat different in how they interpret LCID. The results of that difference are:-

1) After converting Excel format to Calc format, we may get LCID 0 in Calc. For example :-
Excel: [$-1070000] = latin-digit+Buddhist-calendar+LCID=0, with month in number (not name)
-> Calc: [~buddhist][$-0]

2) When trying to convert Excel format to Calc format, some combinations that are valid in Excel are not possible in Calc. For example :-
Excel: [$-1070409] = latin-digit+Buddhist-calendar+LCID=en_US, month name in en_US
-> Calc: [~buddhist][$-0409] = invalid because [~buddhist] is not usable (ignored) in en_US locale

Best effort:
1) for the first case, we should provide a non-zero LCID. We may decide which LCID to use based on digit-shape and calendar. For the above example, we would generate
-> Calc : [~buddhist][$-041E] = based on the fact that the Buddhist calendar is only used in the th_TH locale

2) for the second case, we have to generate only the combinations that are valid in Calc

Is this OK?
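
A rough sketch of best-effort rule 1 above, assuming the calendar type and LCID have already been extracted from the Excel code. guessLcid and toCalcPrefix are hypothetical names; the gengou entry is inferred from the [$-030411] example elsewhere in this thread, only the Buddhist calendar specifier is emitted, and rule 2 (rejecting combinations Calc cannot represent) is deliberately not handled here.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// If Excel left the LCID as 0, guess one from the calendar type.
std::uint32_t guessLcid(std::uint32_t calendarType, std::uint32_t lcid)
{
    if (lcid != 0)
        return lcid;                 // Excel supplied a real LCID: keep it
    switch (calendarType)
    {
        case 0x07: return 0x041E;    // Buddhist calendar -> Thai
        case 0x03: return 0x0411;    // gengou calendar   -> Japanese
        default:   return 0;         // no basis for a guess
    }
}

// Builds the Calc-style prefix, e.g. "[~buddhist][$-041E]".
std::string toCalcPrefix(std::uint32_t calendarType, std::uint32_t lcid)
{
    std::string out;
    if (calendarType == 0x07)
        out += "[~buddhist]";
    const std::uint32_t effective = guessLcid(calendarType, lcid);
    if (effective != 0)
    {
        char buf[16];
        std::snprintf(buf, sizeof buf, "[$-%04X]", static_cast<unsigned>(effective));
        out += buf;
    }
    return out;
}
```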
Comment 26 Kohei Yoshida 2011-02-25 08:17:51 UTC
(In reply to comment #25)

> Is this OK?

Yes.  However, I want to clarify one thing.

We have two distinct issues.  One is to preserve the number format *code* coming from Excel, and being saved back to Excel, and two is to display cell value based on the format code *correctly* (or shall I say "best effort" correctness).

I'm currently focusing on point 1 - preserving the format code for round-tripping, by (hopefully) re-using Excel's format code we get from import for export.  You are describing point 2, which I totally agree with.

So, my preferred approach is to keep the Excel-style format as-is, and do the mapping as you suggest but only behind the scene (i.e. without altering the format code itself).

This approach not only helps the Thai problem that you guys reported, but also other locales' round-tripping issues.  The issues with displaying date value correctly when importing from Excel is entirely another problem, and the approach I suggest at least doesn't make it worse.

In my view, this is the best compromise given the current situation.
Comment 27 Kohei Yoshida 2011-02-25 10:28:19 UTC
I've already started working on preserving the format codes for better round-tripping with Excel.  I'll try to see if I can do the mapping for your use case.
Comment 28 Samphan Raruenrom 2011-02-25 18:07:26 UTC
I see.

Do you mean that, for example :- 
Excel format: [$-1070000] (latin-digit+Buddhist-calendar+LCID=0)

Instead of converting it to 
-> Calc format: [$-0][~buddhist]
(as in original LO/OOo)

Or
-> Calc format: [$-41E][~buddhist]
(my best-effort idea above, which tries to guess the LCID)

You'd like the result to be
-> Calc format: [$-1070000][~buddhist]
Right?

Will changing the definition of LO locale code (from just 4-hex LCID to Excel 6-hex digit+calendar+LCID) affect any code elsewhere? I mean, is there any assumption/use of it in any other code? If not, I think this is the perfect way to go.
Comment 29 Kohei Yoshida 2011-02-25 20:11:37 UTC
(In reply to comment #28)
> I see.
> 
> Do you mean that, for example :- 
> Excel format: [$-1070000] (latin-digit+Buddhist-calendar+LCID=0)
> 
> Instead of converting it to 
> -> Calc format: [$-0][~buddhist]
> (as in original LO/OOo)
> 
> Or
> -> Calc format: [$-41E][~buddhist]
> (my best-effort idea above, which tries to guess the LCID)
> 
> You'd like the result to be
> -> Calc format: [$-1070000][~buddhist]
> Right?

Not exactly.  I want it to be just '[$-1070000]' without the '[~buddhist]' part.  As I said, the idea is to *not change the format code at all* (as I repeatedly said).  The '07' code in the 1070000 already indicates Buddhist calendar so we can just use that instead of inserting an extra (and redundant) [~buddhist] symbol.  Plus Excel will not understand [~buddhist] symbol when that's in the format code (you'll get the "Some number formats may have been lost." error message as in Comment #1).  The idea is to modify the number format parser to understand this Excel style locale code, which already includes the calendar type.

> Will changing the definition of LO locale code (from just 4-hex LCID to Excel
> 6-hex digit+calendar+LCID) affect any code elsewhere? I mean, is there any
> assumption/use of it in any other code?

The idea is to not change the behavior of the existing symbols (so that the existing code that uses the current format code scheme will not be affected), but to add support for a *new* symbol to handle the excel style locale code.

I hope this clarifies the idea a bit.
Comment 30 Samphan Raruenrom 2011-02-25 22:52:44 UTC
So this is a real big plan. I think we should discuss it in the mailing list. 

If I understand correctly, this includes :-
1) Retaining of Excel international format code for imported xls/xlsx. Does this mean users will see the Excel format in Format > Cell dialog box for the imported case?
2) Interpretation of Excel international format code everywhere, not just import/export filter. Right?

Concerns:
a) There are two ways to internationally format dates, the LO way and the Excel way. Both work. Conflict is also possible.
b) Does this imply a modification to the ODF standard?
Comment 31 Kohei Yoshida 2011-02-26 08:24:50 UTC
(In reply to comment #30)
> So this is a real big plan. I think we should discuss it in the mailing list. 

Why do you think it's a big plan?  I think not.

> If I understand correctly, this includes :-
> 1) Retaining of Excel international format code for imported xls/xlsx. Does this
> mean users will see the Excel format in Format > Cell dialog box for the
> imported case?

Yes.  Because it's the easiest and the most reliable way to approach this.  Anything else would be a bad hack that would cause other issues (like it is now).

> 2) Interpretation of Excel international format code everywhere, not just
> import/export filter. Right?

What do you mean everywhere?  The use case is mainly for xls import export cases.  For other cases, unless the user understands the excel style locale code, the current formatting scheme is used.  So, no change to the existing way.

> Concerns:
> a) There's two ways to internationally format dates, the LO way and the Excel
> way. Both works. Conflict also possible.

There is already a conflict, and the current solution is to ignore it.

> b) Does this imply a modification to the ODF standard?

No.  When saving to ods, it does the usual conversion (which currently fails btw).

I'm actually surprised by the pushback I'm getting from you on this.
Comment 32 Kohei Yoshida 2011-02-26 08:55:41 UTC
Let's sit on this for a while since we are already going into this circular argument that's going nowhere.  I'll work on other more pressing issues first in the mean time.
Comment 33 Samphan Raruenrom 2011-02-26 09:41:04 UTC
Sorry for sounding offensive. I ask questions simply because I'm trying to understand your idea, not because I disagree. I suggested we talk about this on the mailing list because the scope of this bug is just to fix the code. However, your idea is more ambitious than fixing the code; it introduces a new concept that is also visible to the users. So I guess we should discuss the plan on the list with more developers than just the three of us. 

I think I understand your idea now. I have no preference between my approach and yours. (I didn't even write the patch myself, so I have no reason to defend it.) I only care that our users get the best experience. So again, I'm sorry if my words sounded offensive. _/|\_
Comment 34 Samphan Raruenrom 2011-03-09 23:39:12 UTC
Created attachment 44298 [details]
Better patch which doesn't allow LCID 0

We're agreed that LCID 0 from Excel should not be converted to [$-0] because it doesn't mean LANGUAGE_SYSTEM but actually means the LCID is not applicable. So the LCID 0 must be ignored/removed. That also results in fewer hacks. We no longer modify SvNumberformat::ImpGetLanguageType(). Please review the new patch.
Comment 35 Kohei Yoshida 2011-03-25 11:57:07 UTC
Ok.  I've applied your last patch, though I had to change things a bit to adjust for the latest code on master.

But I can't test for this to make sure this really fixes the problem for you guys.  Can you guys make sure you test this?

We'll branch for 3.4 pretty soon, so when that happens, I'd like you guys to try the 3.4 branch for testing.  Of course, you can try the current master branch too.

Thanks!
Comment 36 Kohei Yoshida 2011-03-25 11:58:06 UTC
I'll keep this bug open until the re-work is complete.
Comment 37 Samphan Raruenrom 2011-03-26 01:15:43 UTC
> Can you guys make sure you test this?

Yes, we produce binaries for local testers to get feedback until we are done with this fix. We have made an installer that will apply the binary patches to an existing installation of LO 3.3.1 (and in a few days, 3.3.2). The binary patches are being tested extensively before being released publicly as a temporary solution, because this is a serious problem for some organizations.
Comment 38 Kohei Yoshida 2011-04-06 13:47:40 UTC
I've filed Bug 36038 for the support of the Extended LCID format.  I'll close this one as fixed.