Bug 66395 - Compilation error in Windows with UTF-8 unfriendly codepage
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 4.2.0.0.alpha0+ Master (earliest affected)
Hardware: Other Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-06-30 12:41 UTC by Isamu Mogi
Modified: 2013-08-31 13:21 UTC

See Also:
Crash report or crash signature:


Attachments
Full error message (2.24 KB, text/plain)
2013-06-30 12:41 UTC, Isamu Mogi
Patch for adding BOM (2.70 KB, patch)
2013-06-30 12:47 UTC, Isamu Mogi

Description Isamu Mogi 2013-06-30 12:41:34 UTC
Created attachment 81737 [details]
Full error message

This is a regression of https://issues.apache.org/ooo/show_bug.cgi?id=76970

Steps to reproduce on Windows XP:

1. Open "Control Panel" -> "Regional and Language Options"
2. Select "Languages" tab
3. Check "Install files for East Asian languages"
4. Select "Advanced" tab
5. Select "Japanese" under "Language for non-Unicode programs"
6. Check "932 (ANSI/OEM - Japanese Shift-JIS)" under "Code page conversion tables"
7. Restart the computer
8. Start building LibreOffice
9. The build fails with the following message:

$ /opt/lo/bin/make
/opt/lo/bin/make -j 1 -rs -f C:/libo/Makefile.gbuild
[build DEP] LNK:Library/ivcl.lib
[build CXX] vcl/win/source/window/keynames.cxx
C:/libo/vcl/win/source/window/keynames.cxx : warning C4819: The file contains a character that cannot be represented in the current code page (932). Save the file in Unicode format to prevent data loss
C:/libo/vcl/win/source/window/keynames.cxx(126) : error C2001: newline in constant
(snip)
make[1]: *** [C:/libo/workdir/wntmsci13.pro/CxxObject/vcl/win/source/window/keynames.o] Error 2
make: *** [build] Error 2

The full error message is attached. The same error also occurs on Windows 8.
Comment 1 Isamu Mogi 2013-06-30 12:47:48 UTC
Created attachment 81739 [details]
Patch for adding BOM

The cause is that MSVC interprets UTF-8 source code without a BOM as being encoded in the local code page (ACP). The attached patch adds a BOM to the affected files and fixes the problem. I know that a BOM is redundant for UTF-8 and not recommended, but it is the only method that compiles UTF-8 source correctly with both GCC and MSVC under such code pages. Please review the patch.
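To illustrate the idea of adding a BOM only where it matters, here is a minimal sketch in Python. It is not the attached patch; the helper names `needs_bom` and `add_bom` are hypothetical, and it assumes a file should get a BOM only if it is valid UTF-8, contains non-ASCII bytes, and does not already start with one.

```python
import codecs


def needs_bom(data: bytes) -> bool:
    """Return True if data is BOM-less, valid UTF-8 with non-ASCII bytes."""
    if data.startswith(codecs.BOM_UTF8):
        return False  # already has a BOM
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not UTF-8 at all, so leave it untouched
    # Pure-ASCII files compile the same under any ASCII-compatible code page.
    return any(byte >= 0x80 for byte in data)


def add_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 BOM (EF BB BF) when needs_bom() says so."""
    return codecs.BOM_UTF8 + data if needs_bom(data) else data


# Hypothetical one-line source file containing a non-ASCII character.
source = 'const char* s = "∞";\n'.encode("utf-8")
patched = add_bom(source)
print(patched[:3].hex())  # efbbbf
```

Applying `add_bom` twice leaves the data unchanged, so such a script could be re-run over a source tree without stacking BOMs.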
Comment 2 Isamu Mogi 2013-06-30 12:48:29 UTC
See also:
Unicode Support in the Compiler and Linker
http://msdn.microsoft.com/en-us/library/xwy0e8f2%28v=vs.110%29.aspx
Comment 3 Isamu Mogi 2013-06-30 12:56:29 UTC
Escaping the UTF-8 characters also fixes this, but it makes the source code unreadable in this case.
For example, in sw/qa/extras/rtfexport/rtfexport.cpp:320

  "sum from {n = 1} to {∞}"

will be converted to

  "sum from {n = 1} to {\xe2\x88\x9e}"

For that reason, I think adding a BOM is a bit better than escaping the UTF-8 characters.
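For comparison, the escaping alternative can be sketched as follows. This is only an illustration: the function name `escape_non_ascii` is hypothetical, and it assumes the input is the text between the quotes of a C/C++ string literal. Note that the escaped bytes depend on the encoding: ∞ (U+221E) is E2 88 9E in UTF-8, whereas 81 87 is its Shift-JIS (code page 932) encoding.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with \\xNN escapes of its UTF-8 bytes."""
    out = []
    prev_escaped = False
    for ch in text:
        if ord(ch) < 0x80:
            # In C/C++ a \x escape absorbs any hex digits that follow it,
            # so close and reopen the literal ("") before such a digit.
            if prev_escaped and ch in HEX_DIGITS:
                out.append('""')
            out.append(ch)
            prev_escaped = False
        else:
            out.extend(f"\\x{byte:02x}" for byte in ch.encode("utf-8"))
            prev_escaped = True
    return "".join(out)


print(escape_non_ascii("sum from {n = 1} to {∞}"))
# sum from {n = 1} to {\xe2\x88\x9e}
```

The result compiles the same under any ASCII-compatible code page, but as noted above, the original character is no longer readable in the source.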
Comment 4 Isamu Mogi 2013-06-30 13:03:44 UTC
Confirmed on commit def32c7e14ad9743e2b55804442be5d596f6c21c
Comment 5 Julien Nabet 2013-06-30 20:50:17 UTC
I set this to NEW since the issue has been reproduced.
I wouldn't call it a duplicate of fdo#66246; rather, it is a more generic issue.
Comment 6 Isamu Mogi 2013-07-01 12:18:45 UTC
Sorry, I should have searched more carefully. I will copy these comments and attachments to fdo#66246 if this causes an administrative problem.
Comment 7 Julien Nabet 2013-07-02 20:41:33 UTC
Isamu: no problem! Your contribution to fixing this bug is far more important than any triaging point :-)
Comment 8 Isamu Mogi 2013-07-31 03:54:10 UTC
Julien: Thanks for your help. I'll continue to post updates on this issue.
Comment 9 Isamu Mogi 2013-07-31 04:01:01 UTC
The patch was uploaded to https://gerrit.libreoffice.org/#/c/4270/ and is waiting for review.
Comment 10 Isamu Mogi 2013-08-02 04:01:10 UTC
The patch was merged. Thanks!
Comment 11 Isamu Mogi 2013-08-02 04:04:28 UTC
Oh... the patch was reverted. I'll look for a better fix.