Bug 32228 - Encoding problem when reading some XLS files produced by Microsoft Excel 2008 for Mac
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.3.0 RC1 (earliest affected)
Hardware: All All
Importance: medium major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: XLS File-Opening
Reported: 2010-12-08 04:53 UTC by Simon Lipp
Modified: 2021-12-13 08:45 UTC (History)
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Problematic file (14.50 KB, application/vnd.ms-excel)
2010-12-08 04:53 UTC, Simon Lipp
The B and H column of the second row is garbled (12.50 KB, application/vnd.ms-excel)
2011-05-29 08:08 UTC, lijpbasin

Description Simon Lipp 2010-12-08 04:53:18 UTC
Created attachment 40913 [details]
Problematic file

Please see the attached file. In Microsoft Excel 2008, the second column is correctly displayed as “Prénom”. In LibreOffice 3.3.0.1 (Mac OS X) it is displayed as “PrŽnom”. The problem is also present in OpenOffice.org Calc 3.1.1 on Linux (CentOS 5).
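
The symptom described above is consistent with Mac Roman bytes being read as Windows-1252. The following is only a minimal sketch of that hypothesis, not a description of how LibreOffice's import filter works: it assumes the file stores the text as a Mac Roman byte string and that the reader assumes code page 1252. The variable names are illustrative.

```python
# Hypothetical reproduction of the symptom; the real text lives inside the XLS file.
original = "Prénom"

# Assumption: Excel 2008 for Mac wrote the byte string in Mac Roman (code page 10000).
raw = original.encode("mac_roman")

# A reader that assumes Windows-1252 maps byte 0x8E to "Ž" instead of "é".
misread = raw.decode("cp1252")

# Decoding with the code page the writer actually used recovers the text.
correct = raw.decode("mac_roman")

print(raw)      # the stored bytes
print(misread)  # what a cp1252 reader shows
print(correct)  # what a Mac Roman reader shows
```

In Mac Roman, "é" is the single byte 0x8E, and in Windows-1252 the same byte is "Ž", which matches the garbled text reported here.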
Comment 1 Alex Thurgood 2011-01-05 07:11:42 UTC
I can confirm that this doesn't display correctly in LibO 3.3 beta2. However, isn't this a problem with the encoding used by Excel 2008 for Mac? I seem to recall that I often had problems with the character encodings from Office 2008 for Mac when trying to open files on other platforms, including Windows.


Alex
Comment 2 Kohei Yoshida 2011-04-29 11:35:50 UTC
Confirmed, but...

Why does the Mac version of MS Excel still use a code page? The Windows version switched to Unicode long ago, with Excel 97. Excel 95 was the last version to use a code page in the default file format on Windows...
Comment 3 lijpbasin 2011-05-29 08:08:52 UTC
Created attachment 47276 [details]
The B and H column of the second row is garbled

This file was created by Microsoft Office Excel 2007 and is displayed correctly in MS Office; the garbled text is Chinese.
I tried changing the LANG environment variable to every Chinese encoding available on my machine:
LANG=zh_CN sbase grade.xls
LANG=zh_CN.gb18030 sbase grade.xls
LANG=zh_CN.gbk sbase grade.xls
LANG=zh_CN.utf-8 sbase grade.xls
None of the commands above worked.
Comment 4 Urmas 2012-01-12 01:54:08 UTC
There is clearly a CODEPAGE 10000 record present in the file. I believe it should override whatever code page (1252) the "Arial" or "Calibri" fonts imply.

lijpbasin, do not hijack other bug reports, please.
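
The override suggested in this comment could look roughly like the sketch below. It is an illustration under stated assumptions, not LibreOffice's actual import code: the mapping table, the default, and the function name `decode_byte_string` are all hypothetical, and only the two code pages mentioned in this report are listed.

```python
from typing import Optional

# Assumed mapping from CODEPAGE record values to Python codec names.
# Only the two code pages named in this bug report are included.
CODEPAGE_TO_CODEC = {
    1252: "cp1252",      # Windows Latin 1
    10000: "mac_roman",  # Apple Roman
}

# Assumed fallback when the file declares no code page (or an unknown one).
DEFAULT_CODEC = "cp1252"


def decode_byte_string(raw: bytes, codepage: Optional[int]) -> str:
    """Decode a byte string, letting a declared code page take precedence
    over the default that would otherwise be assumed."""
    codec = CODEPAGE_TO_CODEC.get(codepage, DEFAULT_CODEC)
    return raw.decode(codec, errors="replace")


# Usage: the same bytes decoded with and without the declared code page.
print(decode_byte_string(b"Pr\x8enom", 10000))
print(decode_byte_string(b"Pr\x8enom", None))
```

With the declared code page honored, the bytes decode to “Prénom”; with the fallback they decode to “PrŽnom”, the garbled form from the description.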
Comment 5 QA Administrators 2014-10-23 17:31:56 UTC Comment hidden (obsolete)
Comment 6 Urmas 2014-10-24 14:31:59 UTC
Still present in 4.4 master.
Comment 7 QA Administrators 2015-12-20 16:16:07 UTC Comment hidden (obsolete)
Comment 8 Joel Madero 2016-07-04 06:59:14 UTC
Version: 5.3.0.0.alpha0+
Build ID: c89294233b6a9ffc1bd75e6e9226ad723b7d5538
CPU Threads: 2; OS Version: Linux 3.16; UI Render: default; 
Locale: en-US (en_US.UTF-8)

Still an issue
Comment 9 QA Administrators 2018-09-25 02:50:51 UTC Comment hidden (obsolete)
Comment 10 paulystefan 2020-03-10 22:05:31 UTC
Workaround: save the document in MSO 2016 (Windows 10, 64-bit) as doc or docx. The problem is then solved on Windows with LO 6.4.1.2 x64 (Windows 10, 64-bit).
Comment 11 paulystefan 2020-03-10 22:05:56 UTC Comment hidden (obsolete)
Comment 12 paulystefan 2020-06-20 18:20:21 UTC
In 7.0.0.0 beta2 (64-bit) on Windows 10 x64, the problem is solved for me with both files.
Comment 13 paulystefan 2020-08-09 23:34:24 UTC
In 7.0.0.3 (64-bit) on Windows 10 x64, the problem is solved for me with both files.
Comment 14 Julien Nabet 2020-09-08 12:36:26 UTC
On a Debian x86-64 PC with the Debian package of LO 7.0.1.2, I can't reproduce this with either file.

Any update with a recent LO version?
Comment 15 QA Administrators 2021-03-08 04:03:03 UTC Comment hidden (obsolete)
Comment 16 QA Administrators 2021-04-08 03:40:52 UTC Comment hidden (obsolete)
Comment 17 Mike Kaganski 2021-12-13 08:45:13 UTC Comment hidden (obsolete)
Comment 18 Mike Kaganski 2021-12-13 08:45:14 UTC Comment hidden (obsolete)