Bug 84266 - FILEOPEN: xls from mac, bad encoding
Status: RESOLVED INVALID
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 4.2.6.2 release
Hardware: Other Linux (All)
Importance: medium minor
Assignee: Not Assigned
Reported: 2014-09-24 07:24 UTC by Puggan SE
Modified: 2015-05-06 14:23 UTC


Attachments
One column of the file (7.50 KB, text/csv)
2014-09-24 21:50 UTC, Puggan SE

Description Puggan SE 2014-09-24 07:24:23 UTC
When opening .xls files sent by a Mac user, non-ASCII characters such as åäö are not displayed as åäö, and the character that follows each of them is swallowed.

For example, cells that should say "Göteborg" turn up as "G쉞eborg".

I have tried selecting both the Excel-95-xls and the Excel-97-xls option when opening the file, with the same result.
Comment 1 Puggan SE 2014-09-24 09:21:03 UTC
When running the "file" command on the files, I get:
Composite Document File V2 Document, Little Endian, Os: MacOS, Version 10.3, Code page: 10000, Last Saved By: _____, Name of Creating Application: Microsoft Macintosh Excel, Create Time/Date: Tue Aug 12 16:09:53 2014, Last Saved Time/Date: Thu Aug 14 15:14:20 2014, Security: 0
Comment 2 Puggan SE 2014-09-24 11:24:41 UTC
I asked my contact to send an xls file with just 6 words.
That file works fine, and the "file" command says it is the same type.
So maybe small files (6 rows) work, but bigger files (5000+ rows) don't.

I can't figure out what conversion is applied to the text.

The text "Göteborg"
in UTF-8: 47 * c3 b6 * 74 65 62 6f 72 67
in Latin1: 47 * f6 * 74 65 62 6f 72 67
in mac/CP10000: 47 * 9a * 74 65 62 6f 72 67

The text shown in LibreOffice where I expect "Göteborg":
"G쉞eborg": 47 * ec 89 9e * 65 62 6f 72 67 0a

Where did the "t" (0x74) go, and how does "ö" end up as "EC 89 9E"?
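
The byte values above can be reproduced with a short Python sketch (not part of the original report). It also decodes the observed bytes EC 89 9E as UTF-8 and tests one hypothesis, suggested by the iconv pipeline in comment 5: that the Mac Roman byte pair 9A 74 ("öt") was read as a single CP949 double-byte character. The codec names are Python's own; nothing here is taken from LibreOffice's import code, and the helper name hexdump is made up for illustration.

```python
# Sketch: reproduce the hex dumps for "Göteborg" and test the CP949 hypothesis.
word = "Göteborg"


def hexdump(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex pairs."""
    return " ".join(f"{b:02x}" for b in data)


# Encode the word in the three encodings listed in the comment.
encodings = {
    name: hexdump(word.encode(name))
    for name in ("utf-8", "latin-1", "mac_roman")
}
for name, dump in encodings.items():
    print(f"{name:10} {dump}")

# The three bytes seen in LibreOffice's output form one UTF-8 character.
observed = bytes.fromhex("ec899e").decode("utf-8")
print(f"observed code point: U+{ord(observed):04X}")

# Hypothesis (an assumption, not confirmed in this report): the Mac Roman
# bytes 9a 74 ("ö" followed by "t") were decoded as one CP949 character.
hypothesis = bytes.fromhex("9a74").decode("cp949")
print("matches observed character:", hypothesis == observed)
```

If the two characters match, that would account for both symptoms at once: the "t" is consumed as the trail byte of a double-byte sequence, and the pair shows up as a single Hangul syllable.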
Comment 3 Julien Nabet 2014-09-24 20:42:32 UTC
Could you attach a file that triggers the problem?
(Keep in mind that attachments are automatically made public, so the file shouldn't contain anything private or confidential.)
Comment 4 Puggan SE 2014-09-24 21:44:45 UTC
The files I have contain customer contact information, so I'm not allowed to make those lists public.

I tried to get my source to generate a smaller list that could be made public, but that file didn't have the problem. I don't think I can get more help from that source.

I'll try to dig up some other source, but most of my friends don't use a Mac and don't use Excel.
Comment 5 Puggan SE 2014-09-24 21:50:33 UTC
Created attachment 106817
One column of the file

After opening the file, I removed all but one column and saved it as CSV.
I then piped it through "sort -u" to remove duplicate rows.

Most of the post cities, but not all, look OK after piping through: "iconv -f utf8 -t CP949 | iconv -f macintosh -t utf8"
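
For readers without iconv at hand, the same two-step repair can be sketched in Python: re-encode the garbled text as CP949 to recover the raw bytes, then decode those bytes as Mac Roman. This is only an illustration of the pipeline above, assuming Python's cp949 and mac_roman codecs behave like iconv's CP949 and macintosh; the function name undo_mojibake is hypothetical.

```python
def undo_mojibake(text: str) -> str:
    """Reverse a Mac Roman -> CP949 misdecoding, mirroring the iconv pipeline.

    Text that cannot be encoded as CP949 is returned unchanged.
    """
    try:
        raw = text.encode("cp949")       # back to the bytes stored in the file
    except UnicodeError:
        return text                      # not representable in CP949: leave as-is
    return raw.decode("mac_roman")       # read the bytes as Mac Roman instead


# Build a garbled sample the same way the bug is suspected to arise.
garbled = "Göteborg".encode("mac_roman").decode("cp949")
print(garbled, "->", undo_mojibake(garbled))
```

Pure ASCII rows pass through unchanged, because ASCII bytes mean the same thing in both code pages. Rows whose bytes did not form valid CP949 sequences during the original misdecoding may not be recoverable this way, which would be consistent with "most, but not all" rows looking OK.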
Comment 6 Julien Nabet 2014-09-29 20:10:23 UTC
On a PC running Debian x86-64 with master sources updated 3 days ago, I could reproduce this.
But when opening the file with Vi (a Linux editor), I also get some "Asian" characters.

Could you ask your Mac user for an xls (not csv) file containing just "Göteborg"?
Also, what is the UI language of this user?
Comment 7 Puggan SE 2014-09-29 21:31:17 UTC
I asked for an xls file with 6 post cities, and opening that file worked fine. It is only the big files, containing 5000+ rows, that produce these odd characters.

The CSV above is what LibreOffice saved after opening one of the "evil" files.

I guess his UI is Swedish.
Comment 8 QA Administrators 2015-04-01 14:47:08 UTC
Dear Bug Submitter,

This bug has been in NEEDINFO status with no change for at least
6 months. Please provide the requested information as soon as
possible and mark the bug as UNCONFIRMED. Due to regular bug
tracker maintenance, if the bug is still in NEEDINFO status with
no change in 30 days the QA team will close the bug as INVALID
due to lack of needed information.

For more information about our NEEDINFO policy please read the
wiki located here:
https://wiki.documentfoundation.org/QA/Bugzilla/Fields/Status/NEEDINFO

If you have already provided the requested information, please
mark the bug as UNCONFIRMED so that the QA team knows that the
bug is ready to be confirmed.
 
Thank you for helping us make LibreOffice even better for everyone!


Warm Regards,
QA Team
Comment 9 Puggan SE 2015-04-01 18:57:39 UTC
No more information can be provided: I have no access to the computer that produces the files, and the files I have contain sensitive information and should not be shared.
Comment 10 Julien Nabet 2015-04-01 18:59:31 UTC
(In reply to Puggan SE from comment #9)
> More info can't be produced, as I have no access to the computer making the
> files, and the files I have contain sensitive information, and should not be
> shared.

This link may help:
https://wiki.documentfoundation.org/QA/Bugzilla/Sanitizing_Files_Before_Submission
Comment 11 QA Administrators 2015-05-06 14:21:27 UTC
Dear Bug Submitter,

Please read this message in its entirety before proceeding.

Your bug report is being closed as INVALID due to inactivity and
a lack of information which is needed in order to accurately
reproduce and confirm the problem. We encourage you to retest
your bug against the latest release. If the issue is still
present in the latest stable release, we need the following
information (please ignore any that you've already provided):

a) Provide details of your system including your operating
   system and the latest version of LibreOffice that you have
   confirmed the bug to be present

b) Provide easy to reproduce steps – the simpler the better

c) Provide any test case(s) which will help us confirm the problem

d) Provide screenshots of the problem if you think it might help

e) Read all comments and provide any requested information

Once all of this is done, please set the bug back to UNCONFIRMED
and we will attempt to reproduce the issue. Please do not:

a) respond via email 

b) update the version field in the bug or any of the other details
   on the top section of our bug tracker

-- The LibreOffice QA Team 

This INVALID Message was generated on: 2015-05-06

Warm Regards,
QA Team