Bug 94232 - FILEOPEN: XLSX - Data labels with comma separator imported as semicolon
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Chart
Version: Inherited From OOo (earliest affected)
Hardware: Other All
Importance: low minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:xlsx
Depends on:
Blocks: OOXML-Chart Chart-Labels
Reported: 2015-09-15 08:13 UTC by Yousuf Philips (jay) (retired)
Modified: 2019-02-22 13:28 UTC

See Also:
Crash report or crash signature:


Attachments
sample (12.31 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2015-09-15 08:13 UTC, Yousuf Philips (jay) (retired)
screenshot (112.70 KB, image/png)
2015-09-15 09:55 UTC, Yousuf Philips (jay) (retired)
screenshot from excel 2013 (48.95 KB, image/png)
2015-09-16 02:08 UTC, Yousuf Philips (jay) (retired)
printscreen from excel2010 (17.02 KB, image/png)
2015-09-16 05:11 UTC, raal
Label formating Window (64.95 KB, image/png)
2017-09-02 12:43 UTC, Jacques Guilleron

Description Yousuf Philips (jay) (retired) 2015-09-15 08:13:58 UTC
Created attachment 118730
sample

Steps:
1) Open attached file
2) Notice that the second bar has a label 'Yousuf; 2', but it is supposed to be 'Yousuf, 2'.

Version: 5.1.0.0.alpha1+
Build ID: eb2e1ab4651350bffc53f618961a910bd3bbcfd9
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2015-09-02_23:57:52
Locale: en-US (en_US.UTF-8)
Comment 1 Jacques Guilleron 2015-09-15 09:39:38 UTC
Hi Yousuf,

I get the semicolon when importing the file in Calc, but also when opening it in Excel 2010. Changing the separator to a comma there gives a comma after import in Calc.
Comment 2 Yousuf Philips (jay) (retired) 2015-09-15 09:55:08 UTC
Created attachment 118735
screenshot

Hi Jacques,

This file was created with excel 2010.
Comment 3 raal 2015-09-15 11:13:06 UTC
Excel 2010 and LO 5.0.1.2: in both the label is 'Yousuf; 2' => is the problem in Excel 2013?
Comment 4 Yousuf Philips (jay) (retired) 2015-09-16 02:08:45 UTC
Created attachment 118756
screenshot from excel 2013
Comment 5 raal 2015-09-16 05:11:04 UTC
Created attachment 118758
printscreen from excel2010
Comment 6 Buovjaga 2015-10-10 12:42:23 UTC
Excel 2013 displays Yousuf; 2 for me (same as LibO).

Win 8.1 32-bit
MSO 2013
LibO Version: 5.0.3.1
Build ID: fd8cfc22f7f58033351fcb8a83b92acbadb0749e
Locale: fi-FI (fi_FI)
Comment 7 Yousuf Philips (jay) (retired) 2015-10-10 19:06:44 UTC
Marcus stated on IRC "it is not a bug, the file does not specify the separator so the application selects one (most likely based on the UI language)". But when I asked him whether LO does the same and selects an appropriate separator based on the UI language, he said he didn't know.
Comment 8 Robinson Tryon (qubit) 2016-08-05 22:29:01 UTC
(In reply to Yousuf (Jay) Philips from comment #7)
> Marcus stated on IRC "it is not a bug, the file does not specify the
> separator so the application selects one (most likely based on the UI
> language)".

Ok, so NOTABUG

> But when I asked him whether LO does the same and selects an
> appropriate separator based on the UI language, he said he didn't know.

Got an appropriate test to use to confirm this one way or another?

(I'll toss in NEEDINFO for now)
Comment 9 Yousuf Philips (jay) (retired) 2016-08-06 05:31:31 UTC
@Eike: Can you give your input on this so it can be closed as NOTABUG if possible?
Comment 10 Yousuf Philips (jay) (retired) 2016-08-06 05:31:57 UTC Comment hidden (obsolete)
Comment 11 Eike Rathke 2016-08-08 14:37:54 UTC
Citing Ecma OOXML:
21.2.2.166
separator (Separator)
This element specifies text that shall be used to separate the parts of a data label. The default is a comma,
except for pie charts showing only category name and percentage, when a line break shall be used instead.

But apparently Excel doesn't follow its own standard and uses a semicolon instead, or even locale dependent? Loading the file in Excel2010 with English locale I get semicolon as well.

However, if the value is absent we do the same and use "; ". We did not implement anything locale-specific. It's also not clear to me what separator that would be; it might be the array/matrix row separator. But the Chart model doesn't know about that anyway when reading the OOXML DrawingML part.

This ranks in a very low "fix if someone wants to scratch an itch until s/he bleeds" importance, researching/implementing this wouldn't be trivial and the effort per gain ratio for me just isn't worth it.
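
To illustrate the fallback described in this comment, here is a minimal Python sketch. It is not the LibreOffice import code. It assumes the separator is read from a `<c:separator>` child of each `<c:dLbls>` block and that "; " is substituted when the element is absent, as stated above. The names `label_separators`, `IMPORT_FALLBACK` and `SAMPLE` are hypothetical, and the sample XML is a hand-written fragment, not the attached file.

```python
import xml.etree.ElementTree as ET

# DrawingML chart namespace as written by Transitional OOXML files.
C_NS = "http://schemas.openxmlformats.org/drawingml/2006/chart"

# Fallback reported in comment 11 for a missing <c:separator> (assumption of this sketch).
IMPORT_FALLBACK = "; "

# Hypothetical two-series chart part: only the second series sets a separator.
SAMPLE = """<c:chartSpace xmlns:c="http://schemas.openxmlformats.org/drawingml/2006/chart">
  <c:chart><c:plotArea><c:barChart>
    <c:ser><c:dLbls><c:showVal val="1"/><c:showCatName val="1"/></c:dLbls></c:ser>
    <c:ser><c:dLbls><c:showVal val="1"/><c:showCatName val="1"/><c:separator>, </c:separator></c:dLbls></c:ser>
  </c:barChart></c:plotArea></c:chart>
</c:chartSpace>"""


def label_separators(chart_xml, fallback=IMPORT_FALLBACK):
    """Return the effective separator of each <c:dLbls> block, in document order."""
    root = ET.fromstring(chart_xml)
    separators = []
    for dlbls in root.iter("{%s}dLbls" % C_NS):
        # Only a direct child counts; per-label <c:dLbl> overrides are ignored here.
        sep = dlbls.find("{%s}separator" % C_NS)
        if sep is not None and sep.text is not None:
            separators.append(sep.text)
        else:
            separators.append(fallback)
    return separators


if __name__ == "__main__":
    print(label_separators(SAMPLE))
    print(label_separators(SAMPLE, fallback=", "))
```

In the sample, the first series has no separator element and therefore gets the fallback, while the second keeps its explicit ", ". Passing `fallback=", "` shows what following the default quoted from the specification would look like instead.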
Comment 12 Eike Rathke 2016-08-08 15:03:28 UTC
Or we just take what OOXML specifies and use comma.
Period. ;-)
Comment 13 Christian Lohmaier 2016-08-08 15:21:50 UTC Comment hidden (no-value)
Comment 14 Yousuf Philips (jay) (retired) 2016-08-09 06:46:25 UTC
@Jacques, @raal: What OS locale did you have on Windows when you tested? You can use `systeminfo` to check this.

I am using 'en-us;English (United States)'.

(In reply to Eike Rathke from comment #11)
> Citing Ecma OOXML:
> 21.2.2.166
> separator (Separator)
> This element specifies text that shall be used to separate the parts of a
> data label. The default is a comma,
> except for pie charts showing only category name and percentage, when a line
> break shall be used instead.

Could there be a difference between the ecma, transitional and strict about this? By the way, how can one determine which OOXML version a document is saved in?

> But apparently Excel doesn't follow its own standard and uses a semicolon
> instead, or even locale dependent? Loading the file in Excel2010 with
> English locale I get semicolon as well.

Was it english US locale?

(In reply to Eike Rathke from comment #12)
> Or we just take what OOXML specifies and use comma.
> Period. ;-)

As everyone else but me sees it as a semicolon in Excel, it's likely better to leave it that way. :D
Comment 15 Eike Rathke 2016-08-10 11:41:18 UTC
(In reply to Yousuf (Jay) Philips from comment #14)
> Could there be a difference between the ecma, transitional and strict about
> this?
Yes of course it could, but I didn't investigate.

> By the way, how can one determine which OOXML version a document is saved in?
Probably in the namespaces and markup compatibility, but I don't know the details offhand. Anyway, MS Office only started to write OOXML Strict with version 2013; everything else is Transitional, and the default is still Transitional.

> Was it english US locale?
Yes.

> (In reply to Eike Rathke from comment #12)
> > Or we just take what OOXML specifies and use comma.
> > Period. ;-)
> 
> As everyone else but me sees it as a semicolon in Excel, it's likely better
> to leave it that way. :D
Likely. Unless someone wants to get hands dirty and implement the Excel behavior.
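
On the question of how to tell which OOXML flavour a file uses, a rough check along the lines of the "namespaces" hint could look like the Python sketch below. It assumes the workbook part sits at the conventional path `xl/workbook.xml` (strictly, the path should be resolved through the package relationships) and compares the root element's namespace against the Transitional and Strict SpreadsheetML namespaces. The function names `flavour_from_workbook_xml` and `xlsx_flavour` are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespaces of the two conformance classes.
TRANSITIONAL_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
STRICT_NS = "http://purl.oclc.org/ooxml/spreadsheetml/main"


def flavour_from_workbook_xml(workbook_xml):
    """Classify a workbook part by the namespace of its root element."""
    tag = ET.fromstring(workbook_xml).tag
    # ElementTree spells qualified names as "{namespace}localname".
    namespace = tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
    if namespace == STRICT_NS:
        return "strict"
    if namespace == TRANSITIONAL_NS:
        return "transitional"
    return "unknown"


def xlsx_flavour(path):
    """Open an .xlsx package and classify its workbook part."""
    # Assumption: the workbook part is at the conventional path xl/workbook.xml.
    with zipfile.ZipFile(path) as package:
        return flavour_from_workbook_xml(package.read("xl/workbook.xml"))
```

Calling `xlsx_flavour("sample.xlsx")` on a local copy of the attachment (the file name is a placeholder) would report which namespace the workbook uses. This only inspects the workbook part and does not look at the chart parts themselves.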
Comment 16 QA Administrators 2017-09-01 11:16:28 UTC Comment hidden (obsolete)
Comment 17 Jacques Guilleron 2017-09-02 12:39:14 UTC
Hi all,

By digging a little, I found that the separator can be chosen for each label. Labels are not required to be consistent with each other, even if we tend to expect that.
I attach a screenshot of the settings window.
Comment 18 Jacques Guilleron 2017-09-02 12:43:29 UTC
Created attachment 135965
Label formating Window
Comment 19 Balázs Varga 2019-02-22 13:28:39 UTC
Probably these patches could help here. But I also think this is not a real bug.

The attached file (sample), opened with MS Office 2010/2016 and LO master, appears with a semicolon separator.

Balazs Varga committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/de73efb96fbb1d268caea0f41acbe20a234ec59f%5E%21

tdf#122226 OOXML Chart Import: data label new line separator

It will be available in 6.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.

and

Balazs Varga committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/42fd10b0ab6c6f65ba6394f9ae216c0f13973221%5E%21

Related: tdf#122226 OOXML Chart Import data label separator

It will be available in 6.3.0.
