Bug 60673 - FILEOPEN XLSX: Diacritic characters in form button labels are completely lost
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 3.5.0 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Whiteboard: repro:6.0+
Keywords: filter:xlsx, preBibisect, regression
Duplicates: 142164
Depends on:
Blocks: XLSX-Form-Controls
Reported: 2013-02-11 16:41 UTC by Mirosław Zalewski
Modified: 2022-06-30 12:00 UTC
CC List: 6 users

See Also:
Crash report or crash signature:
Regression By:

File as shown in MS Office 2013 (for reference) (37.38 KB, image/png)
2013-02-11 16:41 UTC, Mirosław Zalewski
File as shown in LibreOffice 4.0.0 (37.76 KB, image/png)
2013-02-11 16:42 UTC, Mirosław Zalewski
Testing XLSX file from MS Office 2013 (13.70 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2013-02-11 16:43 UTC, Mirosław Zalewski

Description Mirosław Zalewski 2013-02-11 16:41:48 UTC
Created attachment 74624
File as shown in MS Office 2013 (for reference)

Polish diacritic characters in form button labels are lost when importing an XLSX file created in MS Office 2013.

I am attaching:
- screenshot from MS Office 2013
- screenshot from LO 4.0.0
- test document

After unzipping the XLSX file, I can find the content in xl/drawings/drawing1.xml. The diacritic characters sit there in plain view, encoded in UTF-8. I have no problem seeing them in vim. I can also create a "push button" with a label containing diacritic characters in LibreOffice Calc.
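
The check described above can also be scripted. The following is only a minimal sketch in Python, assuming the labels live in xl/drawings/drawing1.xml as stated in this report; the function name `non_ascii_in_member` and the file name in the usage line are hypothetical and not part of LibreOffice or the attached test document.

```python
import zipfile


def non_ascii_in_member(xlsx, member="xl/drawings/drawing1.xml"):
    """Return the sorted list of distinct non-ASCII characters in one archive member.

    `xlsx` may be a path or a file-like object. The default member path is the
    one named in this report; other workbooks may store drawings elsewhere.
    """
    with zipfile.ZipFile(xlsx) as archive:
        # An XLSX file is a ZIP archive; per the report, the part is UTF-8 encoded.
        text = archive.read(member).decode("utf-8")
    # Keep every distinct character outside the 7-bit ASCII range.
    return sorted({ch for ch in text if ord(ch) > 127})
```

A call such as `non_ascii_in_member("test.xlsx")` (hypothetical file name) lists the characters that are present in the file, which can then be compared with what Calc shows after import.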

I have no idea whether this bug affects only Polish diacritic characters, all non-ASCII characters, or some other subset of Unicode.
This problem is not related to the font because, as you can see, three different typefaces were used.

I can confirm this behavior on LibreOffice 3.6.5, Debian GNU/Linux (amd64). The original screenshot was taken on LO 4.0.0, Windows 8 (arch unknown).

Best regards
Mirosław Zalewski
Comment 1 Mirosław Zalewski 2013-02-11 16:42:25 UTC
Created attachment 74625
File as shown in LibreOffice 4.0.0
Comment 2 Mirosław Zalewski 2013-02-11 16:43:29 UTC
Created attachment 74626
Testing XLSX file from MS Office 2013
Comment 3 Urmas 2013-02-11 23:34:47 UTC
Confirmed with master.
Comment 4 QA Administrators 2015-02-19 15:37:25 UTC Comment hidden (obsolete)
Comment 5 Buovjaga 2015-03-07 13:22:12 UTC

Win 7 Pro 64-bit, LibO Version:
Build ID: 45e2de17089c24a1fa810c8f975a7171ba4cd432
Locale: fi_FI
Comment 6 tommy27 2016-04-16 07:22:48 UTC Comment hidden (obsolete)
Comment 7 QA Administrators 2017-05-22 13:22:12 UTC Comment hidden (obsolete)
Comment 8 Timur 2017-09-08 10:19:27 UTC Comment hidden (obsolete)
Comment 9 QA Administrators 2018-09-09 02:39:48 UTC Comment hidden (obsolete)
Comment 10 Thomas Lendo 2018-10-27 18:15:52 UTC
Still reproducible.

Build ID: 5d2ab49cbda1d7aea1019478abe0163e1f40a121
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: de-DE (de_DE.UTF-8); Calc: threaded
from today
Comment 11 Timur 2019-05-15 13:45:25 UTC
Repro 6.3+. Regression from 3.5.0, worked with 3.4.4.
Comment 12 Aron Budea 2021-08-13 05:20:27 UTC
Already buggy in the oldest of bibisect-43all.
Comment 13 Andreas Heinisch 2022-06-30 10:11:01 UTC
*** Bug 142164 has been marked as a duplicate of this bug. ***
Comment 15 Andreas Heinisch 2022-06-30 11:10:01 UTC
Very strange behaviour in https://opengrok.libreoffice.org/xref/core/sax/source/fastparser/fastparser.cxx?r=d203d3ae#489

When there are Unicode characters in the labels of buttons, text boxes, etc., there is no pContext at this line:
XFastContextHandler * pContext( maContextStack.top().mxContext.get() );