Bug 60673 - FILEOPEN XLSX: Diacritic characters in form button labels are completely lost
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 3.5.0 release
Hardware: All All
Importance: medium normal
Assignee: Caolán McNamara
URL:
Whiteboard: repro:6.0+ target:7.5.0
Keywords: filter:xlsx, preBibisect, regression
Duplicates: 142164
Depends on:
Blocks: XLSX-Form-Controls
Reported: 2013-02-11 16:41 UTC by Mirosław Zalewski
Modified: 2022-10-12 16:54 UTC

See Also:
Crash report or crash signature:


Attachments
File as shown in MS Office 2013 (for reference) (37.38 KB, image/png)
2013-02-11 16:41 UTC, Mirosław Zalewski
File as shown in LibreOffice 4.0.0 (37.76 KB, image/png)
2013-02-11 16:42 UTC, Mirosław Zalewski
Testing XLSX file from MS Office 2013 (13.70 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2013-02-11 16:43 UTC, Mirosław Zalewski
The example file in current nightly (150.51 KB, image/png)
2022-10-12 07:58 UTC, Gabor Kelemen (allotropia)

Description Mirosław Zalewski 2013-02-11 16:41:48 UTC
Created attachment 74624 [details]
File as shown in MS Office 2013 (for reference)

Polish diacritic characters in form button labels are lost when importing an XLSX file created in MS Office 2013.

I am attaching:
- screenshot from MS Office 2013
- screenshot from LO 4.0.0
- test document

After unzipping the XLSX file, I can find the label content in xl/drawings/drawing1.xml. The diacritic characters sit there in plain view, encoded in UTF-8, and I have no problem seeing them in vim. I can also create a "push button" with a label containing diacritic characters directly in LibreOffice Calc.
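
The manual check described above (unzip, then look at the drawing part) can also be scripted. This is only a minimal sketch: the function name `non_ascii_in_drawings` is made up for illustration, the `xl/drawings/` prefix is taken from the path mentioned above, and it assumes every part under that prefix is UTF-8 encoded text.

```python
import zipfile


def non_ascii_in_drawings(xlsx, prefix="xl/drawings/"):
    """Map each part under `prefix` to the sorted non-ASCII characters it contains.

    `xlsx` may be a file path or a binary file-like object.
    Assumption: every part under `prefix` is UTF-8 encoded text.
    """
    found = {}
    with zipfile.ZipFile(xlsx) as archive:
        for name in archive.namelist():
            # Skip parts outside the drawings folder and directory entries.
            if not name.startswith(prefix) or name.endswith("/"):
                continue
            text = archive.read(name).decode("utf-8")
            chars = sorted({ch for ch in text if ord(ch) > 127})
            if chars:
                found[name] = chars
    return found
```

Calling `non_ascii_in_drawings("test.xlsx")` on the attached test document (the file name here is a placeholder) should list the parts in which the diacritics are stored, which helps confirm that the characters are present in the file and are lost only on import.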

I have no idea if this bug involves only Polish diacritic characters, all non-ASCII characters or some other subset of UTF-8.
This problem is not related to the font because, as you can see, three different typefaces were used.

I can confirm this behavior on LibreOffice 3.6.5, Debian GNU/Linux (amd64). Original screenshot was taken on LO 4.0.0, Windows 8 (arch unknown).

Best regards
Mirosław Zalewski
Comment 1 Mirosław Zalewski 2013-02-11 16:42:25 UTC
Created attachment 74625 [details]
File as shown in LibreOffice 4.0.0
Comment 2 Mirosław Zalewski 2013-02-11 16:43:29 UTC
Created attachment 74626 [details]
Testing XLSX file from MS Office 2013
Comment 3 Urmas 2013-02-11 23:34:47 UTC
Confirmed with master.
Comment 4 QA Administrators 2015-02-19 15:37:25 UTC Comment hidden (obsolete)
Comment 5 Buovjaga 2015-03-07 13:22:12 UTC
Confirmed.

Win 7 Pro 64-bit, LibO Version: 4.4.1.2
Build ID: 45e2de17089c24a1fa810c8f975a7171ba4cd432
Locale: fi_FI
Comment 6 tommy27 2016-04-16 07:22:48 UTC Comment hidden (obsolete)
Comment 7 QA Administrators 2017-05-22 13:22:12 UTC Comment hidden (obsolete)
Comment 8 Timur 2017-09-08 10:19:27 UTC Comment hidden (obsolete)
Comment 9 QA Administrators 2018-09-09 02:39:48 UTC Comment hidden (obsolete)
Comment 10 Thomas Lendo 2018-10-27 18:15:52 UTC
Still reproducible.

Version: 6.2.0.0.alpha1+
Build ID: 5d2ab49cbda1d7aea1019478abe0163e1f40a121
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: de-DE (de_DE.UTF-8); Calc: threaded
from today
Comment 11 Timur 2019-05-15 13:45:25 UTC
Repro 6.3+. Regression from 3.5.0, worked with 3.4.4.
Comment 12 Aron Budea 2021-08-13 05:20:27 UTC
Already buggy in the oldest of bibisect-43all.
Comment 13 Andreas Heinisch 2022-06-30 10:11:01 UTC
*** Bug 142164 has been marked as a duplicate of this bug. ***
Comment 15 Andreas Heinisch 2022-06-30 11:10:01 UTC
Very strange behaviour in https://opengrok.libreoffice.org/xref/core/sax/source/fastparser/fastparser.cxx?r=d203d3ae#489

When there are Unicode characters in the labels of buttons, textboxes, etc., there is no pContext: XFastContextHandler * pContext( maContextStack.top().mxContext.get() );
Comment 16 Gabor Kelemen (allotropia) 2022-10-12 07:58:15 UTC
Created attachment 182989 [details]
The example file in current nightly

This looks good now in

Version: 7.5.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 73911ed8d35294a9e15771d8aaa1e9121ef10309
CPU threads: 14; OS: Windows 10.0 Build 19044; UI render: Skia/Raster; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: threaded

presumably after:

https://cgit.freedesktop.org/libreoffice/core/commit/?id=b320ef30977144c52de9b39bc4db0db540727c79

author	Caolán McNamara <caolanm@redhat.com>	2022-10-11 15:10:43 +0100
committer	Caolán McNamara <caolanm@redhat.com>	2022-10-11 20:34:05 +0200

vml whitespace-check mangled Částečně to ste n
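
The commit title can be read as: the whitespace check mangled "Částečně" to "ste n". That result is what you would get if every non-ASCII UTF-8 byte were classified as whitespace and runs of whitespace were collapsed. The sketch below shows one way this could happen, assuming (this is not confirmed anywhere in this report) a check that views each byte as a signed char and compares it against the space character. All function names are hypothetical and this is not the LibreOffice code.

```python
def as_signed_char(byte):
    """Reinterpret an unsigned byte value (0..255) as a signed 8-bit value."""
    return byte - 256 if byte >= 0x80 else byte


def is_space_signed(byte):
    # Hypothetical buggy check: bytes >= 0x80 become negative as signed
    # chars, so they also compare "less than or equal to space".
    return as_signed_char(byte) <= 0x20


def is_space_unsigned(byte):
    # Same comparison on the unsigned value: only ASCII control bytes and
    # the space character qualify.
    return byte <= 0x20


def collapse_whitespace(text, is_space):
    """Collapse whitespace runs to one space and trim both ends, bytewise on UTF-8."""
    out = bytearray()
    pending_space = False
    for byte in text.encode("utf-8"):
        if is_space(byte):
            pending_space = True
            continue
        if pending_space and out:
            out.append(0x20)
        pending_space = False
        out.append(byte)
    return out.decode("utf-8")


print(collapse_whitespace("Částečně", is_space_signed))    # ste n
print(collapse_whitespace("Částečně", is_space_unsigned))  # Částečně
```

With the signed comparison every byte of Č, á, č and ě is dropped as "whitespace", leaving only the ASCII letters; with the unsigned comparison the label survives intact.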
Comment 17 Caolán McNamara 2022-10-12 08:01:12 UTC
backported to 7-4 with https://gerrit.libreoffice.org/c/core/+/141185
Comment 18 Xisco Faulí 2022-10-12 12:14:44 UTC
(In reply to Gabor Kelemen (allotropia) from comment #16)
> presumably after:
> 
> https://cgit.freedesktop.org/libreoffice/core/commit/?id=b320ef30977144c52de9b39bc4db0db540727c79
> 
> author	Caolán McNamara <caolanm@redhat.com>	2022-10-11 15:10:43 +0100
> committer	Caolán McNamara <caolanm@redhat.com>	2022-10-11 20:34:05 +0200
> 
> vml whitespace-check mangled Částečně to ste n

I confirm that the issue reappears if b320ef30977144c52de9b39bc4db0db540727c79 is reverted.
Comment 19 Commit Notification 2022-10-12 16:54:03 UTC
Xisco Fauli committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/369c7d2dc4b941d5b7699b03a3cfc03ad0e3b430

tdf#60673: sc_subsequent_filters_test2: Add unittest

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.