Bug 116453 - Cyrillic characters in XML show as question marks
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 6.1.0.0.alpha0+ (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Kohei Yoshida
URL: https://gitlab.com/orcus/orcus/issues/57
Whiteboard: target:6.2.0
Keywords: bibisected, bisected, regression
Depends on:
Blocks: MSO-XML2003
Reported: 2018-03-17 16:17 UTC by Buovjaga
Modified: 2018-09-07 04:05 UTC

Description Buovjaga 2018-03-17 16:17:02 UTC
Open attachment 93583 from bug 74647: the Cyrillic characters are displayed as ???

Found by eisa01 on macOS

Recent regression; it is not yet present in the latest build in bibisect-win32-6.0

Arch Linux 64-bit
Version: 6.1.0.0.alpha0+
Build ID: 070dbae6b4dc497d6ae898e60203d25b0e608d73
CPU threads: 8; OS: Linux 4.15; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on March 17th 2018

Version: 6.1.0.0.alpha0+ (x64)
Build ID: 2537d6897ae516d3b4d50f0e2885dc24949841bf
CPU threads: 4; OS: Windows 10.0; UI render: default; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2018-03-16_02:34:17
Locale: fi-FI (fi_FI); Calc: group
Comment 1 Buovjaga 2018-03-17 16:19:55 UTC
File has <?xml version="1.0" encoding="windows-1251"?> and Kate recognises it as Cyrillic - cp 1251.
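For illustration only, here is a small Python sketch of why bytes written as windows-1251 lose their Cyrillic letters when a reader ignores the declared encoding. It is not LibreOffice or orcus code, the sample string is made up, and this report does not say at which step the importer produced the question marks.

```python
# Illustration only: not LibreOffice or orcus code.
text = "Привет"                  # made-up sample Cyrillic string
raw = text.encode("cp1251")      # bytes as they would sit in a windows-1251 file

# Decoding with the wrong codec cannot recover the letters;
# invalid sequences become U+FFFD replacement characters.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the declared codec round-trips cleanly.
right = raw.decode("cp1251")

# Forcing Cyrillic into a charset that lacks it yields literal question marks.
question_marks = text.encode("ascii", errors="replace").decode("ascii")

print(repr(wrong), right, question_marks)
```

Either failure mode looks like "???" to the user, depending on how the replacement character is rendered.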
Comment 2 Julien Nabet 2018-03-17 16:45:25 UTC
On pc Debian x86-64 with master sources updated yesterday, I could reproduce this.

I cannot reproduce this with LO Debian package 6.0.2
Comment 3 raal 2018-03-18 14:20:00 UTC
This seems to have begun at the below commit.
Adding Cc: to Kohei Yoshida; could you possibly take a look at this one?
Thanks
 de19941f7613db5dc62e0f0903ad9f523f3d2a16 is the first bad commit
commit de19941f7613db5dc62e0f0903ad9f523f3d2a16
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Mon Dec 18 07:57:01 2017 +0100

    source 152c79ee2be2374334202dc738a8f011e47845c7
    
author: Kohei Yoshida <kohei.yoshida@gmail.com> 2017-12-03 21:25:53 -0500
committer: Kohei Yoshida <libreoffice@kohei.us> 2017-12-18 02:30:39 +0100
commit 152c79ee2be2374334202dc738a8f011e47845c7
tree a9dca320422e3afa66f6ed94d0ef1b0ca5899027
parent 99210a149c859fcd683870b280adaeeffd1250e4
Initial step on enabling the orcus-based Excel 2003 XML filter.
Still some work remains in the orcus interface implementation code
in sc.
Comment 4 Kohei Yoshida 2018-04-10 23:21:49 UTC
The necessary work has been done on the orcus side.  I'll work on the LibreOffice side once I'm ready to integrate the next version of orcus (0.14.0).
Comment 5 Kohei Yoshida 2018-08-15 12:40:54 UTC
I know it's been a while, but I haven't forgotten about this bug.  I'm currently in the process of getting a new version of mdds out the door, then I'll work on releasing a new version of orcus/ixion, after which I'll work on integrating it in LibreOffice proper.
Comment 6 Kohei Yoshida 2018-09-04 22:06:32 UTC
I can take this now.
Comment 7 Commit Notification 2018-09-05 02:07:37 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=5e8fd488f17fe0433cc9b31ace6527fb06ea3bb0

tdf#116453: Pick up non-UTF8 encoding and use it for string values.

It will be available in 6.2.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
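
The idea named in the commit title (pick up a non-UTF-8 encoding and use it for string values) can be sketched in Python as below. This is a minimal sketch under assumptions, not the actual patch: the helper names declared_encoding and decode_document are hypothetical, the sample document is made up, and the real fix lives in the C++ orcus interface code in sc.

```python
import codecs
import re

# Hypothetical helper pattern: captures the name in encoding="..." of an XML declaration.
_DECL_RE = re.compile(rb'''^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z0-9._-]+)["']''')


def declared_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the codec named in the XML declaration, or the default."""
    match = _DECL_RE.match(data)
    if not match:
        return default
    name = match.group(1).decode("ascii")
    try:
        # Normalise aliases such as "windows-1251" to Python's codec name.
        return codecs.lookup(name).name
    except LookupError:
        # Unknown encoding name: fall back rather than fail.
        return default


def decode_document(data: bytes) -> str:
    """Decode the whole document using its declared encoding."""
    return data.decode(declared_encoding(data))


# Made-up sample resembling the declaration quoted in comment 1.
sample = '<?xml version="1.0" encoding="windows-1251"?><Data>Привет</Data>'.encode("cp1251")
decoded = decode_document(sample)
print(declared_encoding(sample), decoded)
```

A real importer would more likely transcode each string value as it is handed over by the parser rather than decode the whole file up front; the sketch only shows that the declared encoding has to reach the point where strings are converted.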
Comment 8 Buovjaga 2018-09-05 10:16:32 UTC
Thanks, verified!

Arch Linux 64-bit
Version: 6.2.0.0.alpha0+
Build ID: 3c86ffd8ded628e6f2b4187948a1b1056f6a0f56
CPU threads: 8; OS: Linux 4.18; UI render: default; VCL: gtk3_kde5; 
Locale: fi-FI (fi_FI.UTF-8); Calc: threaded
Built on September 5th 2018
Comment 9 Commit Notification 2018-09-06 00:19:25 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=7f43f0b50135e147fb2bb1f942da3bf60153fd2c

tdf#116453: One less argument to pass to ScOrcusStyles.

It will be available in 6.2.0.

Comment 10 Commit Notification 2018-09-07 04:02:50 UTC
Kohei Yoshida committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=e55a85228ddb6910cbc045bb2aa06149b6f749bc

tdf#116453: Add a test case for this.

It will be available in 6.2.0.

Comment 11 Kohei Yoshida 2018-09-07 04:05:49 UTC
Now that I've added a test case for this, I hereby conclude my work.