Bug 95706 - FILEOPEN: RTF import doesn't interpret ASCII text encoding with Windows code pages
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: interoperability
Keywords: filter:rtf, needsDevEval
Depends on:
Blocks: RTF-Opening
Reported: 2015-11-09 14:40 UTC by Dženan Zukić
Modified: 2019-04-03 02:56 UTC (History)

Attachments
Problematic RTF file (20.45 KB, application/msword)
2015-11-09 14:40 UTC, Dženan Zukić
Correct rendering by MSWord (105.56 KB, application/pdf)
2015-11-09 14:40 UTC, Dženan Zukić
An incorrect rendering of the file by LibreOffice (27.38 KB, application/pdf)
2015-11-10 00:26 UTC, Dženan Zukić

Description Dženan Zukić 2015-11-09 14:40:16 UTC
Created attachment 120415
Problematic RTF file

Character encoding is wrong in LO. See the attached correct rendering, e.g. the title:
"Izvod po Tekućem računu ..." gets rendered as:
"Izvod po Tekuæem raèunu ...".
Comment 1 Dženan Zukić 2015-11-09 14:40:52 UTC
Created attachment 120416
Correct rendering by MSWord
Comment 2 raal 2015-11-09 20:23:32 UTC
I cannot confirm this with Version: 5.1.0.0.alpha1+
Build ID: c5fefe46fc9dca3942b2fc33ffd1f7e041d450e6
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2015-11-04_07:04:49
The text is correct: Izvod po Tekućem računu
Comment 3 Dženan Zukić 2015-11-10 00:26:24 UTC
Created attachment 120432
An incorrect rendering of the file by LibreOffice

Generated by libo-master-2015-11-09_23.11.30_LibreOfficeDev_5.1.0.0.alpha1_Win_x64.msi
Comment 4 Dženan Zukić 2015-11-10 00:27:30 UTC
I just rechecked, and it is still wrong. I used version libo-master-2015-11-09_23.11.30_LibreOfficeDev_5.1.0.0.alpha1_Win_x64.msi
Comment 5 Buovjaga 2015-11-12 10:22:43 UTC
Reproduced with 5.1 and 3.5.0.
5.0.3 gives a read error and refuses to open the file.

Win 7 Pro 64-bit, Version: 5.0.3.2 (x64)
Build ID: e5f16313668ac592c1bfb310f4390624e3dbfb75
Locale: fi-FI (fi_FI)

Version: 5.1.0.0.alpha1+
Build ID: b216cc1b8096eb60c27f67e8c27b7cd756c75e38
TinderBox: Win-x86@62-merge-TDF, Branch:MASTER, Time: 2015-11-12_00:06:20
Locale: fi-FI (fi_FI)

3.5.0
Comment 6 Timur 2016-02-08 19:18:30 UTC
Looking at other reports, it looked like a Windows-only problem, but I also reproduced it on Linux.
The original font is Tahoma. LO says it's unavailable and substitutes it, although I have it installed on Windows. So I'm not sure whether this is related to Bug 64509.
Comment 7 Urmas 2016-02-09 11:05:01 UTC
No surprise it is displayed wrong:

{\fonttbl{\f1 Tahoma CE}}
Comment 8 Dženan Zukić 2016-02-09 13:44:02 UTC
Shouldn't that be replaced by another CE (Central European) font? Or the glyph mapping transformed onto a Unicode font (regular Tahoma is installed on my OS)?
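
The garbling quoted in the description (ć shown as æ, č shown as è) is consistent with Windows-1250 bytes being read as Windows-1252. The following is a minimal sketch of that relationship using Python's standard codecs. It is an illustration only, not how LibreOffice's RTF import works, and it assumes the file stores the text as single Windows-1250 bytes.

```python
# Title as Word renders it, and as LibreOffice renders it (both from this report).
expected = "Izvod po Tekućem računu"
observed = "Izvod po Tekuæem raèunu"

# Assumption: the RTF file stores the title as Windows-1250 (Central European) bytes.
raw = expected.encode("cp1250")

# Reading the same bytes as Windows-1252 (Western European) gives the garbled title.
misread = raw.decode("cp1252")
assert misread == observed

# The garbled text can be mapped back to Unicode by reversing the wrong decode.
repaired = observed.encode("cp1252").decode("cp1250")
assert repaired == expected

# Only the non-ASCII bytes differ between the two code pages here.
differing = {
    hex(b): (bytes([b]).decode("cp1250"), bytes([b]).decode("cp1252"))
    for b in raw
    if b > 0x7F
}
# differing == {'0xe6': ('ć', 'æ'), '0xe8': ('č', 'è')}
```

So the bytes themselves are intact; only the choice of code page at import time decides whether byte 0xE6 becomes ć or æ.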
Comment 9 Yousuf Philips (jay) (retired) 2016-10-05 04:09:07 UTC
Opening attachment 120415 in Word 2010 and resaving it as RTF doesn't result in this problem.

So the issue seems to boil down to older RTF files having font names that indicate which Windows code page[1] the text is encoded with, and the RTF import not understanding that. So '{\f1 Tahoma CE}' should imply the Central European Windows-1250 code page[2], in which, for example, byte 230 (0xE6, which is æ in Windows-1252) is ć.

Unless LO stores the character mappings of the Windows code pages and also has a routine for conversion, I don't think this is something that could be fixed. @Miklos: Any thoughts?

[1] https://en.wikipedia.org/wiki/Windows_code_page
[2] https://en.wikipedia.org/wiki/Windows-1250
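
The idea in comment 9 (derive the code page from the font-name suffix, then convert the bytes) can be sketched as follows. This is a hedged illustration, not LibreOffice code: the suffix table is an assumption based on the legacy Windows font naming convention, the function names are hypothetical, and the sample text run is made up to match the title in this report rather than copied from the attachment. It relies only on Python's built-in code page codecs.

```python
import re

# Assumed mapping from legacy font-name suffixes to Python codec names.
# Only " CE" appears in this bug report; the other entries are illustrative.
SUFFIX_TO_CODEC = {
    " CE": "cp1250",      # Central European
    " Cyr": "cp1251",     # Cyrillic
    " Greek": "cp1253",
    " Tur": "cp1254",     # Turkish
    " Baltic": "cp1257",
}


def split_font_name(name):
    """Return (base_name, codec); codec is None if no known suffix matches."""
    for suffix, codec in SUFFIX_TO_CODEC.items():
        if name.lower().endswith(suffix.lower()):
            return name[: -len(suffix)], codec
    return name, None


def decode_rtf_hex_escapes(text, codec):
    """Replace RTF \\'hh escapes with the character that byte has in `codec`."""
    return re.sub(
        r"\\'([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]).decode(codec),
        text,
    )


base, codec = split_font_name("Tahoma CE")   # ("Tahoma", "cp1250")

# Hypothetical RTF text run using \'hh escapes for the two non-ASCII letters.
sample = r"Teku\'e6em ra\'e8unu"

as_cp1250 = decode_rtf_hex_escapes(sample, codec)      # "Tekućem računu"
as_cp1252 = decode_rtf_hex_escapes(sample, "cp1252")   # "Tekuæem raèunu"
```

Decoding one byte at a time is only valid for single-byte code pages such as these, and a byte that is undefined in the chosen code page would raise `UnicodeDecodeError`, so a real importer would need a fallback.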
Comment 10 QA Administrators 2019-04-03 02:56:47 UTC
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug