Bug 79384 - FILEOPEN: can't open RTF "error file format in (129.7) column string" with Shift JIS encoded text
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.5.0 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Miklos Vajna
URL:
Whiteboard: rtf_import target:4.4.0 target:4.3.0....
Keywords: filter:rtf, notBibisectable, regression
Depends on:
Blocks:
 
Reported: 2014-05-28 18:45 UTC by cda2023
Modified: 2015-12-18 11:03 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
File that does not open (21.63 KB, application/rtf)
2014-05-28 18:45 UTC, cda2023

Description cda2023 2014-05-28 18:45:34 UTC
Created attachment 100044
File that does not open

LibreOffice does not open this RTF file (error message "error file format in (129.7) column string"), while OpenOffice opens the same file normally.
Microsoft Word also reports an error when opening it, and WordPad opens it as if the file were empty.
Comment 1 Cor Nouws 2014-05-28 19:05:00 UTC
hi cda,

thanks for reporting. I can confirm that this file does not open in 4.3.0beta1.
It gives an error at position 129,7 (row, column).

I didn't try older versions.
Cheers,
Cor
Comment 2 tommy27 2014-05-28 19:10:22 UTC
confirmed under Win7 x64 using LibO 4.3beta1, 4.2.4.2 and older releases back to 3.5.0.

the file loads with 3.4.6, hence a 3.5.x regression.
Comment 3 Joel Madero 2014-05-28 19:16:48 UTC
Seems to me like the file itself has some problems if 3 of 4 office suites have issues opening it?
Comment 4 tommy27 2014-05-29 04:30:21 UTC
Word Viewer can't open that file.
WordPad loads it as an empty page.
But AOO 4.1.0 loads it with no trouble, as does LibO 3.4.6.
Comment 5 Xisco Faulí 2014-05-29 16:20:31 UTC
Can't be bibisected with 43all. Out of Range
Comment 6 Michael Stahl (CIB) 2014-06-02 16:49:26 UTC
the problem is this text, which is encoded in Shift JIS:

  {\*\cs35\snext35\hich\af5\dbch\af5\loch\f5 Mp{u y{p;}

(there are also some 8-bit characters that don't paste well into firefox)

the "{" bytes in there are apparently read as group delimiters, which causes the failure.

not sure if un-encoded 8-bit chars are allowed in RTF.
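
To illustrate the failure mode described above, here is a minimal Python sketch, not the LibreOffice code. It assumes a hypothetical style entry containing one Shift JIS character (katakana BO) whose trail byte has the same value as ASCII "{". The helper names and the sample bytes are made up for illustration, and the scanner deliberately ignores RTF control words and escapes such as \{.

```python
def is_sjis_lead(byte: int) -> bool:
    """True if byte is in a Shift JIS lead-byte range (0x81-0x9F or 0xE0-0xFC)."""
    return 0x81 <= byte <= 0x9F or 0xE0 <= byte <= 0xFC


def depth_bytewise(data: bytes) -> int:
    """Count group depth treating every 0x7B / 0x7D byte as a brace."""
    depth = 0
    for byte in data:
        if byte == 0x7B:
            depth += 1
        elif byte == 0x7D:
            depth -= 1
    return depth


def depth_sjis_aware(data: bytes) -> int:
    """Same count, but skip the trail byte that follows a Shift JIS lead byte."""
    depth = 0
    i = 0
    while i < len(data):
        byte = data[i]
        if is_sjis_lead(byte) and i + 1 < len(data):
            i += 2  # lead + trail form one character; the trail is not markup
            continue
        if byte == 0x7B:
            depth += 1
        elif byte == 0x7D:
            depth -= 1
        i += 1
    return depth


# Hypothetical style entry: ASCII markup around one Shift JIS character
# whose trail byte is 0x7B, the same value as ASCII "{".
bo = "\u30dc".encode("shift_jis")  # KATAKANA LETTER BO
group = b"{\\*\\cs35 " + bo + b";}"

print(bo.hex())                 # bytes of the encoded character
print(depth_bytewise(group))    # non-zero: the group looks unbalanced
print(depth_sjis_aware(group))  # zero: the group is balanced
```

A byte-wise reader sees one more "{" than "}" in this group, which matches the kind of unbalanced-group error reported here; a reader that knows the text is double-byte does not.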
Comment 7 Miklos Vajna 2014-06-02 18:07:36 UTC
The root cause is that a { without a closing } is invalid inside a style name, but it turns out the OOo RTF import supported that (Word does not). Let me add a workaround for this; the real solution, needless to say, is to fix the generator application.
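
As a rough illustration of the difference between rejecting and tolerating such a style name, the following Python sketch contrasts a strict and a lenient reader. It is an assumption-laden toy, not the committed patch: both function names are hypothetical, and the entry string is the (already garbled) text quoted in comment 6.

```python
def style_name_strict(entry: str) -> str:
    """Return the name before ';', rejecting unbalanced braces (hypothetical)."""
    name, sep, _ = entry.partition(";")
    if not sep:
        raise ValueError("style name is not terminated by ';'")
    if name.count("{") != name.count("}"):
        raise ValueError("unbalanced brace in style name")
    return name


def style_name_lenient(entry: str) -> str:
    """Return the name before ';', taking any braces literally (hypothetical)."""
    name, sep, _ = entry.partition(";")
    if not sep:
        raise ValueError("style name is not terminated by ';'")
    return name


entry = "Mp{u y{p;"  # text as quoted in comment 6, 8-bit characters lost

print(style_name_lenient(entry))
try:
    style_name_strict(entry)
except ValueError as err:
    print("strict reader:", err)
```

The lenient variant simply stops at the terminating ";" and keeps the braces as part of the name, which is one plausible way to accept input that is invalid but was traditionally supported.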
Comment 8 Commit Notification 2014-06-02 18:30:00 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6092e2eba3f74c9632f7862b2368b0fcf7732f85

fdo#79384 RTF import: allow { without } in style names



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 9 Commit Notification 2014-06-02 20:20:45 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-4-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b7b3035ff14ba78ee193a0d5e83c9b028bf06b84&h=libreoffice-4-3

fdo#79384 RTF import: allow { without } in style names


It will be available in LibreOffice 4.3.

Comment 10 Michael Stahl (CIB) 2014-06-02 22:13:22 UTC
it's somewhat likely that the document is invalid:
Shift JIS is a double-byte encoding in any case, and the reason why
it's invalid is the immediately preceding \loch.

however if that \loch is replaced with \dbch, then
Word can read it; this is not only the case for a style
name, but also for body text, so the commit in comment #8
is not sufficient.

i've got a prototype Shift-JIS fix that seems to work,
will push it tomorrow.
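
The effect of the preceding control word can be sketched in Python as follows. This assumes, per my reading of the RTF specification, that \loch announces single-byte text and \dbch announces double-byte text. The sample bytes and the choice of cp1252 as the single-byte code page are assumptions for illustration only; the actual font charset in the attached file is not stated in this report.

```python
def interpret(raw: bytes, double_byte: bool) -> str:
    """Decode raw text bytes as Shift JIS (double-byte) or cp1252 (single-byte).

    cp1252 is an arbitrary stand-in for "some single-byte code page".
    """
    return raw.decode("shift_jis" if double_byte else "cp1252")


raw = b"\x83\x7b"  # hypothetical two-byte run from the RTF text

after_dbch = interpret(raw, double_byte=True)   # treated as one character
after_loch = interpret(raw, double_byte=False)  # treated as two characters

print(len(after_dbch), len(after_loch))
print(after_loch[1] == "{")  # the second byte surfaces as a literal brace
```

Read as double-byte text, the two bytes form a single character and no brace appears; read as single-byte text, the second byte is an ordinary "{", which is why the \loch in front of the Shift JIS text matters.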
Comment 11 Miklos Vajna 2014-06-03 07:01:51 UTC
I suggest handling the dbch-related problem in a separate bug: that would be a valid RTF document, while my commit is a workaround to restore support for something that is invalid but was traditionally supported by OOo.
Comment 12 Miklos Vajna 2014-06-03 07:07:45 UTC
-4-2 review: https://gerrit.libreoffice.org/9625
Comment 13 Commit Notification 2014-06-03 08:29:19 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-4-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4cd095db39f5631379245c171e82e373833b2e11&h=libreoffice-4-2

fdo#79384 RTF import: allow { without } in style names


It will be available in LibreOffice 4.2.6.

Comment 14 Commit Notification 2014-06-03 18:56:26 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=061190a62fcdbfb3a0b266d5afffbd257a3e692e

fdo#79384: RTF import: fix literal Shift-JIS text



Comment 15 Commit Notification 2014-06-03 18:56:41 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=d71387ca81b61416b9a7b82cd6cf67d496b81fc2

fdo#79384: replace the work-around with a different one



Comment 16 Commit Notification 2014-06-03 19:46:13 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8cd856d9705fbcd61ad902859769fc98bf6d7a69&h=libreoffice-4-3

fdo#79384: RTF import: fix literal Shift-JIS text


It will be available in LibreOffice 4.3.

Comment 17 Commit Notification 2014-06-03 19:46:29 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1063b8e8c122a819e844a2209a7136a8a9be31fd&h=libreoffice-4-3

fdo#79384: replace the work-around with a different one


It will be available in LibreOffice 4.3.

Comment 18 Commit Notification 2014-06-04 07:36:28 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=d15eb9f09c8854bd58fecd3dc6a31fa678e392a1&h=libreoffice-4-2

fdo#79384: RTF import: fix literal Shift-JIS text


It will be available in LibreOffice 4.2.6.

Comment 19 Commit Notification 2014-06-04 07:38:07 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=9428815bb8c4cf3bd67e8ac40781a59fc2161756&h=libreoffice-4-2

fdo#79384: replace the work-around with a different one


It will be available in LibreOffice 4.2.6.

Comment 20 Robinson Tryon (qubit) 2015-12-17 10:57:00 UTC
Migrating Whiteboard tags to Keywords: (NotBibisectable)
[NinjaEdit]
Comment 21 Robinson Tryon (qubit) 2015-12-18 11:03:16 UTC
Migrating Whiteboard tags to Keywords: ()
Add filter:rtf.
[NinjaEdit]