Bug 90421 - mal-formed RTF link output causes seeming data deletion
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.2.7.2 release
Hardware: x86 (IA32) All
Importance: medium major
Assignee: Miklos Vajna
URL:
Whiteboard: target:5.0.0 target:4.4.4
Keywords: bibisected, bisected, regression
Depends on:
Blocks:
 
Reported: 2015-04-02 20:09 UTC by Kev
Modified: 2021-02-20 11:38 UTC (History)
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
long document with link on the first page (91.69 KB, application/rtf)
2015-04-02 20:09 UTC, Kev

Description Kev 2015-04-02 20:09:55 UTC
Created attachment 114569
long document with link on the first page

LibreOffice 4.2.7.2 build ID 420m0(Build:2) on Linux Mint 17, 32-bit.

Reproducible bug in Writer.  With URL recognition enabled in Tools->AutoCorrect Options, create a file with two paragraphs.  Between them, type "http://testlink/ " without the quotes, but with the trailing space, so that the link is recognized.  Save in RTF format.  When the resulting file is reopened in LibreOffice, nothing that came after the link is shown.  In Wine WordPad on wine-1.6.2 the result is similar, except that even the link is not shown.  Fortunately, the data IS actually still there, and if you manually correct the RTF in a text editor, you can see it again in LibreOffice or WordPad.

The RTF being generated for the link seems to be mal-formed, with many extra closing braces in the middle:

\par \pard\plain \s0\nowidctlpar{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\cf0\kerning1\dbch\af5\langfe2052\dbch\af6\afs24\alang1081\loch\f3\fs24\lang1033{{\field{\*\fldinst HYPERLINK "http://testlink/" }{\fldrslt {\cf2\ul\ulc0\langfe255\alang255\lang255\rtlch \ltrch\loch
http://testlink/}{}}}}}}{\field{\*\fldinst HYPERLINK }{\fldrslt {\rtlch \ltrch\loch
 }
\par \pard\plain \s0\nowidctlpar{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\cf0\kerning1\dbch\af5\langfe2052\dbch\af6\afs24\alang1081\loch\f3\fs24\lang1033\rtlch \ltrch\loch
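
One way to confirm the imbalance without reading the RTF by eye is to track group depth while scanning the text. The following is a minimal Python sketch, assuming the fragment is plain text (it does not handle \bin data); the function name scan_rtf_braces is made up for illustration. A reader typically stops once the outermost group closes, so an early underflow would be consistent with the rest of the document appearing to be missing.

```python
def scan_rtf_braces(rtf):
    """Return (final_depth, first_underflow_index) for an RTF fragment.

    first_underflow_index is the offset of the first '}' that closes more
    groups than were opened within the fragment, or None if that never happens.
    """
    depth = 0
    first_underflow = None
    i = 0
    while i < len(rtf):
        ch = rtf[i]
        if ch == "\\":
            # Skip the character after a backslash so that the escaped
            # literals \{ \} and \\ are not counted as group delimiters.
            i += 2
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0 and first_underflow is None:
                first_underflow = i
        i += 1
    return depth, first_underflow


if __name__ == "__main__":
    # Hypothetical, shortened fragment with one closing brace too many.
    print(scan_rtf_braces("{a}}{b"))
```

A fragment can end at depth 0 and still be broken: extra closing braces in the middle may be offset by missing ones later, which is why the sketch reports the first underflow position as well as the final depth.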

I'm marking this as 'major' only because of the apparent effect: after typing a long document and saving several backup copies, I then closed and later reopened them only to find, in this particular case, that most of the document appeared to be missing, in every backup I made, too.  Not a pleasant surprise if you were someone or their grandpa, who might not know to look at the RTF in a text editor to troubleshoot.

Thanks,
Kev
Comment 1 A (Andy) 2015-04-02 20:25:00 UTC
Reproducible with LO 4.4.1.2, Win 8.1

Steps Done:
1. Open WRITER
2. Write "Test" in the first line and make a line break
3. Write "http://testlink/ " in the second line and make a line break
4. Write "Test 2" in the third line
5. Save it as an RTF file and close it
6. Reopen it again

Result: "Test 2" is no longer visible.
Comment 2 Matthew Francis 2015-04-04 10:58:35 UTC
Works as expected on Linux / LO 3.5

-> Keywords: regression
Comment 3 Matthew Francis 2015-04-09 05:08:19 UTC
This seems to have changed at the below commit.
Adding Cc: to mstahl@redhat.com; Could you possibly take a look at this? Thanks

Following the instructions in comment 1, two things are significant: that the space after the link seems to be required, and that this no longer occurs after the same file has been round-tripped through ODF (i.e. it only occurs for me when typed into a new file and saved directly as RTF)


    commit b8907bf3d3b37c686a414ffbbd2d732348aab5b9
    Author:     Michael Stahl <mstahl@redhat.com>
    AuthorDate: Fri Jun 27 16:02:45 2014 +0200
    Commit:     Michael Stahl <mstahl@redhat.com>
    CommitDate: Fri Jun 27 16:15:19 2014 +0200
    
        fdo#78758: sw: RTF export: don't export multiple \fldrst for one hyperlink
    
        Ensure that we export only one \fldresult per hyperlink by doing that in
        StartURL() and EndURL(); the TextINetFormat() is called once per text
        portion.  This shouldn't cause problems as there can't be anything
        between the end of the \field group and the start of \fldresult anyway.
    
        Replace the annoying call to EndURL() from EndRun() with a special case
        in EndURL() to store things in the right buffer (hopefully).
    
        (somehow this is regression from c4498251cb7181a9f272b0720f398597c0daef09)
    
        Change-Id: I209ea7a384fb1cb5d1505a70ecc4a4536bbf26a2
Comment 4 Matthew Francis 2015-04-09 05:11:46 UTC
Second attempt - bugzilla claimed to have run out of memory on the first try

Added Cc: to mstahl@redhat.com - could you possibly take a look at this one? Thanks
Comment 5 Miklos Vajna 2015-04-21 14:33:55 UTC
I think the root cause is fe444d1f74abe417962be0bcd3340f40f2446b58 (fdo#62536: sw: fix AutoCorrect bold/underline on existing AUTOFMT, 2013-06-20), which makes the autocorrect create two hyperlink hints on the text node when you type "http:/s/ ". Seeing that the ODF export simply ignores hyperlinks where the URL is empty, probably the best is to let the RTF export do the same; that will fix this bug, too.
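
The idea of skipping hyperlinks with an empty URL can be illustrated with a small Python sketch. This is not the LibreOffice code (the real exporter is C++); HyperlinkHint, rtf_escape and export_run are hypothetical names, and the assumption that the second hint covers only the trailing space is inferred from the RTF in the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HyperlinkHint:
    # Hypothetical stand-in for a hyperlink attribute on a text node.
    start: int
    end: int
    url: str


def rtf_escape(text: str) -> str:
    # Escape the three characters that are special in RTF text.
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def export_run(text: str, hints: List[HyperlinkHint]) -> str:
    out = []
    pos = 0
    for hint in sorted(hints, key=lambda h: h.start):
        # Ignore hyperlinks without a URL (and, for simplicity, overlapping
        # ones); their text is still written as plain text further down.
        if not hint.url or hint.start < pos:
            continue
        out.append(rtf_escape(text[pos:hint.start]))
        out.append(
            '{\\field{\\*\\fldinst HYPERLINK "' + rtf_escape(hint.url) + '" }'
            + "{\\fldrslt {" + rtf_escape(text[hint.start:hint.end]) + "}}}"
        )
        pos = hint.end
    out.append(rtf_escape(text[pos:]))
    return "".join(out)


if __name__ == "__main__":
    hints = [
        HyperlinkHint(0, 16, "http://testlink/"),
        HyperlinkHint(16, 17, ""),  # assumed empty-URL hint on the space
    ]
    print(export_run("http://testlink/ ", hints))
```

Because the empty hint never opens a \field group, there is nothing to close for it, and the braces around the remaining link stay balanced.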
Comment 6 Commit Notification 2015-04-21 16:44:30 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=7d42346ba77c9c4df241ea40eaf550993ca18783

tdf#90421 RTF export: ignore hyperlinks without an URL

It will be available in 5.0.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 7 Commit Notification 2015-05-27 09:31:27 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=dbf24ea9aa010fe51da8d580a1403c8ecd9f0b04&h=libreoffice-4-4

tdf#90421 RTF export: ignore hyperlinks without an URL

It will be available in 4.4.4.

Comment 8 Robinson Tryon (qubit) 2015-12-17 08:51:29 UTC
Migrating Whiteboard tags to Keywords: (bibisected)
[NinjaEdit]