Bug 104707 - FILESAVE DOCX: URL in comment is not saved to docx format
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:6.1.0
Keywords: dataLoss, filter:docx
Duplicates: 108308, 113750
Depends on:
Blocks: Hyperlink DOCX-Comments
 
Reported: 2016-12-16 09:32 UTC by Gabor Kelemen
Modified: 2019-08-02 08:23 UTC (History)
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
Test document with URL in comment (5.97 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-12-16 09:32 UTC, Gabor Kelemen
Screenshot of the document in LO 5.1.4 and Word 2013, and odt and doc versions in LO (216.06 KB, image/png)
2016-12-16 09:32 UTC, Gabor Kelemen
Original document in odt (17.37 KB, application/vnd.oasis.opendocument.text)
2016-12-16 09:33 UTC, Gabor Kelemen
The same document saved in doc (13.50 KB, application/msword)
2016-12-16 09:34 UTC, Gabor Kelemen
URLinComment2003.docx: MS Word 2003 created test with URL in comment (11.31 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2018-01-04 15:30 UTC, Justin L

Description Gabor Kelemen 2016-12-16 09:32:13 UTC
Created attachment 129682 [details]
Test document with URL in comment

I created a simple document containing a comment with a URL in it.

When the document is saved to DOCX and reopened, the URL is missing from the comment in both LO 5.1.4 and Word 2013.

When saved to DOC or ODT, the URL is still present after reopening in LO.
Comment 1 Gabor Kelemen 2016-12-16 09:32:55 UTC
Created attachment 129683 [details]
Screenshot of the document in LO 5.1.4 and Word 2013, and odt and doc versions in LO
Comment 2 Gabor Kelemen 2016-12-16 09:33:20 UTC
Created attachment 129684 [details]
Original document in odt
Comment 3 Gabor Kelemen 2016-12-16 09:34:29 UTC
Created attachment 129685 [details]
The same document saved in doc
Comment 4 Xisco Faulí 2016-12-16 12:57:39 UTC
Confirmed in

- Version: 5.4.0.0.alpha0+
Build ID: 634589b340316ba64b731b4d923c1056be415494
CPU Threads: 4; OS Version: Linux 4.8; UI Render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group

the import part was implemented in this commit:

author	Oliver-Rainer Wittmann <orw@apache.org>	2013-12-19 18:50:58 (GMT)
committer	Miklos Vajna <vmiklos@collabora.co.uk>	2014-01-08 14:58:35 (GMT)
commit	0761f81643a6890457e9ef7d913ab5c88c2593a4 (patch)
tree	91bf122795dfac3f9263942ab3c5dee2b4ecea26
parent	df002e39f7518036ae1c1d2afec7a525ef902327 (diff)
123792: complete annotations on text ranges feature
- rely annotations on text ranges on new annotation marks
- support arbitrary text ranges for annotations
- fix undo/redo regarding annotations an text ranges
- support annotations on overlapping text ranges
- fix *.docx import for annotations on overlapping text ranges
- fix ODF import of annotations on text ranges
Comment 5 stragu 2017-02-10 13:25:11 UTC
Confirmed with the following:

Version: 5.2.5.1
Build ID: 1:5.2.5~rc1-0ubuntu1~trusty0
CPU Threads: 2; OS Version: Linux 3.13; UI Render: default; VCL: gtk2; 
Locale: en-GB (en_GB.UTF-8); Calc: group

Note that the URL only disappears from the docx if it is formatted as a hyperlink. If it is plain text, it stays. The hyperlink formatting is applied automatically as soon as you continue editing after the URL (type a space, press Return, etc.), and a URL formatted this way is lost when the document is saved as docx.
Comment 6 Xisco Faulí 2017-09-13 18:09:44 UTC
Removing the 'bibisected' keyword: this is not a regression.
Comment 7 Gabor Kelemen 2017-11-13 12:05:19 UTC
*** Bug 108308 has been marked as a duplicate of this bug. ***
Comment 8 Yousuf Philips (jay) (retired) 2017-11-14 13:24:47 UTC
From bug 108308 comment 4.

So the problem is that the hyperlink text isn't being exported into the <w:t> tag. How LO exports it:

<w:hyperlink r:id="rId1">
  <w:r>
    <w:rPr>
      <...>
    </w:rPr>
  </w:r>
</w:hyperlink>

How MS exports it:

<w:hyperlink r:id="rId1" w:history="1">
  <w:r>
    <w:rPr>
      <...>
    </w:rPr>
    <w:t>AAAA</w:t> <!-- line that is missing in LO export -->
  </w:r>
</w:hyperlink>

Justin: thoughts?
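
The difference between the two exports above can be checked mechanically. The following is a minimal sketch, not part of LibreOffice or of any patch on this bug: it parses a WordprocessingML fragment and reports every w:hyperlink that contains no non-empty w:t. The function name hyperlinks_without_text, the wrap helper, and the two sample strings are hypothetical, and the elided <...> run properties from the listings are replaced with an empty <w:rPr/>.

```python
import xml.etree.ElementTree as ET

# Standard OOXML namespaces for the w: and r: prefixes.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def hyperlinks_without_text(xml_text):
    """Return the r:id of every w:hyperlink that has no non-empty w:t."""
    root = ET.fromstring(xml_text)
    missing = []
    for link in root.iter(f"{{{W_NS}}}hyperlink"):
        # Concatenate the text of all w:t descendants of this hyperlink.
        text = "".join(t.text or "" for t in link.iter(f"{{{W_NS}}}t"))
        if not text:
            missing.append(link.get(f"{{{R_NS}}}id"))
    return missing


def wrap(fragment):
    # Hypothetical helper: adds a root element that declares the namespaces.
    return f'<w:comments xmlns:w="{W_NS}" xmlns:r="{R_NS}">{fragment}</w:comments>'


# Simplified versions of the two listings above.
LO_EXPORT = wrap('<w:hyperlink r:id="rId1"><w:r><w:rPr/></w:r></w:hyperlink>')
MS_EXPORT = wrap(
    '<w:hyperlink r:id="rId1" w:history="1">'
    "<w:r><w:rPr/><w:t>AAAA</w:t></w:r>"
    "</w:hyperlink>"
)

print(hyperlinks_without_text(LO_EXPORT))  # ['rId1']
print(hyperlinks_without_text(MS_EXPORT))  # []
```

To run the same check against a real file, the comment markup could be read with zipfile.ZipFile(path).read("word/comments.xml") and passed to the function.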
Comment 9 Justin L 2018-01-03 11:32:37 UTC
(In reply to Yousuf Philips (jay) from comment #8)
> Justin: thoughts?

DocxAttributeOutput::RawText (this=0x2816000) at /persistent/git/libreoffice/sw/source/filter/ww8/docxattributeoutput.cxx:2504
{
    SAL_INFO("sw.ww8", "TODO DocxAttributeOutput::RawText( const String& rText, bool bForceUnicode, rtl_TextEncoding eCharSet )" );
}
Comment 10 Justin L 2018-01-04 15:30:54 UTC
Created attachment 138876 [details]
URLinComment2003.docx: MS Word 2003 created test with URL in comment

This .docx shows that even after fixing the missing URL, LO will not import it as a clickable URL.  Instead, it really will be just raw text.

From what I can tell from following the .doc import code, this should import as nWhich 51 (RES_TXTATR_INETFMT). But in .docx, the attempt to apply the "HyperLinkURL" property is causing an exception in PopFieldContext(), so the hyperlink info is lost on import.

Anyway, I have something hacky that works for exporting.
proposed RawText implementation: https://gerrit.libreoffice.org/47380
Comment 11 Commit Notification 2018-01-05 08:06:28 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=94fc02ddbdd5aaef701af9963f74050aed75468d

tdf#104707 ooxmlexport: support RawText

It will be available in 6.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Justin L 2018-01-05 17:54:26 UTC
(In reply to Justin L from comment #10)
> This .docx shows that even after fixing the missing URL, LO will not import
> it as a clickable URL.  Instead, it really will be just raw text.

Created separate bug 114854 for this, since I failed to solve it.
Comment 13 Justin L 2018-01-24 13:38:33 UTC
Some other documents that use RawText differently (in shapes) are:
- linktest.odt (attachment 54029 [details]) from bug 43431
- The-Secrets--Website-Analytics-Brochure(v3w)-bugreport.odt (attachment 42596 [details]) from bug 33596
Comment 14 Commit Notification 2018-01-28 13:33:53 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c0fc7910f1bfb7159f6fd7022dfd838bfb66b624

tdf#104707 ooxmlexport: support RawText in textboxes

It will be available in 6.1.0.

Comment 15 Gabor Kelemen 2019-08-02 08:23:38 UTC
*** Bug 113750 has been marked as a duplicate of this bug. ***