Bug 89405 - FILESAVE: export to MS Word 97-2003 (DOC) corrupts comment order
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 4.3.0.0.beta1 (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Pieter van Oostrum
URL:
Whiteboard: target:4.5.0
Keywords:
Depends on:
Blocks:
 
Reported: 2015-02-16 00:39 UTC by Pieter van Oostrum
Modified: 2016-02-10 22:40 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Original ODT file (29.46 KB, application/vnd.oasis.opendocument.text)
2015-02-16 00:48 UTC, Pieter van Oostrum
Saved DOC file that is corrupted (13.50 KB, application/msword)
2015-02-16 00:49 UTC, Pieter van Oostrum
Patch for this problem (8.83 KB, patch)
2015-03-05 13:22 UTC, Pieter van Oostrum
Comprehensive ODT document (10.11 KB, application/vnd.oasis.opendocument.text)
2015-03-12 13:00 UTC, Pieter van Oostrum

Description Pieter van Oostrum 2015-02-16 00:39:26 UTC

    
Comment 1 Pieter van Oostrum 2015-02-16 00:46:38 UTC
Saving some Writer files to MS Word 97-2003 (DOC) format mangles the comments. See the attached ODT file, where the comments are in this order:

Original comment 1 + Reply to this
Comment 2
Comment 3
Comment 4
Comment 5
Comment 6
Comment 7

In the saved DOC file, when opened in MS Word, the order is:

Original comment 1 + Reply to this
Comment 3
Comment 4
Comment 2
Another copy of Original comment 1

The others have disappeared, and Comment 2 is attached to the wrong text.

Actually, the comments are still in the file, because when the DOC file is opened in LibreOffice, it looks OK. So there is a problem in the export code that is compensated for by the same error in the import code.

When saving this file as DOCX or RTF it opens correctly in MS Word.
Comment 2 Pieter van Oostrum 2015-02-16 00:48:15 UTC
Created attachment 113414
Original ODT file
Comment 3 Pieter van Oostrum 2015-02-16 00:49:03 UTC
Created attachment 113415
Saved DOC file that is corrupted
Comment 4 Pieter van Oostrum 2015-02-16 15:47:42 UTC
I found that Apache OpenOffice 4.0.1 does it right, both for the file I attached to this bug report and for my original, much bigger file.
Comment 5 Alex Thurgood 2015-02-25 15:04:29 UTC
Confirming with the test file on Word for Mac 2011.
Comment 6 Alex Thurgood 2015-02-25 15:09:33 UTC
Confirming that AOO 4.1.1 exports the comments to DOC in the correct order.
Comment 7 Pieter van Oostrum 2015-02-25 22:16:59 UTC
Comments can be attached to a range of text or to a single position in the text.

I found that the comments get mangled when there is a mixture of comments on ranges and comments on positions. When all comments in a document are on ranges, or all are on positions, they are saved properly.

AOO does not support comments on ranges, which is why AOO doesn't have this bug.

I compared the AOO code with the LibreOffice code (it is in sw/source/filter/ww8/wrtw8sty.cxx) and found that the LibreOffice code constructs a wrong data structure in the mixed case mentioned above. I have not yet found the proper way to build it; for that I have to study the specification of the Word binary format some more. One option would be to treat a position as a range with the same begin and end points. I have tried that and it solves the problem, but I think it also changes the semantics. Therefore I am currently looking for a better solution.
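
To illustrate the mixed case described above, here is a minimal toy model in Python. It is not the code from wrtw8sty.cxx and not the Word binary layout; the names Comment, export_starts_only_for_ranges, export_points_as_empty_ranges and read_by_index are all hypothetical. It only assumes that a reader pairs the i-th start with the i-th end, and shows how recording starts for range comments only shifts every later comment, while treating a position as an empty range keeps the lists aligned.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Comment:
    text: str
    end: int                     # anchor position (character offset)
    start: Optional[int] = None  # None means the comment sits on a single position


def export_starts_only_for_ranges(comments: List[Comment]):
    """Hypothetical faulty export: a start is written only for range comments."""
    starts = [c.start for c in comments if c.start is not None]
    ends = [c.end for c in comments]
    texts = [c.text for c in comments]
    return starts, ends, texts


def export_points_as_empty_ranges(comments: List[Comment]):
    """Workaround from the comment above: a position becomes a range with start == end."""
    starts = [c.end if c.start is None else c.start for c in comments]
    ends = [c.end for c in comments]
    texts = [c.text for c in comments]
    return starts, ends, texts


def read_by_index(starts, ends, texts) -> List[Tuple[int, int, str]]:
    """Assumed reader behaviour: pair the i-th start with the i-th end and text."""
    return list(zip(starts, ends, texts))


# One range comment, one position comment, one range comment.
comments = [
    Comment("A", end=10, start=5),
    Comment("B", end=20),
    Comment("C", end=40, start=30),
]

mangled = read_by_index(*export_starts_only_for_ranges(comments))
aligned = read_by_index(*export_points_as_empty_ranges(comments))
print(mangled)  # [(5, 10, 'A'), (30, 20, 'B')]  B gets C's start, C is dropped
print(aligned)  # [(5, 10, 'A'), (20, 20, 'B'), (30, 40, 'C')]
```

In this model a document with only range comments, or only position comments, exports consistently either way, which matches the observation that only the mixed case is affected.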
Comment 8 Pieter van Oostrum 2015-03-05 13:22:26 UTC
Created attachment 113911
Patch for this problem

Here is a patch for this problem. It solves the problem and also some other problems with exporting comments to a .doc (MS Word 97-2003) file that I encountered during testing.
I have tested this patch in LO 4.1.1.2.
Comment 9 Pieter van Oostrum 2015-03-05 14:56:39 UTC
That should have been:
(In reply to Piet van Oostrum from comment #8)

> I have tested this patch in LO 4.1.1.2.

That should have been LO 4.4.1.2
Comment 10 Pieter van Oostrum 2015-03-12 12:57:11 UTC
Summary of the problems this patch solves:

1) When there is a mixture of comments on ranges and comments on a point, the comments get mixed up.
2) When there are nested and overlapping comments, the begin and end points can get mixed up.
3) When two or more comments are on ranges that have the same starting point (e.g. when one is a reply to another one), only one keeps its starting point and the others keep only their end points.
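
Cases 2 and 3 can be pictured with another small Python sketch. Again this is only a toy model under stated assumptions, not the attached patch or the LibreOffice code; starts_keyed_by_position, events_keyed_by_comment and rebuild are hypothetical names. It assumes that the bookkeeping has one start slot per text position (which loses a start when two comments begin at the same point) and contrasts that with start/end markers that carry the identity of their comment.

```python
from typing import Dict, List, Tuple

Range = Tuple[str, int, int]  # (comment name, start, end)


def starts_keyed_by_position(ranges: List[Range]) -> Dict[int, str]:
    """Hypothetical lossy bookkeeping: one start slot per text position."""
    slots: Dict[int, str] = {}
    for name, start, _end in ranges:
        slots[start] = name  # a later comment at the same start overwrites the earlier one
    return slots


def events_keyed_by_comment(ranges: List[Range]) -> List[Tuple[int, str, str]]:
    """Emit one start and one end marker per comment, ordered by text position."""
    events = []
    for name, start, end in ranges:
        events.append((start, "start", name))
        events.append((end, "end", name))
    # sorted() is stable, so a comment's start stays ahead of its own end
    # even when both sit at the same position.
    return sorted(events, key=lambda e: e[0])


def rebuild(events: List[Tuple[int, str, str]]) -> List[Range]:
    """Pair each end marker with the start marker of the same comment."""
    open_at: Dict[str, int] = {}
    out: List[Range] = []
    for pos, kind, name in events:
        if kind == "start":
            open_at[name] = pos
        else:
            out.append((name, open_at.pop(name), pos))
    return out


ranges = [
    ("original", 5, 20),
    ("reply", 5, 20),     # same starting point as "original"
    ("nested", 8, 12),    # nested inside "original"
    ("overlap", 15, 30),  # overlaps the end of "original"
    ("point", 25, 25),    # comment on a single position
]

print(starts_keyed_by_position(ranges))  # "original" has lost its start slot
print(sorted(rebuild(events_keyed_by_comment(ranges))) == sorted(ranges))  # True
```

Because each marker names its comment, nested, overlapping and same-start ranges all survive the round trip in this model.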
Comment 11 Pieter van Oostrum 2015-03-12 13:00:40 UTC
Created attachment 114053
Comprehensive ODT document

This file contains all three cases that cause problems. The problems are only visible if the file is saved as .doc and then viewed in MS Word.
Comment 12 A (Andy) 2015-03-22 23:26:36 UTC
Buggy behaviour also reproducible with LO 4.4.1.2, Win 8.1

"Original comment 1" is shown twice as a comment box
"Comment 2" from Piet is missing
"Comment 3" and "Comment 5" seem to be interchanged
"Original comment 1" also seems to cover almost the whole text from paragraph 2 to the last paragraph
"Comment 6" and "Comment 7" from Piet are missing
Comment 13 Commit Notification 2015-03-24 10:11:35 UTC
Piet van Oostrum committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=5e49b9b4e99f787071a624dadd3e587ea6b041a7

tdf#89405 DOC export: fix corrupted comment order

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 14 Timur 2015-04-03 17:25:34 UTC
Thank you for writing about the possibility of backporting to 4.3 and 4.4.
Comment 15 Joel Madero 2016-02-10 22:40:01 UTC
Both 4.4 and 4.3 are EOL; removing the request.