Bug 101000 - FILESAVE: Images disappear when DOC saved to ODT
Summary: FILESAVE: Images disappear when DOC saved to ODT
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 5.1.0.3 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Caolán McNamara
URL:
Whiteboard: target:5.4.0 target:5.3.1 target:5.2.7
Keywords: bibisected, bisected, regression
Duplicates: 102508
Depends on:
Blocks:
 
Reported: 2016-07-19 05:17 UTC by Paul
Modified: 2017-02-16 21:39 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
Simplified source doc which triggers the problem (87.71 KB, application/x-zip-compressed)
2016-07-19 05:17 UTC, Paul
Patch for proposed fix (1.10 KB, patch)
2017-02-15 12:53 UTC, Alex

Description Paul 2016-07-19 05:17:52 UTC
Created attachment 126292 [details]
Simplified source doc which triggers the problem

The simplified DOC in the attached zip opens in LibreOffice and looks fine. When it is saved as ODT and the ODT is then reopened, several images are not visible (no trace at all). If the saved ODT file is opened with LibreOffice 5.0.2.2 or earlier, the images are visible.

This is not a problem in release 5.0.2.2 but is present in 5.1.0.3. It is still a problem with the current release, 5.1.4.2. The attachment contains screenshots showing what the document looks like when saved and then reopened in these versions of LibreOffice.

The release notes between 5.0.2.2 and 5.1.0.3 refer to changes related to image naming, which appears to be relevant here. The failing document imports images with names like "Picture 4", and several images have the same name (e.g. draw:name="Picture 4"). If the content.xml of the saved ODT is edited so that each image has a unique name, they all appear when opened in LibreOffice.
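
To check whether a saved ODT is affected, the duplicate names can be listed with a short script. This is only a rough sketch: it assumes the images are stored as draw:frame elements in content.xml (the usual ODF layout), and the function names are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# ODF drawing namespace, i.e. the "draw:" prefix used in content.xml.
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"


def duplicate_frame_names(content_xml):
    """Return {name: count} for each draw:name shared by more than one draw:frame."""
    root = ET.fromstring(content_xml)
    name_attr = "{%s}name" % DRAW_NS
    counts = Counter(
        frame.get(name_attr)
        for frame in root.iter("{%s}frame" % DRAW_NS)
        if frame.get(name_attr) is not None
    )
    return {name: n for name, n in counts.items() if n > 1}


def duplicate_frame_names_in_odt(path):
    """Read content.xml out of the ODT (a zip archive) and report duplicate names."""
    with zipfile.ZipFile(path) as odt:
        return duplicate_frame_names(odt.read("content.xml"))
```

Calling duplicate_frame_names_in_odt("saved.odt") (file name hypothetical) returns an empty dict when every frame name is unique. For a document like the attached one, it would be expected to list the repeated picture name.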

Related: The source doc was created as follows:
 a) an image was copied from another DOC file using Word into the source DOC
 b) the copied image is then duplicated (copy paste) in Word to create 4 copies.
Comment 1 Paul 2016-07-19 09:28:14 UTC
5.0.6.3 (current "Still" release) does not have the problem either.
Comment 2 MM 2016-07-19 20:09:36 UTC
Confirmed with v5.2.0.2 under ubuntu 16.04 x64.
Unconfirmed with v5.0.6.3 under mint 17.3 x64.

Importance should be set a bit higher, as this involves data loss in a native format.
Comment 3 MM 2016-07-19 20:35:37 UTC
After looking at the data in both of my saved files, I realized that it's not a data loss issue at all. All the data is there, and in newer versions maybe even a bit more. I think the problem is in the import, as I can load the file saved with 5.2 in v5.0.6 without a problem, but not vice versa.
Comment 4 Paul 2016-09-13 03:31:01 UTC
The "still" release (5.1.5.2) that is now the current release has this problem.
This regression in the still version is unfortunate.
Comment 5 Xisco Faulí 2016-09-13 08:21:38 UTC
Adding keyword 'bibisectRequest'
Comment 6 raal 2016-09-23 14:30:08 UTC
This seems to have begun at the below commit.
Adding Cc: to Caolán; could you possibly take a look at this one?
Thanks

author    Caolán McNamara <caolanm@redhat.com>    2015-11-11 13:34:43 (GMT)
committer    Caolán McNamara <caolanm@redhat.com>    2015-11-19 09:31:00 (GMT)
commit    de0432a9256188c7b5cd1a83858311e68c890ebf (patch)
tree    0f71c62f403b1fe81915ce336a8ad9be7e8446df
parent    526bbbbd2f8eb227bc0dacd755a6c72511adf976 (diff)
Incredible slowness and crashes with document with vast num of frame dups
 36819a8a88302af3a9b955eb7190aad1ef571a38 is the first bad commit
commit 36819a8a88302af3a9b955eb7190aad1ef571a38
Author: Norbert Thiebaud <nthiebaud@gmail.com>
Date:   Thu Nov 19 06:59:42 2015 -0800

    source de0432a9256188c7b5cd1a83858311e68c890ebf
git bisect log
# bad: [05d11632892a322664fb52bac90b2598b7fb7544] source 5616d22b57a9a5e57d545e912e029162a230829b
# good: [c1efd324c6ad448ac9edb030dc9738b9e6899e4d] source ab465b90f6c6da5595393a0ba73f33a1e71a2b65
git bisect start 'origin/master' 'oldest'
# good: [97526ab777da7e58ce283c05498262ecdd4d6f7f] source 4ea70f87f7a2b61eda6e5ab1f48debf6fcfadc1f
git bisect good 97526ab777da7e58ce283c05498262ecdd4d6f7f
# good: [86fee7ded76d9c2756ccab6aef160a2d7fab0ab6] source 1b62841b1859ae3443e2bf1ebe99ec3d6afb6cc2
git bisect good 86fee7ded76d9c2756ccab6aef160a2d7fab0ab6
# good: [11864a7db429a57aeea021e0b3f1fb1412282d32] source e5b721a14c1c8e5261a70588b30353cbb5bd55c6
git bisect good 11864a7db429a57aeea021e0b3f1fb1412282d32
# good: [7d52a87c0aa24498584ec522705cfae3a3a5a038] source 479df22d0b4b0e0393fcf621e7380b38415bcef8
git bisect good 7d52a87c0aa24498584ec522705cfae3a3a5a038
# bad: [bea538a879f50238f4c9c6f05e3d7390db9d76c7] source 7289a140fc68dc898ba2b2357cc960968195f236
git bisect bad bea538a879f50238f4c9c6f05e3d7390db9d76c7
# bad: [ad146f48b7f50d159d5b96f1c118cdb8412a98b8] source 91cbbb7797f048834b51690e9fab60aa778b1e44
git bisect bad ad146f48b7f50d159d5b96f1c118cdb8412a98b8
# bad: [e19c6163b0c6f5c6618cefd870a31522957fb620] source ff522704109078a0cde844c74d608137b7c70f42
git bisect bad e19c6163b0c6f5c6618cefd870a31522957fb620
# bad: [cb343d9d1273b0981a35f9629a32aace88bc0609] source bb2ee8c2b550186e48ca5f069dcf8a9d69d65729
git bisect bad cb343d9d1273b0981a35f9629a32aace88bc0609
# good: [dee525d2053607f9651075bead2f32ca32d6c40b] source 526bbbbd2f8eb227bc0dacd755a6c72511adf976
git bisect good dee525d2053607f9651075bead2f32ca32d6c40b
# bad: [47b0f459600b21f862998451a23173b53ace80c8] source 8311c6ed4970d22e7a6459fa7ed2779560e5e11d
git bisect bad 47b0f459600b21f862998451a23173b53ace80c8
# bad: [f224c781b4257dcfcb431d5928f8d54d0b2fdf78] source c94cf0cf5f10edb45a74a58c95c306b0d271645b
git bisect bad f224c781b4257dcfcb431d5928f8d54d0b2fdf78
# bad: [179ce594887adf39f57a3d04abe4c7df0c0a7a85] source 5319def848e855068512f0f895086ff7a1f9e44f
git bisect bad 179ce594887adf39f57a3d04abe4c7df0c0a7a85
# bad: [36819a8a88302af3a9b955eb7190aad1ef571a38] source de0432a9256188c7b5cd1a83858311e68c890ebf
git bisect bad 36819a8a88302af3a9b955eb7190aad1ef571a38
# first bad commit: [36819a8a88302af3a9b955eb7190aad1ef571a38] source de0432a9256188c7b5cd1a83858311e68c890ebf
Comment 7 Xisco Faulí 2016-09-25 20:54:32 UTC
*** Bug 102508 has been marked as a duplicate of this bug. ***
Comment 8 Xisco Faulí 2016-09-26 14:53:47 UTC
Adding Cc: to Caolán McNamara
Comment 9 mk01 2016-11-07 11:45:12 UTC
See also test data in bug 102508 (that was marked as duplicate to this).
Comment 10 Alex 2017-02-15 12:51:44 UTC
I did some digging and testing, and the bit of code that relates to the bisected commit is actually useless. In fact, if that block of code were working as expected, the image elements in the attached document would be renamed as:

- Picture 0
- Picture 01
- Picture 02
- Picture 03

But they end up being named as:

- Picture 0
- Image2
- Image3
- Image4

Removing the rename logic altogether from the XMLTextFrameContext_Impl::Create method in xmloff/source/text/XMLTextFrameContext.cxx preserves the same behaviour without causing the regression.
Also, by not attempting the rename at all, it addresses the original performance concern behind the commit that introduced the regression.
Comment 11 Alex 2017-02-15 12:53:50 UTC
Created attachment 131245 [details]
Patch for proposed fix

Patch that removes the renaming logic
Comment 12 Xisco Faulí 2017-02-15 13:02:53 UTC
(In reply to Alex from comment #11)
> Created attachment 131245 [details]
> Patch for proposed fix
> 
> Patch that removes the renaming logic

Hi Alex,
Could you please upload the patch to gerrit? https://wiki.documentfoundation.org/Development/gerrit/SubmitPatch
Comment 13 Caolán McNamara 2017-02-15 14:29:02 UTC
If you load the .doc in Writer and then go to the Navigator, you'll see four "Picture 0" elements listed under Images. If you rename one to "thing" and then try to rename another to "thing", LibreOffice won't let you. So it still looks to me as though these names need to be unique per category, and that the .doc importer should enforce unique graphic names.
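
As an illustration of what "enforce unique graphic names" could mean at import time, here is a minimal sketch. It is not the code from the actual fix: the function name and the " (n)" suffix scheme are assumptions made for the example. A per-name counter avoids rescanning from 1 for every duplicate, which matters for documents with a very large number of identically named frames.

```python
def make_unique_names(names):
    """Return the names in order, with repeats given a numeric suffix so all are distinct."""
    used = set()        # every name handed out so far
    next_suffix = {}    # base name -> next suffix number to try
    result = []
    for name in names:
        candidate = name
        n = next_suffix.get(name, 1)
        # Keep the original name if it is free, otherwise try "name (n)".
        while candidate in used:
            candidate = "%s (%d)" % (name, n)
            n += 1
        next_suffix[name] = n
        used.add(candidate)
        result.append(candidate)
    return result
```

With four images all called "Picture 0", the first keeps its name and the other three become "Picture 0 (1)", "Picture 0 (2)" and "Picture 0 (3)".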
Comment 14 Commit Notification 2017-02-15 15:05:13 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=432f605e3287269d1a20383f4eeebf012ee3679d

Resolves: tdf#101000 ensure unique image names in .docs

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 Caolán McNamara 2017-02-15 15:08:04 UTC
That addresses this specific issue of a .doc saved to .odt and reloaded, i.e. the fix is applied at .doc import time, which affects the .odt export, so the reimport is fine. It doesn't hurt to still submit the more general issue with .odt import and duplicate frame names and see what mstahl thinks about that.

backports in gerrit for 5-3 and 5-2
Comment 16 Kevin Suo 2017-02-16 00:56:00 UTC
Any plans to backport the fix to the 5.2 and 5.3 branches?
Comment 17 Caolán McNamara 2017-02-16 10:43:35 UTC
See comment #15 "backports in gerrit for 5-3 and 5-2"
Comment 18 Commit Notification 2017-02-16 21:37:41 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "libreoffice-5-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=7b6bf8348f36e5b98c5a6d5ddc4815df07f4fbbc&h=libreoffice-5-3

Resolves: tdf#101000 ensure unique image names in .docs

It will be available in 5.3.1.

Comment 19 Commit Notification 2017-02-16 21:39:11 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "libreoffice-5-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=94037ee2e6b85006373d1c4cecfcb32f4c82bb9e&h=libreoffice-5-2

Resolves: tdf#101000 ensure unique image names in .docs

It will be available in 5.2.7.
