Created attachment 105410 [details]
ZIP file with the files mentioned in the text

I converted a PDF to a DOCX file using SolidConverter. Writer opened it successfully, in spite of some formatting issues (attachment: GB_2013.docx).

I did not change anything, but saved the file under a different name (here: GB_2013_b.docx), closed it and reopened the saved version. Many of the graphic elements on the front page were lost! In fact, they are still present in word\media.

Even worse, just by saving the file, LO duplicated most of the images; some of them even got three additional copies. Why? As a consequence, GB_2013_b.docx is twice the size of the original GB_2013.docx (attachment: LO-4-3-0-4_doubles-media-entry.PNG).

Last but not least, why does LO list every media file individually in content_types.xml, while the original version only declares a default per extension?

GB_2013.DOCX:
<Default ContentType="image/jpeg" Extension="jpeg"/>
<Default ContentType="image/png" Extension="png"/>

GB_2013_b.DOCX saved by WRITER:
<Override ContentType="image/jpeg" PartName="/word/media/image24.jpeg"/>
<Override ContentType="image/jpeg" PartName="/word/media/image25.jpeg"/>
<Override ContentType="image/png" PartName="/word/media/image22.png"/>
<Override ContentType="image/png" PartName="/word/media/image21.png"/>
<Override ContentType="image/png" PartName="/word/media/image19.png"/>
<Override ContentType="image/png" PartName="/word/media/image20.png"/>
<Override ContentType="image/png" PartName="/word/media/image18.png"/>
<Override ContentType="image/png" PartName="/word/media/image15.png"/>
<Override ContentType="image/png" PartName="/word/media/image14.png"/>
<Override ContentType="image/png" PartName="/word/media/image16.png"/>
<Override ContentType="image/png" PartName="/word/media/image23.png"/>
<Override ContentType="image/jpeg" PartName="/word/media/image13.jpeg"/>
<Override ContentType="image/jpeg" PartName="/word/media/image12.jpeg"/>
<Override ContentType="image/jpeg" PartName="/word/media/image9.jpeg"/>
<Override ContentType="image/jpeg" PartName="/word/media/image11.jpeg"/>
<Override ContentType="image/png" PartName="/word/media/image28.png"/>
<Override ContentType="image/png" PartName="/word/media/image5.png"/>
<Override ContentType="image/jpeg" PartName="/word/media/image8.jpeg"/>
<Override ContentType="image/jpeg" PartName="/word/media/image10.jpeg"/>
<Override ContentType="image/jpeg" PartName="/word/media/image26.jpeg"/>
<Override ContentType="image/jpeg" PartName="/word/media/image1.jpeg"/>
<Override ContentType="image/png" PartName="/word/media/image7.png"/>
<Override ContentType="image/png" PartName="/word/media/image29.png"/>
<Override ContentType="image/png" PartName="/word/media/image6.png"/>
<Override ContentType="image/png" PartName="/word/media/image4.png"/>
<Override ContentType="image/png" PartName="/word/media/image17.png"/>
<Override ContentType="image/png" PartName="/word/media/image3.png"/>
<Override ContentType="image/jpeg" PartName="/word/media/image27.jpeg"/>
<Override ContentType="image/jpeg" PartName="/word/media/image2.jpeg"/>
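On the Default versus Override question: in the OPC content-types part, an Override entry for a specific part name takes precedence over a Default entry for its extension, so both listings above should resolve each image to the same content type. The following Python sketch illustrates that lookup. The function name `content_type_for` and the two sample strings are made up for illustration; this is not LibreOffice code.

```python
import posixpath
import xml.etree.ElementTree as ET

# Namespace used by the content-types part of an OPC package.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def content_type_for(content_types_xml, part_name):
    """Resolve a part's content type: Override first, then Default by extension."""
    root = ET.fromstring(content_types_xml)
    overrides, defaults = {}, {}
    for el in root:
        if el.tag == f"{{{CT_NS}}}Override":
            overrides[el.get("PartName").lower()] = el.get("ContentType")
        elif el.tag == f"{{{CT_NS}}}Default":
            defaults[el.get("Extension").lower()] = el.get("ContentType")
    key = part_name.lower()
    if key in overrides:
        return overrides[key]
    extension = posixpath.splitext(key)[1].lstrip(".")
    return defaults.get(extension)


# Illustrative fragments modelled on the two listings in this report.
ORIGINAL = (
    f'<Types xmlns="{CT_NS}">'
    '<Default ContentType="image/jpeg" Extension="jpeg"/>'
    '<Default ContentType="image/png" Extension="png"/>'
    "</Types>"
)
RESAVED = (
    f'<Types xmlns="{CT_NS}">'
    '<Override ContentType="image/jpeg" PartName="/word/media/image24.jpeg"/>'
    "</Types>"
)

print(content_type_for(ORIGINAL, "/word/media/image24.jpeg"))
print(content_type_for(RESAVED, "/word/media/image24.jpeg"))
```

Both calls resolve the same part to the same type, which suggests the per-file Override list is verbose but not by itself the cause of the missing images.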
Hello Andre, Thank you for submitting the bug. I can confirm that the bug is in master. Version: 4.4.0.0.alpha0+ Build ID: fcc6e8ae56d539ef92bfb917a52ac0638b3db25f TinderBox: Linux-rpm_deb-x86@45-TDF, Branch:master, Time: 2014-08-30_01:50:29
The docx file could not be opened in 3.3.0, but opened in 3.5.7.

Looking at the internals of the original docx file, the /word/media folder has 13 images (6 jpgs and 7 pngs). In 3.6.7 to 4.1.6, 0-byte files without extensions were saved instead of the jpg files. In 4.2.6, two png files were duplicated when saving to docx. In master, most images are duplicated and 2 images have four instances, resulting in 29 images in total. So the duplication of images likely started somewhere in the 4.2.x releases.

When opening the original docx file in master, only 1 image was listed in the Navigator, and clicking images not listed in the Navigator did not bring up the graphics toolbar. The document has top and bottom margins of 0 cm.

When reopening the saved docx in master, 9 images are shown in the Navigator, only 1 of them with a label. This document has top and bottom margins of 2.54 cm, which might be the reason why the images at the top and bottom of the page aren't displayed.
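The image counts above can be reproduced without unpacking the files by hand. The following Python sketch groups the entries under word/media by a hash of their bytes and reports groups with more than one member. The function name `duplicate_media` is hypothetical, and byte-identical content is only an assumption about how the duplicates look in the saved file.

```python
import hashlib
import zipfile
from collections import defaultdict


def duplicate_media(docx):
    """Return lists of word/media entries whose bytes are identical.

    `docx` may be a file path or a binary file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(docx) as zf:
        for info in zf.infolist():
            if info.filename.startswith("word/media/") and not info.is_dir():
                digest = hashlib.sha256(zf.read(info)).hexdigest()
                groups[digest].append(info.filename)
    # Keep only content that appears under more than one name.
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

For example, `duplicate_media("GB_2013_b.docx")` would list each set of duplicated image parts in the resaved file, assuming that file is in the current directory.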
Created attachment 105485 [details] Open VS Save Reopen in Master
Created attachment 107314 [details]
Original File Opened and Saved in Word 2013

If you first open and save GB_2013.docx in MS Word, Writer can open and save the new file without any major problems. It is likely that GB_2013.docx is not a valid OOXML file.
The docx file was created with MS Office 2007 Outlook, so opening and resaving it in Word 2013 saves it in a different version of the docx format.
This bug report probably explains similar behavior that I am seeing. When a LibreOffice Writer document is saved in .docx format, the margin and indent settings are lost when the file is reopened. This occurs with LibreOffice version 4.2.7.2.
The bug appears to be limited to the .docx format: when I correct the formatting, save the file in .doc format, close it and reopen it, the margins and bullet indents are retained.
The images in GB_2013.docx started being duplicated on save from the commit below. It's not clear how many of the problems round-tripping this file are directly related to this, so I'm going to leave off splitting this bug up for now, but more bugs may need to be opened once this has been dealt with.

Adding Cc: to vmiklos@collabora.co.uk; Could you possibly take a look at this? Thanks

commit cfb5b20cdc230320ff9f864d1cfd81aaea221da0
Author: Miklos Vajna <vmiklos@collabora.co.uk>
Date:   Wed Dec 18 11:03:57 2013 +0100

    DocxAttributeOutput::OutputFlyFrame_Impl: enable DML export by default

    This was only available in experimental mode previously. Also note that export of Writer TextFrames are handled separately, there DML export is still off by default.

    Change-Id: Ie8eaa1670610d92a363a8558b68064e7d7de2cdd
There are indeed more images in the saved document than in the original one. Interestingly, not all images are duplicated -- and that matches my memory that for Writer images we already de-duplicate them on export when we write both the drawingML and VML markup. We need to do the same for drawinglayer images, too.
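The de-duplication idea can be sketched as a cache keyed on the graphic's content: the first markup variant (say, drawingML) registers the image and gets a relationship id, and the second variant (VML) asking for the same graphic gets the same id back instead of a new media part. The Python below is only an illustration of that idea. The class `RelIdCache`, its method names, and the use of SHA-256 as the key are all assumptions, and the real exporter is C++ code in LibreOffice that may identify graphics differently.

```python
import hashlib


class RelIdCache:
    """Hypothetical helper: one relationship id per distinct graphic."""

    def __init__(self):
        self._by_digest = {}  # content digest -> relationship id
        self.parts = []       # (rel_id, part_name) pairs written so far

    def rel_id_for(self, image_bytes, extension):
        digest = hashlib.sha256(image_bytes).digest()
        rel_id = self._by_digest.get(digest)
        if rel_id is None:
            # First time this graphic is seen: create a new media part.
            index = len(self.parts) + 1
            rel_id = f"rId{index}"
            self.parts.append((rel_id, f"media/image{index}.{extension}"))
            self._by_digest[digest] = rel_id
        return rel_id


cache = RelIdCache()
dml_id = cache.rel_id_for(b"same picture", "png")   # written for drawingML
vml_id = cache.rel_id_for(b"same picture", "png")   # reused for VML fallback
other_id = cache.rel_id_for(b"another picture", "jpeg")
print(dml_id, vml_id, other_id, len(cache.parts))
```

Without such a cache, each markup variant would write its own copy of the image, which matches the pattern of most images appearing twice in the saved file.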
Miklos Vajna committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=b484e9814c66d8d51cea974390963a6944bc9d73 tdf#83227 oox: reuse RelId in DML/VML export for the same graphic It will be available in 5.1.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
The above fixes the duplicated images, please open a separate bug for the margins problem if that's still a problem.
(In reply to Miklos Vajna from comment #11)
> The above fixes the duplicated images

Thanks for the fix. Will you be able to backport it into 5.0?

> , please open a separate bug for the
> margins problem if that's still a problem.

Submitted as bug 94009 and it is a regression.
libreoffice-5-0 backport: https://gerrit.libreoffice.org/18491
Miklos Vajna committed a patch related to this issue. It has been pushed to "libreoffice-5-0": http://cgit.freedesktop.org/libreoffice/core/commit/?id=1b381370b026f62397dc2d41ddcecf9d6523e044&h=libreoffice-5-0 tdf#83227 oox: reuse RelId in DML/VML export for the same graphic It will be available in 5.0.3. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Miklos Vajna committed a patch related to this issue. It has been pushed to "libreoffice-4-4": http://cgit.freedesktop.org/libreoffice/core/commit/?id=c9a290c2a87e9af3b0cd4ccbdd751dddab3532da&h=libreoffice-4-4 tdf#83227 oox: reuse RelId in DML/VML export for the same graphic It will be available in 4.4.6. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Migrating Whiteboard tags to Keywords: (bibisected, DataLoss, filter:docx) [NinjaEdit]