Description:
I have a presentation which is about 800 kB. If I open it with LibO 6.0 RC2 and save it again, the file size jumps to about 1.3 MB. This seems to be due to how images are managed: opening the presentation file with a zip tool reveals that the LibO 6.0 version of the presentation includes multiple png "copies" of an emf image included in the presentation.
Unfortunately, I cannot attach the presentation I'm working on here now. I'll try to see if I can create a reproducible test case. In the meantime, I am posting the bug in case:
- someone else experiences the same issue, so that they can refer to this report and maybe help by providing a test case
- some developer can immediately recognize what may have led to this regression
Steps to Reproduce: See description
Actual Results: See description
Expected Results: See description
Reproducible: Always
User Profile Reset: No
Additional Info:
[Information automatically included from LibreOffice]
Locale: en-US
Module: StartModule
[Information guessed from browser]
OS: Linux (All)
OS is 64bit: yes
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0
I seem to be able to reliably reproduce a difference between 5.4 and 6.0 in that 6.0 always stores a png version of any emf/wmf image inserted in a presentation, which already seems a regression to me. It looks like I am not able to reproduce the case where I had multiple versions of the png associated to the same emf image (which was used both in master pages and in the slides).
Could be it's: https://wiki.documentfoundation.org/ReleaseNotes/6.0#Improvements_to_ODF_Export "Metafiles which were previously saved in the internal SVM (Star View Metafile) format are now accompanied by a PNG fallback graphic. This makes it easier for other ODF readers to display the graphics."
Looks like a reasonable explanation for what I observed in my comment from 14 Jan (even if I'd very much prefer to have it configurable in the "compatibility" options, as this means saving all vector images twice, which often makes the difference between a document that can be sent via email and one that requires some large file attachment service). What remains unexplained is the case where I got 3 identical PNGs of the same vector image in the odp file. They went away by resaving the doc with LibO 5.4, so - unfortunately - right now I do not have a file for analysis. I suspect, but I cannot be sure, that this occurred after importing master pages into a presentation from another template, where the imported master pages contained the same images that were already in the target presentation.
> What remains unexplained is the case where I got 3 identical PNGs of the
> same vector image in the odp file. They went away by resaving the doc with
> LibO 5.4, so - unfortunately - right now I do not have a file for analysis.
Putting to NEEDINFO until a file is provided...
Here we go again. I have a file where, opening the odp with a zip tool, I see two perfectly identical (same CRC) png files corresponding to the same wmf.
Archive:  demo.odp
   Length        Date   Time   Name
---------  ---------- -----   ----
       47  2018-01-15 14:38   mimetype
     7205  2018-01-15 14:38   Thumbnails/thumbnail.png
     1888  2018-01-15 14:38   meta.xml
    12049  2018-01-15 14:38   settings.xml
    33698  2018-01-15 14:38   content.xml
   298760  2018-01-15 14:38   Pictures/10000201000002C3000002C376C5E25DC0676B4B.png
   298760  2018-01-15 14:38   Pictures/10000201000002C3000002C3A4E810BA42055A45.png
      859  2018-01-15 14:38   Pictures/1000000000000020000000204B249CA79A42C6D7.png
        0  2018-01-15 14:38   Configurations2/floater/
        0  2018-01-15 14:38   Configurations2/menubar/
        0  2018-01-15 14:38   Configurations2/progressbar/
        0  2018-01-15 14:38   Configurations2/toolbar/
        0  2018-01-15 14:38   Configurations2/accelerator/current.xml
        0  2018-01-15 14:38   Configurations2/statusbar/
        0  2018-01-15 14:38   Configurations2/images/Bitmaps/
        0  2018-01-15 14:38   Configurations2/popupmenu/
        0  2018-01-15 14:38   Configurations2/toolpanel/
     1603  2018-01-15 14:38   META-INF/manifest.xml
   176031  2018-01-15 14:38   styles.xml
   315552  2018-01-15 14:38   Pictures/1004D0A00000E7400000E74040C4C8430774A921.wmf
---------                     -------
  1146452                     20 files
Furthermore, when this happens, one of the copies of that image appearing in the presentation deteriorates, as if LibO started using just a png for it rather than the "perfect" vector version. Unfortunately, the material on which I am having the issue, while not particularly sensitive, includes vector versions of the template used for slides at my Institution and should not be openly shared.
Indeed, one of the images stops being an emf and becomes a png! This is evident from the "save" function, which now proposes saving a png rather than an emf. Specifically, I have a logo in vector form that is cropped in some master pages. I also have the same logo in uncropped form in the last slide. This latter logo gets replaced by a png and, at the same time, the png file starts appearing twice in the odp. With this, the matter seems more serious than I originally expected, because some document content is lost (the vector image) and replaced by a lower-fidelity one (the png version of the same image).
could you please share the file ?
Created attachment 139388 [details] Sample file showing the issue
Please find attached a file showing the issue. Inside the "Pictures" folder, there are two png images, both 1.7 kB in size, both with CRC 84D41D37, i.e. the same file.
    1742  Stored     1742   0% 2018-01-26 21:24 84d41d37  Pictures/10000200000001090000010943CE3636A5225AEB.png
    1742  Stored     1742   0% 2018-01-26 21:24 84d41d37  Pictures/100002000000010900000109E2FD4B392411EDF7.png
These correspond to the vector image
    5225  Stored     5225   0% 2018-01-26 21:24 ca96b3f8  Pictures/2000001000001B5900001B59FFB0B197CBF7C754.svm
This file shows the problem quite well. When a vector image is inserted in the presentation and then copied multiple times, LibO 6 on some occasions makes one png per copy, rather than one png for each distinct vector image.
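The check described above (entries under Pictures/ with the same size and the same CRC) can be automated. The following is a minimal Python sketch, not something taken from the report: the function name find_duplicate_pictures is hypothetical, and it only compares the CRC-32 and uncompressed size recorded in the zip directory, so a match is a strong hint of identical content rather than a byte-for-byte proof.

```python
import zipfile
from collections import defaultdict


def find_duplicate_pictures(odf_path):
    """Group entries under Pictures/ by (CRC-32, size).

    Returns a dict mapping (crc, size) to the sorted list of entry names,
    keeping only groups with more than one member.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(odf_path) as package:
        for info in package.infolist():
            if info.filename.startswith("Pictures/") and not info.is_dir():
                groups[(info.CRC, info.file_size)].append(info.filename)
    return {
        key: sorted(names)
        for key, names in groups.items()
        if len(names) > 1
    }
```

Calling find_duplicate_pictures() on a saved .odp and printing each key as f"{crc:08x}" gives the CRC in the same hexadecimal form a zip tool shows.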
Confirmed. Using the builds 5.4.4.2 and 6.0.1.1, I did the following:
- With 5.4.4.2, I opened the attached file and deleted sheet 2.
- Saved the document as a reference.
- Opened the reference file in 5.4.4.2 and copied the two images (one after the other) into sheet 2.
- Saved as "test_ref_5.4.4.2.odp".
- Opened the reference file in 6.0.1.1 and copied the two images (one after the other) into sheet 2.
- Saved as "test_ref_6.0.1.1.odp".
Result:
test_ref_5.4.4.2.odp contains one png file. Size: 27.0 kB
test_ref_6.0.1.1.odp contains FIVE png files. Size: 44.5 kB
==> The file size grows by about 65% with 6.0.1.1!
I will attach the files saved by the two different LibreOffice builds.
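To make a comparison like the one above repeatable, the image payload of each saved file can be summarized per file extension. This is only an illustrative Python sketch: picture_summary is a hypothetical helper, and it reports uncompressed sizes from the zip directory, so its totals will not exactly match the on-disk size of the .odp.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def picture_summary(odf_path):
    """Return {extension: (entry_count, total_uncompressed_bytes)}
    for the entries stored under Pictures/ in an ODF package."""
    counts, sizes = Counter(), Counter()
    with zipfile.ZipFile(odf_path) as package:
        for info in package.infolist():
            if info.filename.startswith("Pictures/") and not info.is_dir():
                ext = PurePosixPath(info.filename).suffix.lower()
                counts[ext] += 1
                sizes[ext] += info.file_size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}
```

Running picture_summary() on "test_ref_5.4.4.2.odp" and on "test_ref_6.0.1.1.odp" and comparing the ".png" entries would show how many fallback images each build wrote.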
Created attachment 139763 [details] File saved by LibreOffice 5.4.4.2
Created attachment 139764 [details] File saved by LibreOffice 6.0.1.1
Using attachment 139763 [details], it points me to:
author Samuel Mehrbrodt <Samuel.Mehrbrodt@cib.de> 2018-01-12 17:32:41 +0100
committer Samuel Mehrbrodt <Samuel.Mehrbrodt@cib.de> 2018-01-15 13:50:10 +0100
commit 3da86d8987db6223b0acc5d8a1b56f7e0c54bbef (patch)
tree 0afdf8c0a0497ebfd8ef1303bfc51c5a3177a4a5
parent 0623f3a8f5d6fbc5e9b933cb034184084e8ac666 (diff)
tdf#114488 Rank multiple images also for flat odf
Only the file extension was considered before which is not available in flat odf. Now both internal and external URLs are resolved to their respective mimetype.
The file size is 30K before and 39K after this commit. Personally I don't consider a 9K increase a bug. I would if the difference were from 30K to 300K (10 times) or greater. Closing as RESOLVED WONTFIX
@Xisco: Thanks for the bibisect! In other cases, the size increase might be much higher. In my case it was +65 percent. Imagine files that are several megabytes... I have added Samuel. Let's wait for his feedback. Perhaps it is easy to fix and just something he has overlooked...
Link to the patch related issue: Bug 114488
Quoting from the initial report.
> I have a presentation which is about 800kB. If I open it with LibO6.0 RC2
> and save it again, the file size jumps up to about 1.3MB.
This is almost 2X. I wonder if someone could clarify what "Rank multiple images also for flat odf. Only the file extension was considered before which is not available in flat odf. Now both internal and external URLs are resolved to their respective mimetype." actually means. In my case, I insert an svg and then copy and paste it, and I get multiple identical equivalent pngs in the odf. Why should both "internal" and "external" URLs be involved?
I am trying to understand what is going on, since I think that if LibO >=6 cannot get fixed, it should be possible to at least write an offline odf tool to get rid of all the duplicate figures, updating the internal references to them. I have tried keeping a LibO 5.4 around to do the roundtrip through it, to reduce file sizes, but this seems unreliable (at times figures disappear) and does more than desired (it eliminates all the pngs corresponding to svgs, not just the redundant ones).
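The offline tool suggested above could look roughly like the following Python sketch. It is an illustration under several assumptions, not an existing utility: dedupe_pictures is a hypothetical name, only the top-level content.xml and styles.xml are rewritten (embedded objects are ignored), references are updated by plain string replacement of the full "Pictures/..." path, and encrypted or signed packages are not handled. It keeps the mimetype entry first and uncompressed, as ODF packages expect. Try it only on a copy of the document.

```python
import hashlib
import re
import zipfile


def dedupe_pictures(src_path, dst_path):
    """Copy an ODF package, merging byte-identical Pictures/ entries.

    Returns a dict mapping each dropped entry name to the one kept.
    """
    with zipfile.ZipFile(src_path) as src:
        infos = src.infolist()
        data = {i.filename: src.read(i.filename)
                for i in infos if not i.is_dir()}

    # Keep the first (alphabetically) entry for each distinct content hash.
    first_by_digest, replaced = {}, {}
    for name in sorted(data):
        if name.startswith("Pictures/"):
            digest = hashlib.sha256(data[name]).hexdigest()
            keeper = first_by_digest.setdefault(digest, name)
            if keeper != name:
                replaced[name] = keeper

    # Point references in the main XML streams at the kept copy.
    for xml_name in ("content.xml", "styles.xml"):
        if xml_name in data:
            text = data[xml_name].decode("utf-8")
            for dropped, keeper in replaced.items():
                text = text.replace(dropped, keeper)
            data[xml_name] = text.encode("utf-8")

    # Remove manifest entries of the dropped files.
    manifest_name = "META-INF/manifest.xml"
    if manifest_name in data:
        text = data[manifest_name].decode("utf-8")
        for dropped in replaced:
            path = re.escape(dropped)
            pattern = (r'\s*<manifest:file-entry\b[^>]*'
                       r'manifest:full-path="' + path + r'"[^>]*/>')
            text = re.sub(pattern, "", text)
        data[manifest_name] = text.encode("utf-8")

    with zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        if "mimetype" in data:
            # ODF expects mimetype as the first entry, uncompressed.
            dst.writestr("mimetype", data["mimetype"],
                         compress_type=zipfile.ZIP_STORED)
        for info in infos:
            name = info.filename
            if name == "mimetype" or name in replaced:
                continue
            if info.is_dir():
                dst.writestr(name, b"", compress_type=zipfile.ZIP_STORED)
            else:
                dst.writestr(name, data[name])
    return replaced
```

Unlike the roundtrip through LibO 5.4, this approach would leave one fallback png per vector image in place and drop only the redundant copies. It cannot restore a vector image that has already been replaced by its png.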
So as far as I understand there are a few issues here:
1) Image size grows because of the fallback images. I think we can consider adding an option whether to include fallback images or not.
2) Vector images are being replaced with PNGs, and the vector images are gone afterwards. Needs investigation.
3) Fallback PNGs are sometimes added twice. Also needs investigation.
Related commits:
https://cgit.freedesktop.org/libreoffice/core/commit/?id=6b3cc69fd2b2de5ace68f2739eb383267d66f76f
https://cgit.freedesktop.org/libreoffice/core/commit/?id=38602abc2d2b59bc3644e37797b9b1bc779fd993
https://cgit.freedesktop.org/libreoffice/core/commit/?id=2d3023c9713c4c7cac732a6831c69dec581a7751
(The commit mentioned above (3da86d8987db6223b0acc5d8a1b56f7e0c54bbef) should be unrelated to this issue, as it only affects importing.)
Nice summary. I'd add that:
a) I have a feeling that 2) might be related to round trips involving both LibO 6 and LibO 5.x.
b) When roundtrips involving LibO 5.x are successful, 3) is often solved. That is: take a LibO file with duplicate equivalent PNGs; open it with LibO 5.4.x; save (all the equivalent PNGs are discarded); open with LibO 6.x; save (equivalent PNGs are re-created, typically not in duplicate fashion).
Serge Krot committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=79b2f1cb36ea4fec61b0620085313eb53fce9fa0 tdf#115005 Do not remove original vector images from slides It will be available in 6.1.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Serge Krot committed a patch related to this issue. It has been pushed to "libreoffice-6-0": http://cgit.freedesktop.org/libreoffice/core/commit/?id=070f3db51da48c70cde12050c18fb03de2192c0f&h=libreoffice-6-0 tdf#115005 Do not remove original vector images from slides It will be available in 6.0.4. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Serge Krot committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=1c1160967acf49cffae8921f3ab8361821bbaaaf tdf#115005: New option to prevent adding fallback images It will be available in 6.1.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
*** Bug 115898 has been marked as a duplicate of this bug. ***
This issue is still reproducible in master Version: 6.1.0.0.alpha0+ Build ID: cb5f6503f593d7c7a719542281b9efd274134f7c CPU threads: 4; OS: Linux 4.13; UI render: default; VCL: gtk3; Locale: ca-ES (ca_ES.UTF-8); Calc: group Let's keep bug 117074 as a follow-up bug...
(In reply to Commit Notification from comment #18)
> Serge Krot committed a patch related to this issue.
> It has been pushed to "master":
>
> http://cgit.freedesktop.org/libreoffice/core/commit/?id=79b2f1cb36ea4fec61b0620085313eb53fce9fa0
>
> tdf#115005 Do not remove original vector images from slides
Sorry, Serge Krot, but it seems your changes do not match the description: https://cgit.freedesktop.org/libreoffice/core/commit/?id=79b2f1cb36ea4fec61b0620085313eb53fce9fa0 and https://cgit.freedesktop.org/libreoffice/core/commit/?id=070f3db51da48c70cde12050c18fb03de2192c0f&h=libreoffice-6-0 deal only with SVG, and with SVG as image/x-vclgraphic.
https://www.openoffice.org/api/docs/common/ref/com/sun/star/graphic/GraphicDescriptor.html says "internal mime type image/x-vclgraphic, in which case the original mime type is not available anymore". The same page indicates that many vector images have their own mimetypes, e.g. image/svg+xml, image/x-emf, image/x-eps, image/x-wmf. Strangely, I don't see PDF there, though it is supported.
I have opened a new bug for PDF images being discarded from an ODT document (though not slides): https://bugs.documentfoundation.org/show_bug.cgi?id=117576