Created attachment 126857
Sample document and image

SUMMARY
When exporting a document with a linked image to PDF, the resulting PDF file is huge. In my tests, it varies between 6 and 10 times larger than required.

STEPS TO REPRODUCE
1. Create a document with at least one linked image. (A sample document and image are provided in the attachment.)
2. Export to PDF using lossless export.

WHAT IS EXPECTED
• The exported PDF should be a size commensurate with the original document and linked images.

WHAT HAPPENS INSTEAD
• The PDF file is many times larger.
• In my tests, PDF files are between 6 and 10 times larger than from previous LibreOffice versions.
• In the attached sample, the PDF for previous versions is consistent at 2.1 Mb, but from version 5.2.0 it is ten times larger at 21 Mb.

MORE INFORMATION
• Tested on versions 5.0.6, 5.1.4, 5.1.5 and 5.2.0.
  - Versions 5.0.6, 5.1.4 and 5.1.5 all work correctly.
  - Version 5.2.0 has this bug.
• This is a regression, which used to happen a long time ago but was fixed. I haven't tested on versions prior to 5.0.6 because I don't know how to obtain those older versions.
• This might seem minor, but it is significant when the resulting PDF file greatly exceeds 100 Mb (instead of the expected 20 Mb), the PDF file is a website download, and there are many PDF files.
• An expected workaround might be to use embedded images instead of linked images, but that is not a sensible option when the images can change.
On pc Debian x86-64 with master sources updated today, I could reproduce this. I noticed this on console when loading odt file: warn:legacy.osl:6272:1:sw/source/core/graphic/ndgrf.cxx:596: Cannot swap in graphic
@Julien Nabet, thank you for confirming this. I don't get any error on the console, except for a segfault when closing LibreOffice: terminate called after throwing an instance of 'com::sun::star::uno::DeploymentException' But that has nothing to do with the document or PDF.
Hi Paddy, So you basically say that in some previous version an exported PDF linked to images and did not insert them as such. Sure? (Isn't this the problem from bug 99723?)
@Cor, no, the final PDF contains the files. What I'm saying is that the Export to PDF somehow expands the file sizes dramatically when saving to PDF. Versions 5.0.6 to 5.1.5 do not do this. Bug 99723 looks similar. It may be the same, but your experience was not as dramatic as mine, where the sizes were at least six times larger.
Let's put back to NEW since the bug has been confirmed.
(In reply to Paddy Landau from comment #4)
> Bug 99723 looks similar. It may be the same, but your experience was not as
> dramatic as mine, where the sizes were at least six times larger.

I tested with 3.3.0.4.
Difference image compressed / not compressed is 259 kB <> 9 MB.
So definitely a huge difference. I set as duplicate of 99723.

*** This bug has been marked as a duplicate of bug 99723 ***
@Cor, this is not a duplicate of bug 99723.

Bug 99723: File size is not reduced with required compression, but acts as if lossless was specified.
Bug 101563: File size is made 6–10 times larger when asking for lossless, doing the opposite of compression. (It's as if the JPG files were converted to BMP.)

--------------------------

In the sample given in this bug, the original files are
• ODT 0.1 Mb
• image 2 Mb
• total 2.2 Mb (discrepancy due to rounding).

The PDF from LO 5.1.5.2 is correct at 2.1 Mb, whereas the PDF from 5.2.0.4 is an astonishing 20.6 Mb.

Using compression 90% instead of lossless, the sample from bug 99723 results in an acceptable 2.1 Mb from both LO 5.1.5.2 and 5.2.0.4.

--------------------------

So, this is not the same problem as reported in bug 99723.
Issue introduced in range 28ac7d0f0cea9067d7faba3b72a164729df26e5d..c58655c5a221d986fa3c3eed2f28810269205721
I confirm that LO 5.2.2.2 (released today) still has the bug.
Hi Paddy, Any chance to test a daily build? Thanks
@Cor Nouws I've just tried running the latest Daily, but unfortunately it crashes on startup on my machine. I'll try again tomorrow with a new build.
@Cor Nouws I have tested with today's version (28-Oct-2016 04:19), and unfortunately it still has the same problem. LibreOfficeDev 5.3.0.0.alpha1 eb07ae8fc52378d9b59bcb6a7df8bb022b8b9cc0
I'm seeing an issue with PDF size too, this may be a different bug though? With the same source file using jpg compression at 90% and 600dpi I get a file three times larger than I used to.
-rw-rw-rw- 1 jsm jsm 11709486 Nov 12 22:46 MS3baseV30_Hardware-1.4-2015-10-12-lo5.0.0.5.pdf
-rw-rw-rw- 1 jsm jsm 36890076 Nov 12 22:55 MS3baseV30_Hardware-1.4-2015-10-12-lo5.1.2.2.pdf
-rw-rw-rw- 1 jsm jsm 37589447 Nov 12 22:47 MS3baseV30_Hardware-1.4-2015-10-12-lo5.2.2.2.pdf
@James Murray I don't know if this bug is related. Could it be related to bug 99723? It might be worth your while to test the development version 5.3.
Yes, it seems a duplicate of bug 99723. I can no longer reproduce it in Version: 5.3.0.0.alpha1+ Build ID: fef32a42c8bd8fd640d6c9cdc2f839fb43ad490c CPU Threads: 4; OS Version: Linux 4.8; UI Render: GL; VCL: gtk3; Layout Engine: new; Locale: ca-ES (ca_ES.UTF-8); Calc: group *** This bug has been marked as a duplicate of bug 99723 ***
Please do not mark this as a duplicate of bug 99723. It is for something quite different.
Why is it different? in Version: 5.3.0.0.alpha1+ Build ID: fef32a42c8bd8fd640d6c9cdc2f839fb43ad490c CPU Threads: 4; OS Version: Linux 4.8; UI Render: GL; VCL: gtk3; Layout Engine: new; Locale: ca-ES (ca_ES.UTF-8); Calc: group the output file size is 89,2 kB (89215 bytes)
@Xisco Fauli — see comment #7.
even though you say it's not a duplicate of bug 99723, it can be closed as RESOLVED WORKSFORME as it's no longer reproducible in Version: 5.3.0.0.alpha1+ Build ID: fef32a42c8bd8fd640d6c9cdc2f839fb43ad490c CPU Threads: 4; OS Version: Linux 4.8; UI Render: GL; VCL: gtk3; Layout Engine: new; Locale: ca-ES (ca_ES.UTF-8); Calc: group
@Xisco Fauli — I have just downloaded the latest development version: 14 November 2016 LibreOfficeDev 5.3.0.0.alpha1 Linux Ubuntu 16.04 64-bit: Build 2559ab66fd2976df54fc7d66bac5b7c0f7c23370 Windows 10 64-bit: Build c5f5b3e5334c52502c1de28828a44ad469c68850 I am still getting this error on both Linux and Windows 10. Did you check with embedded images or linked images? You can try the sample that I attached to the initial report. Again, see comment #7 for details. Reopening.
I could also still reproduce it with the same build as Xisco's (home built, but the same commit). The file size of the exported PDF of the attached sample is ~20MB. Version: 5.3.0.0.alpha1+ Build ID: fef32a42c8bd8fd640d6c9cdc2f839fb43ad490c CPU Threads: 4; OS Version: Windows 6.1; UI Render: default; Layout Engine: new; Locale: hu-HU (hu_HU); Calc: CL
This seems to have begun at the below commit.
Adding Cc: to Noel Grandin; Could you possibly take a look at this one? Thanks

author Noel Grandin <noel@peralex.com> 2016-04-13 09:30:11 (GMT)
committer Noel Grandin <noel@peralex.com> 2016-04-13 11:27:53 (GMT)
commit 19b34c0039c6293f9b37aa70f8055aa2be28ba09
tree 04463a78141cd94ee70cd463ba7687993410c276
parent fe8896bab01ccb595c993e54866a01f554b54f4f

loplugin:passstuffbyref in svtools

e46c94bf8b05440ece9d69b09c253d9aab6d4f6b is the first bad commit
commit e46c94bf8b05440ece9d69b09c253d9aab6d4f6b
Author: Norbert Thiebaud <nthiebaud@gmail.com>
Date: Fri Apr 22 22:59:28 2016 -0700

    source 19b34c0039c6293f9b37aa70f8055aa2be28ba09

git bisect log
# bad: [6380ca07b05f68dedcaa379302cfe1fa478571c4] source 60b74fe1775e647545d2da1fcc58a4c63ec18aa5
# good: [1f670510f08cb800cbae2a1dd6ea70d3542e4721] source 49c2b9808df8a6b197dec666dfc0cda6321a4306
git bisect start 'origin/master' 'oldest'
# good: [38f37b8ec1a2d199bb957cfd2581df7d1b273b74] source c0da1080b61a1d51654fc34fdaeba373226065ff
git bisect good 38f37b8ec1a2d199bb957cfd2581df7d1b273b74
# good: [11ae494d8c566f23e0ef84ba0cc25fb1388b67f7] source 470cfa9860232ab70e017e6084d80f80d469555c
git bisect good 11ae494d8c566f23e0ef84ba0cc25fb1388b67f7
# bad: [ee4cfd75d2452b8c416b4ec27358f7a905d6f5cf] source aa544a002e534a313ad9dd365e80f052789d9963
git bisect bad ee4cfd75d2452b8c416b4ec27358f7a905d6f5cf
# bad: [c59865b07f405048acae57452454009f8bc50235] source b477a9e0b620a5e1c709e404c5a4e816ef5794f1
git bisect bad c59865b07f405048acae57452454009f8bc50235
# bad: [d23917903409b837fede67cc707378f23af45806] source d9508c82330ffce6b20fb7ed13c7bcc01f298053
git bisect bad d23917903409b837fede67cc707378f23af45806
# bad: [f0f1ed701513ccddfe6e05c054a5c2172651d941] source b8eb2946511ce617323b13dffe2b1d9704e0be60
git bisect bad f0f1ed701513ccddfe6e05c054a5c2172651d941
# good: [416a6423bc982d3e9b86f5966ad3d23debe8fd85] source 32102b9aa75a296b99f3fdaf370bd83bfd629f4e
git bisect good 416a6423bc982d3e9b86f5966ad3d23debe8fd85
# bad: [352ff855ee0cee178f1b605421ae6d35fee32c46] source 9a31442171cf8bd79574c318d91ef220ee7389bb
git bisect bad 352ff855ee0cee178f1b605421ae6d35fee32c46
# good: [393a7d3fd679779c61bdfcee1ee0e6d1ca04d5fb] source 299d938bf05faf60b848a9d4862e58bb42db3e65
git bisect good 393a7d3fd679779c61bdfcee1ee0e6d1ca04d5fb
# bad: [01622b95dbce9171197721077f3a710a76891a9d] source ebe94af4eca68360c99f3421f1298f94747de003
git bisect bad 01622b95dbce9171197721077f3a710a76891a9d
# good: [20952a1b02c5a66c575eab2a20950876187b8c5f] source 523036daaddf466eee46183bbec9a71d45c48a41
git bisect good 20952a1b02c5a66c575eab2a20950876187b8c5f
# bad: [e46c94bf8b05440ece9d69b09c253d9aab6d4f6b] source 19b34c0039c6293f9b37aa70f8055aa2be28ba09
git bisect bad e46c94bf8b05440ece9d69b09c253d9aab6d4f6b
# good: [e5c4e40d209dd676441a205320dfd0bd68a331d4] source fe8896bab01ccb595c993e54866a01f554b54f4f
git bisect good e5c4e40d209dd676441a205320dfd0bd68a331d4
# first bad commit: [e46c94bf8b05440ece9d69b09c253d9aab6d4f6b] source 19b34c0039c6293f9b37aa70f8055aa2be28ba09
@raal reverting that commit didn't fix this problem for me. Are you sure you didn't hit a range of commits when bibisecting?
(In reply to Noel Grandin from comment #23)
> @raal reverting that commit didn't fix this problem for me.
>
> Are you sure you didn't hit a range of commits when bibisecting?

Hello Noel, retested again, repo ~/bibisect-win32-5.2

$ git checkout e46c94bf8b05440ece9d69b09c253d9aab6d4f6b
Checking out files: 100% (21872/21872), done.
Previous HEAD position was 1f67051... source 49c2b9808df8a6b197dec666dfc0cda6321a4306
HEAD is now at e46c94b... source 19b34c0039c6293f9b37aa70f8055aa2be28ba09

bug is here

$ git checkout HEAD~1
Checking out files: 100% (83/83), done.
Previous HEAD position was e46c94b... source 19b34c0039c6293f9b37aa70f8055aa2be28ba09
HEAD is now at e5c4e40... source fe8896bab01ccb595c993e54866a01f554b54f4f

bug is not here.

So the bibisect should be correct. The repo ~/bibisect-win32-5.2 is the max repo, where one result contains only one commit.
Is this a windows only bug? Having trouble finding a point in time where this __works__ on Linux and I've gone back 500 revisions from e46c94bf8b05440ece9d69b09c253d9aab6d4f6b
@Noel Grandin — No, both Windows and Linux have the bug. I don't have access to a Mac, so I can't test the Mac version. LO versions 5.0.6, 5.1.4, 5.1.5 and 5.1.6.2 definitely all work correctly. The bug was originally present in an old version — I don't recall which one, unfortunately — and has recurred starting with version 5.2.0.
(In reply to Noel Grandin from comment #25) > Is this a windows only bug? Having trouble finding a point in time where > this __works__ on Linux and I've gone back 500 revisions from > e46c94bf8b05440ece9d69b09c253d9aab6d4f6b I've not reproduced the bug on Linux..
If I revert this line: const OUString& GetLink() const { return maLink; } http://opengrok.libreoffice.org/xref/core/include/svtools/grfmgr.hxx#392 to: OUString GetLink() const { return maLink; } ...the exported PDF returns to its normal size. Quite interesting.
*** Bug 104479 has been marked as a duplicate of this bug. ***
Please note that 104479 is a duplicate of 99723, not of this bug, which is for something else (please see comment #7). I shall mark the bugs as appropriate.
Bug 104479 does not seem a duplicate of either 99723 or 101563 but may be related. In my instance I have no linked images. Changing the compression does reduce file size.
(In reply to Aron Budea from comment #28)
> If I revert this line:
> const OUString& GetLink() const { return maLink; }
> http://opengrok.libreoffice.org/xref/core/include/svtools/grfmgr.hxx#392
>
> to:
> OUString GetLink() const { return maLink; }
>
> ...the exported PDF returns to its normal size. Quite interesting.

On pc Debian x86-64 with master sources updated today, I could reproduce the initial problem. I tested the revert but it doesn't change anything. (I must admit I only ran "make svl.build".)
Do you confirm the effect of this change on your pc?
(In reply to Julien Nabet from comment #32) > (In reply to Aron Budea from comment #28) > > If I revert this line: > > const OUString& GetLink() const { return maLink; } > > http://opengrok.libreoffice.org/xref/core/include/svtools/grfmgr.hxx#392 > > > > to: > > OUString GetLink() const { return maLink; } > > > > ...the exported PDF returns to its normal size. Quite interesting. > > On pc Debian x86-64 with master sources updated today, I could reproduce the > initial pb. I tested the revert but it doesn't change anything. (I must > recognize I just runned "make svl.build"). > Do you confirm the effect of this change on your pc? Oups, this file is called at many places, not just svl. I must run "make" at root.
(In reply to Julien Nabet from comment #33) > Oups, this file is called at many places, not just svl. I must run "make" at > root. Yes, it's called from a lot of places. As mentioned in comment 28, this reversion fixed PDF export for me. I wanted to look into it further, because it's a very peculiar issue, but compilation took quite some time, and I haven't had the opportunity.
(In reply to Aron Budea from comment #34)
> (In reply to Julien Nabet from comment #33)
> > Oups, this file is called at many places, not just svl. I must run "make" at
> > root.
>
> Yes, it's called from a lot of places. As mentioned in comment 28, this
> reversion fixed PDF export for me. I wanted to look into it further, because
> it's a very peculiar issue, but compilation took quite some time, and I
> haven't had the opportunity.

The build takes so long that it seems to be building from scratch. I'm giving up on this one: it takes far too long just to test a small change whose impact on the problem I don't understand.
Julien, no worries, thanks for giving it a try. I tested, and could reproduce the bug with 5.3beta2 / Ubuntu 16.04 (so, in Linux). Noel, for reproduction please note that the settings have to be changed as shown in the first image (set to Lossless compression, remove checkbox from Reduce image resolution).
I'm going to push a patch with Aron's suggested change. Would someone mind bibisecting this one on Linux? - I suspect we may have more than one cause here. Thanks.
Noel Grandin committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=b7f92a21a458fc6fa68894fbc881eda0a1e8325e tdf#101563 - Export to PDF with linked images creates huge PDF files. It will be available in 5.4.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
I have tested today's current version, and I'm pleased to report that the bug has been fixed. Version: 5.4.0.0.alpha0+ Build ID: 08fa2e9307c9e4a49e18ecb0b4e9461492122fe3 Thank you, everyone who helped.
If you still have Version: 5.4.0.0.alpha0+ installed, could you please test against the file in comment 3 of bug 104479 to see the impact? I have noticed the PDF size growing progressively from 5.0.6.3 to 5.2.4.1: 2.5 MB, then 6.3 MB, then 11 MB. Settings: 90% compression, resize images to 300 dpi, only export bookmarks checked.
Noel Grandin committed a patch related to this issue. It has been pushed to "libreoffice-5-3": http://cgit.freedesktop.org/libreoffice/core/commit/?id=871d610bc9b162ae68b263d857cf4168d124d180&h=libreoffice-5-3 tdf#101563 - Export to PDF with linked images creates huge PDF files. It will be available in 5.3.0.1. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Backport on 5.3 branch: https://gerrit.libreoffice.org/#/c/31973/1 Backport on 5.2 branch, on review: see https://gerrit.libreoffice.org/#/c/31974/ Let's put this one to FIXED now.
I have not delved this far forward in LO development before. If I wanted to check the effect of the backport to the 5.3 branch, when would that be built, and where? Would it be in http://dev-builds.libreoffice.org/daily/libreoffice-5-3/Win-x86@62-merge-TDF/current/ which has a 5 day old build of 5.3.0.0.beta1?
(In reply to Steve Edmonds from comment #43)
> I have not delved this far forward in LO development before. If I wanted to
> check the effect of the backport to the 5.3 branch, when would that be built
> and where.
> Would it be in
> http://dev-builds.libreoffice.org/daily/libreoffice-5-3/Win-x86@62-merge-TDF/
> current/ which has a 5 day old build of 5.3.0.0.beta1.

In general you must wait for 24/48 hours (for Linux build). Sometimes it can be longer (it seems so for Win builds).
You can check if the daily build includes the commit by checking "build id" of daily build.
Eg: if you go to http://dev-builds.libreoffice.org/daily/libreoffice-5-3/Win-x86_64@62-TDF/current/
There's buildinfo txt: libreoffice-5-3~2016-12-08_16.10.30_build_info.txt
Reading first lines, you'll find this:
core:7f47d68c4310b8bae09286a81036a6fa669a1705

Now, if you go to this url to have all the commits of 5.3 branch:
https://cgit.freedesktop.org/libreoffice/core/log/?h=libreoffice-5-3
1) Change list entry from "log msg" to "range"
2) In the blank area, copy paste 7f47d68c4310b8bae09286a81036a6fa669a1705
3) Click "Search" button
=> you'll see the last commit included in the build. Here, it's
https://cgit.freedesktop.org/libreoffice/core/commit/?h=libreoffice-5-3&id=871d610bc9b162ae68b263d857cf4168d124d180

Perhaps there's a faster way but I don't know it.
I think I found where this is coming from. I put a breakpoint in the mentioned GetLink() function, which was called here during opening the file:
pSwGrfNode->SetGraphic(aGrf, rGrfObj.GetLink());
http://opengrok.libreoffice.org/xref/core/sw/source/core/docnode/swbaslnk.cxx#166

Then I followed a few levels deeper:
void GraphicObject::SetGraphic( const Graphic& rGraphic, const OUString& rLink )
{
    SetGraphic( rGraphic );
    maLink = rLink;
}

The problem here is that maLink and rLink are the same, and SetGraphic( rGraphic ) clears maLink, so the link is lost.

I'd change the line above to this:
pSwGrfNode->SetGraphic(aGrf, OUString(rGrfObj.GetLink()));
(maybe with a comment mentioning it's intentional, since rGrfObj is coming from pSwGrfNode a couple of lines earlier)

I tested this particular change in Windows with the previous version of GetLink() (returning reference), and it fixed the size of PDF export for me in Windows 7.

Noel, would you mind updating the fix?
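The aliasing described in the comment above can be reproduced outside LibreOffice. The following is only a minimal sketch under stated assumptions: `AliasingGraphicObject` and `linkAfterReset` are hypothetical names, `std::string` stands in for `OUString`, and an `int` stands in for `Graphic`. It models nothing but the pattern of a reference-returning getter feeding a setter that clears the member first.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for GraphicObject (not LibreOffice code).
// std::string replaces OUString and an int replaces Graphic.
class AliasingGraphicObject {
public:
    // Returns a reference to the member, like the reference-returning GetLink().
    const std::string& GetLink() const { return maLink; }

    // Single-argument overload: replaces the graphic and clears the link.
    void SetGraphic(int nGraphic) {
        mnGraphic = nGraphic;
        maLink.clear();
    }

    // Two-argument overload, shaped like the function quoted in the comment.
    void SetGraphic(int nGraphic, const std::string& rLink) {
        SetGraphic(nGraphic);  // clears maLink; if rLink aliases maLink, rLink is now empty too
        maLink = rLink;
    }

private:
    int mnGraphic = 0;
    std::string maLink;
};

// Sets a link, then sets a new graphic while passing the object's own link back in.
// bCopyFirst == false passes the reference straight through (the failing pattern);
// bCopyFirst == true passes a temporary copy (the call-site workaround suggested above).
inline std::string linkAfterReset(bool bCopyFirst) {
    AliasingGraphicObject aObj;
    aObj.SetGraphic(1, "file:///tmp/image.jpg");  // example URL, purely illustrative
    if (bCopyFirst)
        aObj.SetGraphic(2, std::string(aObj.GetLink()));
    else
        aObj.SetGraphic(2, aObj.GetLink());
    return aObj.GetLink();
}
```

With the reference passed straight through, the link comes back empty; with the explicit copy, it survives. The same effect explains why reverting GetLink() to return by value also helped: a by-value return hands the setter a temporary that cannot alias the member.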
Maybe if we changed it to detect self-assignment?

void GraphicObject::SetGraphic( const Graphic& rGraphic, const OUString& rLink )
{
    // avoid self-assignment, because SetGraphic clears maLink
    if ( rGraphic != this->maGraphic && rLink != this->maLink)
    {
        SetGraphic( rGraphic );
        maLink = rLink;
    }
}
Actually that should be

void GraphicObject::SetGraphic( const Graphic& rGraphic, const OUString& rLink )
{
    // avoid self-assignment, because SetGraphic clears maLink
    if ( rGraphic != this->maGraphic || rLink != this->maLink)
    {
        SetGraphic( rGraphic );
        maLink = rLink;
    }
}
@Steve Edmonds re comment #40 I have posted my results on bug #104479 comment #17.
Noel Grandin committed a patch related to this issue. It has been pushed to "libreoffice-5-2": http://cgit.freedesktop.org/libreoffice/core/commit/?id=e8b9fb81685db158b8b1285b2de627573a31ed76&h=libreoffice-5-2 tdf#101563 - Export to PDF with linked images creates huge PDF files. It will be available in 5.2.5. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
(In reply to Noel Grandin from comment #47)
> void GraphicObject::SetGraphic( const Graphic& rGraphic, const OUString&
> rLink )
> {
>     // avoid self-assignment, because SetGraphic clears maLink
>     if ( rGraphic != this->maGraphic || rLink != this->maLink)
>     {
>         SetGraphic( rGraphic );
>         maLink = rLink;
>     }
> }

You're right that this seems to be the best place to deal with this potential issue. I wouldn't combine the two conditions, though: if somehow the first is true (so rGraphic != this->maGraphic), but the second is false (so rLink == this->maLink, or rather &rLink == &this->maLink), the bug is still triggered. This might never happen with the current surrounding code, but can it be ruled out completely?

How about this:

void GraphicObject::SetGraphic( const Graphic& rGraphic, const OUString& rLink )
{
    // avoid self-assignment, because SetGraphic clears maLink
    if (rGraphic == this->maGraphic)
    {
        maLink = rLink;
    }
    else if (&rLink != &this->maLink)
    {
        SetGraphic( rGraphic );
        maLink = rLink;
    }
    else
    {
        OUString rLinkCopy;
        rLinkCopy = rLink;
        SetGraphic( rGraphic );
        maLink = rLinkCopy;
    }
}

(I haven't tested the code)
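The difference between comparing the link's value and comparing its address, which the comment above relies on, can be sketched in isolation. This is an illustrative sketch only: `LinkHolder`, `GetLinkByRef`, `GetLinkByValue` and `isSameObject` are hypothetical names, and `std::string` stands in for `OUString`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true only when both references name the very same object,
// regardless of whether their contents happen to be equal.
inline bool isSameObject(const std::string& rA, const std::string& rB) {
    return &rA == &rB;
}

// Hypothetical stand-in for an object that owns a link string.
struct LinkHolder {
    std::string maLink = "file:///tmp/image.jpg";  // example URL, purely illustrative

    // Hands out a reference to the member: the caller sees maLink itself.
    const std::string& GetLinkByRef() const { return maLink; }

    // Hands out a copy: equal contents, but a distinct object.
    std::string GetLinkByValue() const { return maLink; }
};
```

Only the address check tells a setter whether clearing its own member will also empty the argument. A value comparison cannot, because a separate string with equal contents is perfectly safe to assign from.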
Good point Aron. Something like this is probably simpler:

void GraphicObject::SetGraphic( const Graphic& rGraphic, const OUString& rLink )
{
    // in case we are called from a situation where rLink and maLink are the same thing,
    // we need a copy because SetGraphic clears maLink
    OUString sLinkCopy = rLink;
    SetGraphic( rGraphic );
    maLink = sLinkCopy;
}
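The copy-first approach proposed above can be checked with a standalone sketch. As before, this is an assumption-laden model rather than LibreOffice code: `CopyingGraphicObject` and `relinkToSelf` are hypothetical names, `std::string` replaces `OUString`, and an `int` replaces `Graphic`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for GraphicObject with the copy-first setter.
class CopyingGraphicObject {
public:
    const std::string& GetLink() const { return maLink; }

    // Single-argument overload: replaces the graphic and clears the link.
    void SetGraphic(int nGraphic) {
        mnGraphic = nGraphic;
        maLink.clear();
    }

    // Copy the incoming link before the member is cleared, so the call
    // is safe even when rLink is a reference to maLink itself.
    void SetGraphic(int nGraphic, const std::string& rLink) {
        const std::string sLinkCopy = rLink;
        SetGraphic(nGraphic);
        maLink = sLinkCopy;
    }

private:
    int mnGraphic = 0;
    std::string maLink;
};

// Passes the object's own link reference back into the setter and
// returns whatever link the object holds afterwards.
inline std::string relinkToSelf() {
    CopyingGraphicObject aObj;
    aObj.SetGraphic(1, "file:///tmp/image.jpg");  // example URL, purely illustrative
    aObj.SetGraphic(2, aObj.GetLink());           // aliased argument
    return aObj.GetLink();
}
```

The unconditional copy costs one extra string copy per call, but it needs no branching and protects every caller, not just the one call site in swbaslnk.cxx.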
Much simpler indeed. Looks good to me.
Aron/Noel: would one of you have a little time to submit a patch to gerrit with the change discussed in the last comments? (I could do it too if you want, just tell me :-))
Noel Grandin committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=24fa5d0570b997cc92f1fdf412f517f8d4021207 better fix for tdf#101563: Export to PDF creates huge PDF files It will be available in 5.4.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
I have installed yesterday's release 5.2.5.1. I'm pleased to say that this bug has been fixed! Thank you, everyone who played a part in fixing this bug.
I tested this today on 5.3.0.3, and again I'm pleased to report that it has also been fixed here. I think that this bug can be marked fixed. Thank you again.
Thank you for your feedback Paddy. Let's put this one to FIXED then.