Bug 101458 - PDF export makes the PDF twice as large because PNG images are not compressed
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: 5.1.5.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Marco Cecchetti
URL:
Whiteboard: target:5.3.0.1 target:5.4.0 target:5.2.4
Keywords:
Duplicates: 103910
Depends on:
Blocks:
 
Reported: 2016-08-11 19:10 UTC by Barto
Modified: 2016-12-13 14:33 UTC

See Also:
Crash report or crash signature:


Attachments
test case showing regression between 4.2 and the new beta (827.16 KB, application/gzip)
2016-11-29 13:44 UTC, Danny
Description Barto 2016-08-11 19:10:42 UTC
I use LibreOffice 5.1.5.2 on 64-bit Arch Linux.

I noticed a bug in the "export PDF" function in Writer: the generated PDF is twice as large as usual. It seems that PNG images are not compressed, even if the user has set the "JPEG compression" option with a ratio in % in the PDF export dialog.

It is probably a bug in the source code of the "export PDF" feature in LibreOffice: when a document contains images, the export seems to skip compressing them, so the generated PDF is too large.
Comment 1 Aron Budea 2016-08-11 19:28:29 UTC
Please monitor duplicate bug 99723 for updates.

*** This bug has been marked as a duplicate of bug 99723 ***
Comment 2 Danny 2016-11-28 18:27:21 UTC
Can confirm this bug on the 5.3.0beta1. 

Either bug 99723 is not fixed (but I think it is, the difference is just that that bug was fixed by looking at recompressing jpeg files only), or this was not a dupe after all.

How to reproduce:
1) Create a single slide with one inserted png image.
2) Export to pdf with 60% lossy compression 
3) Export to another pdf with lossless compression.
4) Observe that file sizes are identical
5) Compare images in the pdf:
$ pdfimages -list test2.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    1152   591  rgb     3   8  image  no         4  0   141   141 24.0K 1.2%

$ pdfimages -list test1.pdf 
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    1152   591  rgb     3   8  image  no         4  0   141   141 24.0K 1.2%
Comment 3 Julien Nabet 2016-11-28 21:14:36 UTC
I don't know if png should be recompressed.
However, I noticed this patch https://cgit.freedesktop.org/libreoffice/core/commit/?id=eab3c3ab9da5f0282df43d2f4bfbf17f7a4f8fe3

If I comment out these lines:
if ( bIsPng || ( aSizePixel.Width() < 32 ) || ( aSizePixel.Height() < 32 ) )
    bUseJPGCompression = false;

(or if the "bIsPng" test is removed), JPEG compression is re-enabled for PNG images.
See http://opengrok.libreoffice.org/xref/core/vcl/source/gdi/pdfwriter_impl2.cxx#163
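
To make the effect of the quoted condition concrete, here is a minimal stand-alone sketch. It is not the code from pdfwriter_impl2.cxx: the function name useJpegCompression and its parameters are hypothetical, plain long values stand in for aSizePixel.Width() and aSizePixel.Height(), and "requested" stands in for whatever value bUseJPGCompression had before the check.

```cpp
// Hypothetical stand-alone model of the quoted check (not LibreOffice code).
// requested: value of the JPEG-compression flag before the check (assumption).
// isPng:     stands in for bIsPng.
// width/height: stand in for aSizePixel.Width() / aSizePixel.Height().
bool useJpegCompression(bool requested, bool isPng, long width, long height)
{
    bool useJpg = requested;
    // Same shape as the quoted lines: any PNG, or any image with a side
    // shorter than 32 pixels, switches JPEG compression off.
    if (isPng || (width < 32) || (height < 32))
        useJpg = false;
    return useJpg;
}
```

Under these assumptions, useJpegCompression(true, true, 1152, 591) returns false, so a PNG of the size shown in comment 2 is never JPEG-compressed, while the same call with isPng set to false returns true.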

Also, I made a quick test at different quality settings (on a Debian x86-64 PC with master sources updated today):
- with 60%, the file is bigger than the lossless export
- with 30%, the file is smaller

-rw-r--r-- 1 julien julien  80101 nov.  28 22:04 test_loss30.pdf
-rw-r--r-- 1 julien julien 104462 nov.  28 22:04 test_loss60.pdf
-rw-r--r-- 1 julien julien 101328 nov.  28 22:01 test_lossless.pdf
-rw-r--r-- 1 julien julien 125998 nov.  28 21:45 test.odp
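
As a rough check of the listing above, the relative size of each lossy export against the lossless one can be computed as follows. This is only an arithmetic sketch; the helper name percentChange is hypothetical and the byte counts are the ones from the listing.

```cpp
// Percentage change of `size` relative to `baseline`
// (positive = larger than baseline, negative = smaller).
double percentChange(long baseline, long size)
{
    return 100.0 * static_cast<double>(size - baseline)
                 / static_cast<double>(baseline);
}

// Byte counts taken from the listing above.
const long kLossless = 101328; // test_lossless.pdf
const long kLoss60   = 104462; // test_loss60.pdf
const long kLoss30   = 80101;  // test_loss30.pdf
```

With these numbers, percentChange(kLossless, kLoss60) is roughly +3% and percentChange(kLossless, kLoss30) is roughly -21%.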

Michael/Marco: any thoughts here?
Comment 4 Danny 2016-11-29 13:17:03 UTC
>I don't know if png should be recompressed.

I guess that is the key question. But in my use case (many slides with a lot of embedded PNGs and SVGs) I assumed that keeping the Impress file high quality and future-proof justified not converting every single image to JPG (if there were a global option to do this, whatever PDF export does would be less important).

For distribution to others, an export to lower quality pdf is very useful. And I assumed that is what this option is there for. And if there is an option that says it will recompress images in a lossy format I expect it to do exactly that....

On average my files are 5-10x bigger compared to last year.

For very small images I agree that PNG or JPG does not matter much (and PNG is probably better). But simply ignoring PNGs when I select a lossy PDF export does not sound right either. It seems the patch you linked is related to bug 97662.


I do not completely understand your test result (in my case, it seemed selected lossy did not do anything). So I am curious if you could do a check with pdfimages.
Comment 5 Danny 2016-11-29 13:44:00 UTC
Created attachment 129121
test case showing regression between 4.2 and the new beta

Added a test case. Note that I am not talking about very small pngs. Large ones are simply ignored when selecting lossy in pdf export.
Comment 6 Julien Nabet 2016-11-29 14:04:29 UTC
(In reply to Danny from comment #4)
> >I don't know if png should be recompressed.
> 
> ...
> I do not completely understand your test result (in my case, it seemed
> selected lossy did not do anything). So I am curious if you could do a check
> with pdfimages.
I ran the test with the lines I quoted from that patch commented out, i.e. with JPEG compression re-enabled.

Then again, since the option is called "JPEG compression" and the tested file contains PNG images (not JPEG images), the current behaviour is perhaps expected.
Comment 7 Danny 2016-11-29 14:56:43 UTC
Thanks for the clarification! I now also understand my confusion:

if ( bIsPng || ( aSizePixel.Width() < 32 ) || ( aSizePixel.Height() < 32 ) )

I actually thought this said "bIsPng &&", since I would find that logical (also because of the parentheses)....
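
The difference between the two readings can be sketched as follows. Both functions are hypothetical illustrations, not LibreOffice code, and the "&&" variant is only one possible way to write the condition as I first read it.

```cpp
// As quoted: JPEG compression is skipped for any PNG, and also for any
// image with a side shorter than 32 pixels.
bool skipsJpegWithOr(bool isPng, long width, long height)
{
    return isPng || (width < 32) || (height < 32);
}

// Hypothetical "&&" reading: JPEG compression is skipped only for PNGs
// that are also small.
bool skipsJpegWithAnd(bool isPng, long width, long height)
{
    return isPng && ((width < 32) || (height < 32));
}
```

For a 1152x591 PNG the "||" version skips JPEG compression and the "&&" version does not; for a 16x16 PNG both skip it.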

Anyway, let's wait for somebody to decide what the correct behaviour of JPEG compression is supposed to be. I see what you are getting at with "JPEG compression", but if it is not supposed to recompress all images, I guess I should file an RFE for a global recompression option; I can see little real-life use for an option that only recompresses already compressed JPGs on export (I would personally only use that option to sacrifice quality for size).
Comment 8 Danny 2016-11-30 10:08:37 UTC
*** Bug 103910 has been marked as a duplicate of this bug. ***
Comment 9 Commit Notification 2016-11-30 11:53:58 UTC
Marco Cecchetti committed a patch related to this issue.
It has been pushed to "libreoffice-5-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=dbdf91f93ee975b1b9b6bfc6d9571e13ca0e750e&h=libreoffice-5-3

tdf#101458 - check PNG for adequate compression

It will be available in 5.3.0.1.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 10 Commit Notification 2016-11-30 11:56:49 UTC
Marco Cecchetti committed a patch related to this issue.
It has been pushed to "libreoffice-5-2":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1fd5c8080c47e75fff4aa377540ced29142da146&h=libreoffice-5-2

tdf#101458 - check PNG for adequate compression

It will be available in 5.2.5.
Comment 11 Commit Notification 2016-11-30 11:58:39 UTC
Marco Cecchetti committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=fed7de57d23b43f0a0cb2bcf5f0fbefe5852de2e

tdf#101458 - check PNG for adequate compression

It will be available in 5.4.0.
Comment 12 Aron Budea 2016-12-04 18:56:44 UTC
Marco, could you also backport it to branch 5.2.4?
Comment 13 Julien Nabet 2016-12-04 22:04:11 UTC
(In reply to Aron Budea from comment #12)
> Marco, could you also backport it to branch 5.2.4?

The patch is now in review here: https://gerrit.libreoffice.org/#/c/31605/1
Comment 14 Commit Notification 2016-12-13 11:33:17 UTC
Marco Cecchetti committed a patch related to this issue.
It has been pushed to "libreoffice-5-2-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=0c937f5d692477371bf2fe367a710f0899e36c33&h=libreoffice-5-2-4

tdf#101458 - check PNG for adequate compression

It will be available in 5.2.4.
Comment 15 Julien Nabet 2016-12-13 14:33:10 UTC
targets cleaning