Bug 104479 - Export as PDF produces much larger PDFs
Summary: Export as PDF produces much larger PDFs
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.1.6.2 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Duplicates: 105045
Depends on:
Blocks:
 
Reported: 2016-12-07 21:34 UTC by Steve Edmonds
Modified: 2017-05-23 20:51 UTC
CC List: 9 users

See Also:
Crash report or crash signature:


Attachments
PDF of historical size. (2.16 MB, application/pdf)
2016-12-07 21:34 UTC, Steve Edmonds
PDF from 5.2.4 (7.40 MB, application/pdf)
2016-12-07 21:35 UTC, Steve Edmonds
the file being used in the transformation into pdf (1.45 MB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2017-05-08 14:43 UTC, Douglas C. R. Paes

Description Steve Edmonds 2016-12-07 21:34:12 UTC
Created attachment 129381
PDF of historical size.

PDF files produced with 5.2.4 are much larger than those produced with 5.1.6 or 5.2.3.3.
With image resolution reduced to 300 dpi, a PDF that used to be 6 MB is now 10.5 MB under 5.2.4.
With image resolution reduced to 150 dpi, the 6 MB file became 2.2 MB with 5.1.6, while the 10.5 MB file became 7.4 MB with 5.2.4.
Comment 1 Steve Edmonds 2016-12-07 21:35:47 UTC
Created attachment 129382
PDF from 5.2.4

PDF of increased size.
Comment 2 Steve Edmonds 2016-12-07 21:37:54 UTC
Full version information.
Version: 5.2.4.1
Build ID: 20m0(Build:1)
CPU Threads: 4; OS Version: Linux 3.16; UI Render: default; VCL: kde4; 
Locale: en-NZ (en_US.UTF-8); Calc: group
Comment 3 Steve Edmonds 2016-12-07 21:45:18 UTC
Writer file https://drive.google.com/open?id=0ByFEFUXgJhGkZ0ZQekRrY1dqeG8
Comment 4 Steve Edmonds 2016-12-07 22:24:31 UTC
Also noticed now in Version: 5.2.3.3, Build ID: 20m0(Build:3)
Comment 5 MM 2016-12-07 22:29:20 UTC
Possible dup of bug 101458.
Comment 6 Steve Edmonds 2016-12-07 23:09:58 UTC
Could be, but I don't notice it in 5.1.6. When 5.2.4 is released with the fix for bug 101458, I can check.
Comment 7 thackert 2016-12-10 16:08:09 UTC
Hello Steve, *,
thank you very much for reporting this bug :) I can reproduce it with

OS: Debian Testing AMD64
LO: Version: 5.2.3.3
Build ID: 1:5.2.3-2
CPU Threads: 4; OS Version: Linux 4.5; UI Render: default; VCL: x11;
Locale: de-DE (de_DE.UTF-8); Calc: group
(Debian's own version)

LO: Version: 5.2.3.3
Build ID: d54a8868f08a7b39642414cf2c8ef2f228f780cf
CPU Threads: 4; OS Version: Linux 4.5; UI Render: default;
Locale: de-DE (de_DE.UTF-8); Calc: group

LO: Version: 5.2.4.1
Build ID: 9b50003582f07ac674d6451e411e9b77cccd2b22
CPU Threads: 4; OS Version: Linux 4.5; UI Render: default; VCL: gtk2;
Locale: de-DE (de_DE.UTF-8); Calc: group

LO: Version: 5.2.3.1
Build ID: 01ec8f357e651ca9656837b783cf7e6a32ee4d92
CPU Threads: 4; OS Version: Linux 4.5; UI Render: default;
Locale: de-DE (de_DE.UTF-8); Calc: group

LO: Version: 5.2.0.4
Build ID: 066b007f5ebcc236395c7d282ba488bca6720265
CPU Threads: 4; OS Version: Linux 4.5; UI Render: default;
Locale: de-DE (de_DE.UTF-8)
(the last ones were all installed in parallel, following the instructions from https://wiki.documentfoundation.org/Installing_in_parallel/Linux)

but not in

LO: Version: 5.1.6.2
Build ID: 07ac168c60a517dba0f0d7bc7540f5afa45f0909
CPU Threads: 4; OS Version: Linux 4.5; UI Render: default;
Locale: de-DE (de_DE.UTF-8); Calc: group

so setting the status to "NEW" and changing the version to 5.2.0.4.
Comment 8 raal 2016-12-11 14:20:46 UTC
Hello,

Thank you for submitting the bug. The bug has previously been reported, so this bug will be added as a duplicate of it. You will automatically be CCed to updates made to the other bug.

*** This bug has been marked as a duplicate of bug 101563 ***
Comment 9 Paddy Landau 2016-12-11 15:03:52 UTC

*** This bug has been marked as a duplicate of bug 99723 ***
Comment 10 Paddy Landau 2016-12-11 15:04:33 UTC
This is not a duplicate of bug 101563, which is about something else, but of bug 99723.
Comment 11 Steve Edmonds 2016-12-11 18:17:12 UTC
(In reply to Paddy Landau from comment #9)
> 
> *** This bug has been marked as a duplicate of bug 99723 ***

This is not a duplicate of Bug 99723 - Setting image Compression in PDF export does not result in smaller file size.

Changing the compression in my file does reduce image size.
Comment 12 Steve Edmonds 2016-12-11 18:19:07 UTC
(In reply to Paddy Landau from comment #10)
> This is not a duplicate of bug 101563, which is for something else, but for
> 99723.

Correct, this is not a duplicate of bug 101563. Bug 101563 concerns linked images, whereas mine are all embedded.
Comment 13 Steve Edmonds 2016-12-11 18:26:59 UTC
Comment 11 should read "Changing the compression in my file does reduce file (PDF) size."
Comment 14 Steve Edmonds 2016-12-13 19:11:58 UTC
Delving further back, this has been a progressive issue and may be an accumulation of multiple bugs.
Using the same file (from comment 3), with 90% compression and image resolution reduced to 300 dpi:
LO 5.0.6.3. PDF 2.5MB
LO 5.1.6.2.0+ PDF 6.3MB
LO 5.2.4.1 PDF 11MB
Comment 15 Julien Nabet 2016-12-13 20:45:47 UTC
On pc Debian x86-64 with master sources updated today (so it includes https://cgit.freedesktop.org/libreoffice/core/commit/?id=b7f92a21a458fc6fa68894fbc881eda0a1e8325e), here are the results I get:
-rw-r--r-- 1 julien julien 22909072 déc.  13 21:41 PT252-PT253manual.odt
-rw-r--r-- 1 julien julien  7563159 déc.  13 21:43 PT252-PT253manual.pdf

I tested with 90% compression, reduced 300dpi image resolution
Comment 16 Steve Edmonds 2016-12-13 21:24:19 UTC
Thanks. As I get the same results I quoted on both Linux (x64) and Windows (x32), this probably means the commit for 101563 doesn't fix this bug.
Comment 17 Paddy Landau 2016-12-14 08:43:12 UTC
My results are as follows.

LO version  Compression   Size (MB)
5.1.6.2     Lossless        28.3
5.2.3.3     Lossless        28.3
5.4.0.0     Lossless        28.3

5.1.6.2     90%, 300 dpi     6.4
5.2.3.3     90%, 300 dpi    11.0
5.4.0.0     90%, 300 dpi     7.6
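
To make the pattern in the table easier to see, here is a minimal sketch that divides each size by the 5.1.6.2 size for the same export mode. The numbers are copied from the table above; treating 5.1.6.2 as the baseline, and the names `sizes_mb`, `BASELINE` and `growth_vs_baseline`, are assumptions made only for this illustration.

```python
# Sizes in MB copied from the table above, keyed by (version, export mode).
sizes_mb = {
    ("5.1.6.2", "lossless"): 28.3,
    ("5.2.3.3", "lossless"): 28.3,
    ("5.4.0.0", "lossless"): 28.3,
    ("5.1.6.2", "90%, 300 dpi"): 6.4,
    ("5.2.3.3", "90%, 300 dpi"): 11.0,
    ("5.4.0.0", "90%, 300 dpi"): 7.6,
}

BASELINE = "5.1.6.2"  # assumption: use 5.1.6.2 as the reference version


def growth_vs_baseline(sizes, baseline=BASELINE):
    """Return {(version, mode): size divided by the baseline size for that mode}."""
    return {
        (version, mode): size / sizes[(baseline, mode)]
        for (version, mode), size in sizes.items()
    }


# Print one line per measurement with its growth factor.
for (version, mode), factor in sorted(growth_vs_baseline(sizes_mb).items()):
    print(f"{version:8} {mode:13} x{factor:.2f}")
```

The lossless exports are identical across versions, while the 90%, 300 dpi export from 5.2.3.3 is roughly 1.7 times the 5.1.6.2 size, which suggests the regression sits in the lossy image path.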
Comment 18 Steve Edmonds 2016-12-14 18:25:25 UTC
Thanks, I think that confirms this bug is still present as we should be getting a 2.5MB file.
Comment 19 Aron Budea 2017-01-04 06:12:22 UTC Comment hidden (bibisection)
Comment 20 Aron Budea 2017-01-04 06:32:46 UTC
This traces back to the same commit as bug 99723. Adding Cc: to Michael Meeks.

https://cgit.freedesktop.org/libreoffice/core/commit/?id=76ec54e8c9f3580450bca85236a4f5af0c328588

author	Michael Meeks <michael.meeks@collabora.com>	2016-02-08 14:24:15 (GMT)
committer	Michael Meeks <michael.meeks@collabora.com>	2016-02-09 00:09:08 (GMT)

tdf#97662 - Try to preserve original compressed JPEGs harder.


The file in question contains 106 JPG/PNG images, ~22 MB altogether, but 20 images of size 0.2 to 2 MB make up almost all of that size (and 5 of those are PNGs).
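
For anyone who wants to repeat this kind of image inventory on their own document, a rough sketch follows. It relies on an .odt file being a ZIP package and assumes that embedded images live under `Pictures/`, which is how LibreOffice normally stores them. The function name `largest_images` and the suffix list are made up for this example, and the sizes it reports are the stored image sizes, not what ends up in the PDF.

```python
import zipfile

# Assumption: only JPEG and PNG images are of interest here.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")


def largest_images(odt_path, top=20):
    """Return (name, uncompressed size in bytes) for the biggest embedded images."""
    with zipfile.ZipFile(odt_path) as archive:
        images = [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if info.filename.startswith("Pictures/")
            and info.filename.lower().endswith(IMAGE_SUFFIXES)
        ]
    # Largest first, truncated to the requested number of entries.
    return sorted(images, key=lambda item: item[1], reverse=True)[:top]


# Example call (the file name is only an illustration):
# for name, size in largest_images("PT252-PT253manual.odt"):
#     print(f"{size / 1e6:6.2f} MB  {name}")
```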

The fix to bug 101458 is responsible for this change in size (in 5.2.4.2 it's roughly the same as in 5.4.0.0 below):

(In reply to Paddy Landau from comment #17)
> 5.2.3.3     90%, 300 dpi    11.0
> 5.4.0.0     90%, 300 dpi     7.6
Comment 21 raal 2017-01-08 13:59:18 UTC
*** Bug 105045 has been marked as a duplicate of this bug. ***
Comment 22 clubchef 2017-01-26 18:47:55 UTC
When will this bug (the PDF problem) be fixed?
In LO 5.2.5 it is unfortunately still present.
Comment 23 Aron Budea 2017-01-26 21:05:24 UTC
Clubchef, there's no ETA, but if it's causing you trouble, you could install an earlier version separately from the current one, and use that for exporting to PDF (5.0.6 and 5.1.1 are free from this regression). Details and download links are available here:
https://wiki.documentfoundation.org/Installing_in_parallel
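
If you want to script the comparison between a parallel installation and your current one, something along these lines might work. It is only a sketch: the `soffice` paths in the usage comment are hypothetical and depend on where you unpacked each version, and the helper names are invented for this example. Note also that a plain headless conversion does not necessarily use the same compression and resolution settings you pick in the export dialog, so the sizes may differ from the ones quoted in this report.

```python
import subprocess
from pathlib import Path


def build_export_command(soffice, document, outdir):
    """Build a headless PDF export command line for the given soffice binary."""
    return [
        str(soffice),
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(document),
    ]


def export_and_measure(soffice, document, outdir):
    """Run the export and return the size of the resulting PDF in bytes."""
    subprocess.run(build_export_command(soffice, document, outdir), check=True)
    pdf_path = Path(outdir) / (Path(document).stem + ".pdf")
    return pdf_path.stat().st_size


# Example (hypothetical install locations, adjust to your system):
# old = export_and_measure("/opt/lo-5.0.6/program/soffice", "manual.odt", "out-old")
# new = export_and_measure("/opt/lo-5.2.5/program/soffice", "manual.odt", "out-new")
# print(f"old: {old / 1e6:.1f} MB, new: {new / 1e6:.1f} MB")
```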
Comment 24 Xisco Faulí 2017-03-23 10:57:05 UTC
*** Bug 106627 has been marked as a duplicate of this bug. ***
Comment 25 Douglas C. R. Paes 2017-05-08 14:43:51 UTC
Created attachment 133160
the file being used in the transformation into pdf
Comment 26 Douglas C. R. Paes 2017-05-08 15:01:55 UTC
We are facing problems trying to export PPTX files to PDF.

All the files are provided as links because of the 10 MB attachment limit.

The original file (https://drive.google.com/open?id=0B2d7BMp8tlURbHdZNGJicjNJdG8) is 1.5 MB.

The LibreOffice-generated PDF (https://drive.google.com/open?id=0B2d7BMp8tlURNWQ0bGQ4YW9ONTQ) is 82.5 MB.

The same PPTX converted to PDF by Microsoft Office (https://drive.google.com/open?id=0B2d7BMp8tlURX1lsMERlYlNyMkU) is 10.7 MB.

Besides the size problem, the other problem we have is that CPU usage stays at 100% for a long time.

On a Mac with 16 GB of RAM and 4 cores, it took 8 minutes to finish (in the fastest try).
On another machine, with 16 GB of RAM and 8 cores, it took 5 minutes.
On our server, which runs Ubuntu 16.04 and also has 16 GB of RAM and 8 cores, it takes more than 10 minutes (this server runs other resource-intensive services, which is why it is slower).

I hope the provided files help in the problem investigation.

Let me know if you guys need more information from me.
Comment 27 Paddy Landau 2017-05-08 17:57:12 UTC
@Douglas C. R. Paes
There is the obvious question: are you exporting from MS and LO with the same settings, i.e. image compression and reduction?

I attempted this on Linux Ubuntu 16.04 (64-bit), with image compression 80% and size reduction to 150dpi.

Not only did the CPU hit 100% (one CPU at a time, which is to be expected), but RAM usage also hit 100%, with swap usage reaching 2 GB. There is clearly something wrong.
Comment 28 Telesto 2017-05-23 20:51:37 UTC
(In reply to Douglas C. R. Paes from comment #26)

> Besides the size problem, the other problem we have is the CPU usage that
> goes 100% for  much time.
I created a new report for this one (bug 108038) and another for comment 27 (bug 108037, because LibO is also crashing due to the memory usage).