Bug 119634 - PDF export with lossless compression without reduce resolution saves a very erratic size
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
(earliest affected) release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Keywords: bibisected, bisected, filter:pdf, regression
Depends on:
Blocks: PDF-Export Image-DPI Regressions-imageHandling
Reported: 2018-09-01 15:45 UTC by DM
Modified: 2022-02-08 13:14 UTC (History)
CC List: 7 users

See Also:
Crash report or crash signature:

Result of export to PDF (20.17 KB, image/png)
2018-09-03 19:31 UTC, Roman Kuznetsov
Document with the problem (13.63 MB, application/vnd.oasis.opendocument.text)
2018-09-24 17:33 UTC, Buovjaga
Unusual but consistent size (8.83 MB, application/vnd.oasis.opendocument.text)
2019-12-04 12:00 UTC, DM

Description DM 2018-09-01 15:45:41 UTC
In the following document (16 MB) -
If you open it and click the PDF button (with images set to save losslessly) several times in a row, overwriting the previous file each time and checking the resulting file size, the PDF sizes vary wildly from one export to the next.
Something is not right :)
Cheers, David

Steps to Reproduce:
Open the file and export it as PDF several times, noting the resulting PDF size each time

Actual Results:
150 MB
14 MB
75 MB
166 MB
52 MB
151 MB
50 MB
147 MB

Expected Results:
A fixed size roughly the size of the document

Reproducible: Always

User Profile Reset: No

Additional Info:
Version: (x64)
Build ID: efb621ed25068d70781dc026f7e9c5187a4decd1
CPU threads: 2; OS: Windows 10.0; UI render: default; 
Locale: en-GB (en_GB); Calc: group threaded
Comment 1 Roman Kuznetsov 2018-09-03 19:31:52 UTC
Created attachment 144646
Result of export to PDF
Comment 2 Roman Kuznetsov 2018-09-03 19:32:19 UTC
Not reproducible in:

Version: (x64)
Build ID: 2718b4a18dfcc6a54ebe5f7b801ee7a47fa81e0c
CPU threads: 4; OS: Windows 10.0; UI render: default; 
Locale: ru-RU (ru_RU); Calc: CL
Comment 3 Buovjaga 2018-09-24 17:33:57 UTC
Created attachment 145141
Document with the problem
Comment 4 Buovjaga 2018-09-24 17:37:10 UTC
Tried 4 times, but size is always 8,6 MiB

Maybe try using LibreOffice in Safe mode, Help - Restart in safe mode and then Continue in safe mode (or launch from the Windows Start menu entry)

Change back to UNCONFIRMED, if the problem persists. Change to RESOLVED WORKSFORME, if the problem went away.

Arch Linux 64-bit
Build ID: 8b1501d80dc9d3f42c351c6e026fa737e116cae5
CPU threads: 8; OS: Linux 4.18; UI render: default; VCL: gtk3_kde5; 
Locale: fi-FI (fi_FI.UTF-8); Calc: threaded
Built on 23 September 2018
Comment 7 DM 2019-10-14 12:44:41 UTC
Hi, there are no reproducible steps I can give because it is an erratic bug, but it happens frequently, and has just happened with the document I've just saved. My first PDF was over 300 MB; I saved again and it was 30 MB. As you'll have seen in my initial description, precisely the same sequence of actions produces different outcomes.
I think you need to recognise that there are a good many very serious erratic bugs in LibreOffice for which no reproducible steps can be given, yet which occur regularly, and closing such bugs is very unhelpful. What's needed is a status that marks a bug as serious but erratic, so that it can be watched for and kept in mind as people work on the code.
The system it has just occurred on is:
Version: (x64)
Build ID: 98b30e735bda24bc04ab42594c85f7fd8be07b9c
CPU threads: 2; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: en-GB (en_GB); UI-Language: en-GB
Calc: threaded
I can say that the PDF is more likely to come out correctly after closing and reopening the program and exporting with no action on the file, and that even a slight edit of a single word before exporting may cause the PDF to be 300 MB rather than 30 MB. But equally it may not.
Many thanks,
Comment 8 Buovjaga 2019-10-14 12:51:58 UTC
I guess some script could be created that saves over and over again and monitors the resulting file size
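
A minimal sketch of such a script, in Python, might look like the following. It is only an illustration under stated assumptions: `soffice` is assumed to be on the PATH, the names `soffice_export`, `collect_sizes` and `is_erratic` are made up for this example, and the 5% tolerance is arbitrary. The sketch does not set the lossless option itself, and which image compression settings a headless export picks up is not confirmed in this report. Each `--convert-to` run normally starts a fresh process, so it may not trigger a problem that only shows up on repeated exports within one session; in that case a different `export` callable that drives a running instance would be needed.

```python
import statistics
import subprocess
from pathlib import Path
from typing import Callable, List


def soffice_export(doc: Path, out_dir: Path) -> Path:
    """Export `doc` to PDF with a headless soffice process.

    Assumption: `soffice` is on PATH. The output keeps the document's
    base name, so each run overwrites the previous PDF.
    """
    subprocess.run(
        ["soffice", "--headless", "--convert-to", "pdf",
         "--outdir", str(out_dir), str(doc)],
        check=True,
    )
    return out_dir / (doc.stem + ".pdf")


def collect_sizes(export: Callable[[Path, Path], Path],
                  doc: Path, out_dir: Path, runs: int = 5) -> List[int]:
    """Run the export `runs` times and record each PDF size in bytes."""
    sizes = []
    for _ in range(runs):
        pdf = export(doc, out_dir)
        sizes.append(pdf.stat().st_size)
    return sizes


def is_erratic(sizes: List[int], tolerance: float = 0.05) -> bool:
    """True if the spread (max - min) exceeds `tolerance` times the median."""
    median = statistics.median(sizes)
    return (max(sizes) - min(sizes)) > tolerance * median
```

Calling `collect_sizes(soffice_export, Path("document.odt"), Path("out"))` and passing the result to `is_erratic` would flag a run like the one in the description, while a stable series of exports would not be flagged.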
Comment 9 DM 2019-10-14 12:55:41 UTC
What would really help with tracking bugs like this one, and in fact most bugs, would be a 'recorder' feature in LibreOffice that a user having a problem can switch on. It would make a copy of the file and then record all the actions and steps the user makes, so that when a problem occurs they can send you the file as it was opened together with the record of the actions applied to it. The record may hold 500 actions, and it may have been the 10th of the 500 that caused the problem. The programmers could then step through and watch for it using a halving method: apply the first 250 steps to see if the problem was caused, then move 125 steps to either side of that (i.e. 125 or 375 steps), then 63, 31, 15 and so on. That would quickly catch how just about every problem occurred.
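
The halving method described here is a binary search over the recorded actions. A rough sketch in Python follows; `first_bad_step` and `is_bad_after` are hypothetical names, and the replay itself is left to the caller. It assumes the problem is absent before any action is applied, present after all of them, and stays present once it has appeared. For an erratic bug that last assumption may not hold, so each check might need to be repeated several times.

```python
from typing import Callable


def first_bad_step(n_steps: int, is_bad_after: Callable[[int], bool]) -> int:
    """Return the smallest k such that replaying the first k actions shows the problem.

    `is_bad_after(k)` is supplied by the caller: it replays the first k
    recorded actions on a fresh copy of the file and reports whether the
    problem is present.
    """
    good, bad = 0, n_steps  # invariant: fine after `good` steps, broken after `bad`
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad_after(mid):
            bad = mid    # problem already present: look in the earlier half
        else:
            good = mid   # problem not yet present: look in the later half
    return bad
```

With 500 recorded actions this needs about 9 replays (roughly log2 of the number of actions) instead of up to 500.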
Comment 10 Buovjaga 2019-10-14 13:01:25 UTC
(In reply to DM from comment #9)
> What would really help with tracking such bugs as these - and in fact most
> bugs - would be to have a 'recorder' feature on LibreOffice

There actually is such a thing: https://wiki.documentfoundation.org/Development/UITests#Tools_for_writing_a_test

But I'm not sure if it would help in this case, if the command sequence is the same all the time.
Comment 11 DM 2019-10-14 13:18:34 UTC
Well, in my description of 2018-09-01 15:45:41 I gave a file which was 'reproducible', i.e. when opened it saved at erratic sizes, and I recall others were having the problem. Obviously not everyone found the same result. I should probably see what that file now does on the current version, but clearly the basic problem still exists.
The general problem I have is that serious bugs get closed merely because it's hard to reproduce an issue, as if closing a bug somehow made it go away. They need to stay alive in some way in order to be watched out for, because programmers will notice things as they program if they have in mind a list of serious bugs that can't be reproduced. I do find LibreOffice constantly mangles up the contents of so many of my files, typically ones involving tables with pictures and drawing items in, but I do persist with it because I need to be able to use the lossless PDF export, and I hope these serious bugs may get sorted as it's a good project.
Comment 12 DM 2019-12-04 12:00:56 UTC
Created attachment 156291
Unusual but consistent size

On a related note, the attached document is 9047 K but seemingly consistently saves as a 13666 K PDF (set to lossless images).
Generally I find that a lossless save (when it doesn't suffer from the vast random bloat mentioned in this thread) produces a PDF roughly the size of the source document, so this is an anomaly, and one I've noticed before on a few documents.
This occurs on the latest beta as well as 6.3.x.
(I will check for the random PDF bloat mentioned in this thread on the 6.4 line. I've been circumventing that issue by closing LibreOffice and reopening afresh, which usually results in a PDF size similar to the ODT rather than e.g. 10x the size, but it's best to be able to export to PDF without having to close and reopen.)

Version: (x64)
Build ID: 4d7e5b0c40ed843384704eca3ce21981d4e98920
CPU threads: 2; OS: Windows 10.0 Build 18362; UI render: default; VCL: win; 
Locale: en-GB (en_GB); UI-Language: en-GB
Calc: threaded
Comment 13 Timur 2019-12-04 17:19:45 UTC
Looks like I reproduced different sizes for lossless export without Reduce resolution, on LO 6.5+.
Sizes were: 16 MB, 161 MB, 147 MB, 128 MB, 147 MB.
With Reduce resolution set to 300 DPI, I got the same size every time, 78 MB.
With JPEG compression at 90%, I got the same size every time, 16 MB.

With LO 6.0, all sizes are the same, 16 MB. So, as reported, this seems to have started in 6.1.
Comment 14 DM 2019-12-04 22:57:40 UTC
Thanks Timur!
From the 16 MB reference, I think you are referring to the original example in Comment 1 (in contrast to the one in Comment 12).
Comment 16 Timur 2020-03-16 08:49:33 UTC
I previously tested attachment 145141 from comment 3 on Windows and reproduced the problem from 6.1 onwards.
I have now tested on Linux. The size is mostly 13,7 MB on the 1st export, but it becomes erratic if the export is repeated a few times (on the 2nd, 3rd or 4th). So it's best to try 5 times.

Question: can we set the image resolution for PDF export in headless mode?
If not, that's a good candidate for a new bug.
3b744b816e7f6291eb9e4bf87b6d920f8cd35ecb is the first bad commit
commit 3b744b816e7f6291eb9e4bf87b6d920f8cd35ecb
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Tue May 8 02:30:57 2018 +0200

    source c6cf2320d2a464594e759289c34796538d31f02b

commit 1b2ea80a9faf52e9b1b6312a25e646674425ef0f (HEAD, refs/bisect/good-1b2ea80a9faf52e9b1b6312a25e646674425ef0f)
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Tue May 8 02:07:23 2018 +0200

    source ecf50fe71596c3edba8ce437481ab80ae1cd8935

Single commit:

commit c6cf2320d2a464594e759289c34796538d31f02b
author Tomaž Vajngerl <tomaz.vajngerl@collabora.co.uk> Fri Apr 27 18:30:45 2018 +0900
committer Tomaž Vajngerl <quikee@gmail.com> Tue May 08 02:25:14 2018 +0200
tree 2f00c66eada2d30ed58d45836b1a75dc4a5f257d
parent ecf50fe71596c3edba8ce437481ab80ae1cd8935

config entries for the new graphic manager, deprecate old entries

Add 2 new GraphicManager config entries GraphicMemoryLimit and
GraphicAllowedIdleTime. At the same time, deprecate the existing
config entries used in GraphicObject's GraphicManager, which are
not relevant anymore.

Change-Id: Idb775e5e1a623f6c23d0c67fea5334a6c713c6c2
Reviewed-on: https://gerrit.libreoffice.org/53561
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Tomaž Vajngerl <quikee@gmail.com>

CC: Tomaž.