Bug 160033 - soffice builds of pdf files are unreproducible
Summary: soffice builds of pdf files are unreproducible
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version (earliest affected): 24.2.0.3 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:pdf
Depends on:
Blocks: PDF-Export
Reported: 2024-03-04 21:05 UTC by Rene Engelhard
Modified: 2026-04-04 03:14 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
ODP test file used for illustration (13.34 KB, application/vnd.oasis.opendocument.presentation)
2024-03-28 18:08 UTC, tovrstra
HTML diff of PDFs with decompressed streams (30.93 KB, text/html)
2024-03-28 18:08 UTC, tovrstra

Description Rene Engelhard 2024-03-04 21:05:44 UTC
From https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1065448:

--- snip ---
Dear Maintainer,

When creating pdf files from odt files, soffice writes a CreationDate field
which contains the actual build date/time. This varies with every build.
For an example, see the bottom of
https://tests.reproducible-builds.org/debian/rb-pkg/trixie/amd64/diffoscope-results/winff.html

soffice could use the creation date of the input odt file,
or even SOURCE_DATE_EPOCH instead of the current system date.

Regards,
Peter
--- snip ---

I think there was an option to not add the field at all? But it is probably not exposed to command-line conversion unless you configure it separately?
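
For illustration only, here is a minimal sketch of how a timestamp taken from SOURCE_DATE_EPOCH could be turned into a PDF date string of the kind stored in /CreationDate. This is not LibreOffice code: the helper name pdf_date_from_epoch is made up, and the sketch assumes GNU date (for the -d "@seconds" syntax).

```shell
# Hypothetical helper: format an epoch value as a PDF date string in UTC.
# Uses the first argument if given, otherwise $SOURCE_DATE_EPOCH.
# Assumption: GNU date is available (BSD date uses -r instead of -d "@...").
pdf_date_from_epoch() {
    epoch="${1:-${SOURCE_DATE_EPOCH:-}}"
    if [ -z "$epoch" ]; then
        echo "SOURCE_DATE_EPOCH is not set" >&2
        return 1
    fi
    # PDF date strings look like D:YYYYMMDDHHmmSS followed by a zone; Z means UTC.
    date -u -d "@$epoch" +"D:%Y%m%d%H%M%SZ"
}
```

Calling `SOURCE_DATE_EPOCH=0 pdf_date_from_epoch` prints the same string on every run, which is the property a reproducible build needs.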
Comment 1 stragu 2024-03-19 07:37:37 UTC
Thorsten, what do you think?
Don't e.g. ODT files also store a timestamp at each save?
Comment 2 Rene Engelhard 2024-03-19 17:40:57 UTC
But odt files stay the same (unless changed and re-saved of course) so are by definition reproducible.

pdf files which (in this and other cases in Debian) are rebuilt on every package build from a .doc/.od? differ each time.

(Or, if one wants to go that route, the "source file" (od?) stays the same anyway and the "binary" (pdf) changes. That's a possible analogy)
Comment 3 tovrstra 2024-03-28 18:08:07 UTC
Created attachment 193373
ODP test file used for illustration
Comment 4 tovrstra 2024-03-28 18:08:59 UTC
Created attachment 193374
HTML diff of PDFs with decompressed streams
Comment 5 tovrstra 2024-03-28 18:19:03 UTC
Reproducibility indeed does not seem possible at this stage.

I've attached an example to show that there is more going on than just different time stamps.

Steps to reproduce the example:

1. Export slide.odp to PDF twice.
2. Decompress streams in the two PDFs (here named slide1.pdf and slide2.pdf) with mutool:

mutool clean -d slide1.pdf tmp1.pdf
mutool clean -d slide2.pdf tmp2.pdf

3. Generate diff html with vim:

vimdiff tmp1.pdf tmp2.pdf -c TOhtml -c 'w! diff.html' -c 'qa!'
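
Step 1 can also be scripted. The following is a rough sketch, assuming soffice is on PATH and that headless conversion with --convert-to behaves like the export used here; the function name export_twice is made up for this example.

```shell
# Sketch: export the same document to PDF twice and report whether the
# two results are byte-identical. Writes <name>1.pdf and <name>2.pdf
# into the current directory.
export_twice() {
    src="$1"                                  # e.g. slide.odp
    base="$(basename "${src%.*}")"            # e.g. slide
    for n in 1 2; do
        outdir="$(mktemp -d)" || return 1
        # soffice names the output after the input, so use a scratch dir.
        soffice --headless --convert-to pdf --outdir "$outdir" "$src" \
            >/dev/null || return 1
        mv "$outdir/$base.pdf" "${base}${n}.pdf" || return 1
        rmdir "$outdir"
    done
    # cmp -s exits 0 only if both files have exactly the same bytes.
    if cmp -s "${base}1.pdf" "${base}2.pdf"; then
        echo "identical"
    else
        echo "different"
    fi
}
```

Running `export_twice slide.odp` leaves slide1.pdf and slide2.pdf behind for steps 2 and 3.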

There are four points where the PDFs differ:

- A binary stream (length is also different).
- xmp:CreateDate tag.
- /CreationDate field.
- PDF Trailer ID, which is just a random blob.

As far as I understand, random trailer IDs are sometimes useful for document tracking, but they are not critical.
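
To look at the three metadata differences without a full diff, something like the sketch below could be used. It assumes the streams were decompressed first (step 2), that each field sits on a single line, and that the trailer ID is written as a bracketed array; the function name volatile_fields is made up, and the binary stream difference is not covered.

```shell
# Sketch: print the metadata fields that change between exports.
# grep -a treats the PDF as text; -o prints only the matching part.
volatile_fields() {
    grep -a -o \
        -e '/CreationDate *([^)]*)' \
        -e '<xmp:CreateDate>[^<]*</xmp:CreateDate>' \
        -e '/ID *\[[^]]*]' \
        "$1"
}
```

Running `volatile_fields tmp1.pdf` and `volatile_fields tmp2.pdf` side by side shows the timestamps and the trailer ID from each export.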

It would be helpful to have an option to create reproducible PDFs, e.g. with a command-line option, or to disable all variable parts when SOURCE_DATE_EPOCH is set.
Comment 6 stragu 2024-04-03 07:55:02 UTC
OK, let's set as new.
Comment 7 QA Administrators 2026-04-04 03:14:21 UTC
Dear Rene Engelhard,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug