From https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1065448:

--- snip ---
Dear Maintainer,

When creating pdf files from odt files, soffice writes a CreationDate field which contains the actual build date/time. This varies with every build. For an example, see the bottom of
https://tests.reproducible-builds.org/debian/rb-pkg/trixie/amd64/diffoscope-results/winff.html

soffice could use the creation date of the input odt file, or even SOURCE_DATE_EPOCH instead of the current system date.

Regards,
Peter
--- snip ---

I think there was an option to not add the field at all? But this is probably not exposed to command-line conversion unless you configure it separately?
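To illustrate the SOURCE_DATE_EPOCH suggestion from the Debian report, here is a minimal sketch in Python (not LibreOffice code). It assumes the timestamp is written as a PDF date string in UTC; the function name `pdf_creation_date` and the `environ` parameter are made up for this example.

```python
import os
from datetime import datetime, timezone


def pdf_creation_date(environ=os.environ):
    """Return a PDF date string (D:YYYYMMDDHHmmSSZ) in UTC.

    Hypothetical helper: prefers SOURCE_DATE_EPOCH when it is set,
    otherwise falls back to the current system time.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        # SOURCE_DATE_EPOCH is a decimal count of seconds since the Unix epoch.
        moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    else:
        moment = datetime.now(timezone.utc)
    return moment.strftime("D:%Y%m%d%H%M%SZ")
```

With SOURCE_DATE_EPOCH=0 this yields D:19700101000000Z on every run, so the field no longer changes between builds.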
Thorsten, what do you think? Don't e.g. ODT files also store a timestamp at each save?
But odt files stay the same (unless changed and re-saved, of course), so they are reproducible by definition. pdf files, which (in this and other cases in Debian) are regenerated from a .doc/.od? on every package build, differ each time. (Or, if one wants to go that route: the "source file" (od?) stays the same anyway, while the "binary" (pdf) changes. That's a possible analogy.)
Created attachment 193373: ODP test file used for illustration
Created attachment 193374: HTML diff of PDFs with decompressed streams
Reproducibility indeed does not seem possible at this stage. I've attached an example to show that there is more going on than just different time stamps.

Steps to reproduce the example:
1. Export slide.odp to PDF twice.
2. Decompress streams in the two PDFs with
   mutool clean -d slide1.pdf tmp1.pdf
   mutool clean -d slide2.pdf tmp2.pdf
3. Generate diff html with vim:
   vimdiff tmp1.pdf tmp2.pdf -c TOhtml -c 'w! diff.html' -c 'qa!'

There are four points where the PDFs differ:
- A binary stream (length is also different).
- xmp:CreateDate tag.
- /CreationDate field.
- PDF Trailer ID, which is just a random blob.

As far as I understand, random trailer IDs are sometimes useful for document tracking, but they are not critical.

It would be helpful to have an option to create reproducible PDFs, e.g. with a command-line option, or to disable all variable parts when SOURCE_DATE_EPOCH is set.
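To separate the three metadata differences from the binary stream difference, the decompressed files can be compared after masking the known variable fields. This is a rough sketch under some assumptions: the inputs are the tmp1.pdf/tmp2.pdf files produced by mutool above, the trailer ID is written as two hex strings, and the regular expressions are simplified guesses at how the fields are serialized. The helper names are invented for this example.

```python
import re

# Simplified patterns for the variable fields listed above (assumed layout).
VOLATILE = [
    (re.compile(rb"/CreationDate\s*\([^)]*\)"),
     b"/CreationDate(MASKED)"),
    (re.compile(rb"<xmp:CreateDate>[^<]*</xmp:CreateDate>"),
     b"<xmp:CreateDate>MASKED</xmp:CreateDate>"),
    (re.compile(rb"/ID\s*\[\s*<[0-9A-Fa-f]*>\s*<[0-9A-Fa-f]*>\s*\]"),
     b"/ID[MASKED]"),
]


def mask_volatile(data: bytes) -> bytes:
    """Replace the date fields and the trailer ID with fixed placeholders."""
    for pattern, replacement in VOLATILE:
        data = pattern.sub(replacement, data)
    return data


def differ_only_in_metadata(a: bytes, b: bytes) -> bool:
    """True if two PDFs are byte-identical once the variable fields are masked."""
    return mask_volatile(a) == mask_volatile(b)


def compare_files(path_a: str, path_b: str) -> bool:
    """Read two decompressed PDFs from disk and compare them after masking."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return differ_only_in_metadata(fa.read(), fb.read())
```

Calling compare_files("tmp1.pdf", "tmp2.pdf") on the attached example would be expected to return False, because the differing binary stream is not masked. A diff of the two masked byte strings would then show only that stream.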
OK, let's set as new.