Bug 136805 - Export to PDF/A-1a is not PDF/A conformant
Summary: Export to PDF/A-1a is not PDF/A conformant
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 7.0.1.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Jan-Marek Glogowski
URL:
Whiteboard: target:7.1.0 target:7.0.2
Keywords: bibisected, regression
Depends on:
Blocks: PDF-Export
Reported: 2020-09-16 09:56 UTC by jan.prochaska
Modified: 2020-10-03 18:00 UTC

See Also:
Crash report or crash signature:


Attachments
Input PDF (not a PDF/A-1a) (15.00 KB, application/pdf)
2020-09-16 09:56 UTC, jan.prochaska
Output from version 6.4.6 (is really a PDF/A-1a) (13.27 KB, application/pdf)
2020-09-16 09:57 UTC, jan.prochaska
Output from version 7.0.1 (is claimed as PDF/A-1a, but in fact isn't due to the error in structure) (15.11 KB, application/pdf)
2020-09-16 09:58 UTC, jan.prochaska

Description jan.prochaska 2020-09-16 09:56:53 UTC
Created attachment 165562 [details]
Input PDF (not a PDF/A-1a)

We use LibreOffice (called via UNO) to convert PDF documents to PDF/A-1a format.
Version 6.4.6 (and earlier) produced PDF/A-1a reliably (i.e. the result claimed to be PDF/A-1a and was a valid PDF/A-1a when checked with a validator, see below). Version 7.0.1 creates a document, but it does not conform to the PDF/A specification and the 1a profile.

We are using the VeraPDF validator and the PDFs coming from the new version suffer from this problem (as claimed by the validator):

Specification: ISO 19005-1:2005, Clause: 6.7.3, Test number: 1
If a document information dictionary does appear at a document, then all of its entries that have analogous properties in predefined XMP schemas, shall also be embedded in the file in XMP form with equivalent values.
Result: Failed (1 occurrence)
CosDocument	
doesInfoMatchXMP	
root
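
For context, clause 6.7.3 pairs each document information dictionary entry with an XMP property (Title with dc:title, Author with dc:creator, Subject with dc:description, Keywords with pdf:Keywords, Creator with xmp:CreatorTool, Producer with pdf:Producer, CreationDate with xmp:CreateDate, ModDate with xmp:ModifyDate). Below is a minimal Java sketch of that kind of check. It is not how VeraPDF implements doesInfoMatchXMP: the class name, method name and sample values are hypothetical, and it only tests that a property is present, not that its value is equivalent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class InfoXmpCheck {
    // Info dictionary key -> analogous XMP property (ISO 19005-1:2005, clause 6.7.3).
    static final Map<String, String> INFO_TO_XMP = new LinkedHashMap<>();
    static {
        INFO_TO_XMP.put("Title", "dc:title");
        INFO_TO_XMP.put("Author", "dc:creator");
        INFO_TO_XMP.put("Subject", "dc:description");
        INFO_TO_XMP.put("Keywords", "pdf:Keywords");
        INFO_TO_XMP.put("Creator", "xmp:CreatorTool");
        INFO_TO_XMP.put("Producer", "pdf:Producer");
        INFO_TO_XMP.put("CreationDate", "xmp:CreateDate");
        INFO_TO_XMP.put("ModDate", "xmp:ModifyDate");
    }

    /** Returns the XMP properties that the Info entries call for but the packet lacks. */
    static List<String> missingXmpProperties(Map<String, String> info, String xmpPacket) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> e : INFO_TO_XMP.entrySet()) {
            if (!info.containsKey(e.getKey())) {
                continue; // entry absent from the Info dictionary, nothing to mirror
            }
            String prop = e.getValue();
            // Naive presence test: element form (<xmp:CreatorTool> or <xmp:CreatorTool ...>)
            // or attribute form (xmp:CreatorTool="...").
            boolean present = xmpPacket.contains("<" + prop + ">")
                    || xmpPacket.contains("<" + prop + " ")
                    || xmpPacket.contains(prop + "=");
            if (!present) {
                missing.add(prop);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from the attached files.
        Map<String, String> info = new LinkedHashMap<>();
        info.put("Creator", "Draw");
        info.put("Producer", "LibreOffice 7.0");
        info.put("CreationDate", "D:20200916115600+02'00'");
        String xmp = "<rdf:Description><pdf:Producer>LibreOffice 7.0</pdf:Producer></rdf:Description>";
        System.out.println(missingXmpProperties(info, xmp));
    }
}
```

With the made-up sample data in main this prints [xmp:CreatorTool, xmp:CreateDate].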

The problem can be reproduced with the VeraPDF CLI utility or with its online demo (https://demo.verapdf.org/).
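
One possible way to script the CLI check from Java, as a rough sketch only: it assumes a `verapdf` executable is on the PATH and uses the CLI's `--flavour` option to select the 1a profile. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical wrapper, for illustration only.
class VeraPdfRunner {
    /** Builds the command line; assumes "verapdf" is resolvable via PATH. */
    static List<String> buildCommand(Path pdf, String flavour) {
        return List.of("verapdf", "--flavour", flavour, pdf.toString());
    }

    /** Runs the validator, forwards its report to this process's stdout/stderr, returns its exit code. */
    static int validate(Path pdf, String flavour) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pdf, flavour))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: VeraPdfRunner <file.pdf>");
            return;
        }
        int exitCode = validate(Path.of(args[0]), "1a");
        System.out.println("verapdf exit code: " + exitCode);
    }
}
```

Run against the 6.4.6 and 7.0.1 outputs in turn, the printed reports can then be compared for the clause 6.7.3 failure.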

I've attached the original document and the output from 6.4.6 and 7.0.1.

Kind regards,
Jan Prochaska
Comment 1 jan.prochaska 2020-09-16 09:57:22 UTC
Created attachment 165563 [details]
Output from version 6.4.6 (is really a PDF/A-1a)
Comment 2 jan.prochaska 2020-09-16 09:58:01 UTC
Created attachment 165564 [details]
Output from version 7.0.1 (is claimed as PDF/A-1a, but in fact isn't due to the error in structure)
Comment 3 jan.prochaska 2020-09-16 12:12:05 UTC
Summary:
LibreOffice 7.0.1 is not really producing PDF/A documents.

Steps to Reproduce:
We use the UNO interface on a localhost socket
with the following filter parameter:

filterParam.Name = "SelectPdfVersion";
filterParam.Value = Integer.valueOf(1); // 0 = PDF 1.4 (default selection). 1 = PDF/A-1 (ISO 19005-1:2005)
filterParams.add(filterParam);

As we discovered later, PDF/A-1a is no longer an option in the LibreOffice 7.0.1 GUI (so I do not know whether parameter value 1 is still valid), but the behaviour is reproducible via the GUI as well:
1. Open the attached PDF in LibreOffice.
2. Choose File -> Export as PDF with PDF/A checked (in the new version we tried PDF/A-1b).

Actual Results:  
A PDF is produced and its metadata claims a PDF/A conformance level/profile.
In fact there is a problem with the structure (please see the first post).

Expected Results:
The PDF produced is really a PDF/A and passes validation checks. Right now 7.0.1 fails, while 6.4.6 (and earlier) seem to do fine.

Reproducible: Always
Comment 4 Jan-Marek Glogowski 2020-09-16 22:10:05 UTC
Regression from commit

commit d016e052ddf30649ad9b729b59134ce1e90a0263

    pdf: extract XMP metadata writing and use XmlWriter

Pending patch: https://gerrit.libreoffice.org/c/core/+/102888

It would be nice to optionally use VeraPDF as a verifier for our PDF export unit tests, just like the ODF validator, but that is a lot more work to implement.
Comment 5 Commit Notification 2020-09-17 05:34:12 UTC
Jan-Marek Glogowski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/06e35b3289090ec623fe5284976ee6f40681e1d5

tdf#136805 PDF export: re-add XMP basic meta data

It will be available in 7.1.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 6 Commit Notification 2020-09-17 08:47:56 UTC
Jan-Marek Glogowski committed a patch related to this issue.
It has been pushed to "libreoffice-7-0":

https://git.libreoffice.org/core/commit/4684f8e09ac540a85b843b2306a9e9edeb8c17ec

tdf#136805 PDF export: re-add XMP basic meta data

It will be available in 7.0.3.

Comment 7 Commit Notification 2020-09-17 14:40:33 UTC
Jan-Marek Glogowski committed a patch related to this issue.
It has been pushed to "libreoffice-7-0-2":

https://git.libreoffice.org/core/commit/749b206f1d2e31128762be9d727f1dc4c764aed5

tdf#136805 PDF export: re-add XMP basic meta data

It will be available in 7.0.2.

Comment 8 jan.prochaska 2020-10-03 17:55:22 UTC
I have retested 7.0.2.2.
The problem is gone.
Thank you.
Kind regards,
Jan Prochaska