Bug 113448 - PDF/A-1a generated is not compliant
Summary: PDF/A-1a generated is not compliant
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: 5.2.7.2 release (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard: target:6.3.0 target:6.4.0 target:6.3.0.1
Keywords:
Depends on:
Blocks: PDF-Export
Reported: 2017-10-25 14:58 UTC by paolog
Modified: 2019-09-02 08:20 UTC

See Also:
Crash report or crash signature:


Attachments
LibreOffice test files and the exported (would be) PDF/A-1a files (4.16 MB, application/zip)
2017-10-25 15:00 UTC, paolog
PDF/A-1a exported on Windows with LO 5.4.2.2 (22.65 KB, application/pdf)
2017-10-28 09:19 UTC, paolog
Verapdf 1a validation report for Aaa_5.4.2.2.pdf (1.83 KB, text/xml)
2017-10-28 09:20 UTC, paolog

Description paolog 2017-10-25 14:58:04 UTC
Description:
I have validated the PDFs exported from LibreOffice Calc using VeraPDF, and sometimes they do not validate.

I attach two examples with differing errors, and a file that passes.

Steps to Reproduce:
1. From the attached files Aaa.ods or Bbb2.ods, export to PDF with the PDF/A-1a checkbox ticked

2. Validate the resulting PDF with verapdf:

/opt/verapdf/verapdf -f 1a Aaa.pdf
/opt/verapdf/verapdf -f 1a Bbb2.pdf
/opt/verapdf/verapdf -f 1a Ccc.pdf
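
The three commands above could also be scripted. This is only a minimal Python sketch: it assumes the veraPDF launcher is installed at /opt/verapdf/verapdf as in this report, it uses only the -f 1a option shown above, and the helper names build_command and validate are made up for illustration.

```python
import subprocess

# Install path used in this report; adjust for your system.
VERAPDF = "/opt/verapdf/verapdf"


def build_command(pdf_path, flavour="1a"):
    """Build the same command line as above: verapdf -f <flavour> <file>."""
    return [VERAPDF, "-f", flavour, pdf_path]


def validate(pdf_path):
    """Run veraPDF on one file and return whatever it prints (the XML report)."""
    result = subprocess.run(
        build_command(pdf_path), capture_output=True, text=True
    )
    return result.stdout
```

Calling validate("Aaa.pdf"), validate("Bbb2.pdf") and validate("Ccc.pdf") would then give the three reports as strings, provided veraPDF is actually present at that path.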


Actual Results:  
Aaa.pdf fails with:

<rule specification="ISO 19005-1:2005" clause="6.8.3" testNumber="1" status="failed" passedChecks="0" failedChecks="1">
  <description>The logical structure of the conforming file shall be described by a structure hierarchy rooted in the StructTreeRoot entry of the document catalog dictionary, as described in PDF Reference 9.6</description>
  <object>PDDocument</object>
  <test>StructTreeRoot_size == 1</test>
  <check status="failed">
    <context>root/document[0]</context>
  </check>
</rule>

Bbb2.pdf fails with:

<rule specification="ISO 19005-1:2005" clause="6.3.8" testNumber="1" status="failed" passedChecks="0" failedChecks="9">
  <description>The font dictionary shall include a ToUnicode entry whose value is a CMap stream object that maps character codes to Unicode values, 
              as described in PDF Reference 5.9, unless the font meets any of the following three conditions:
              (*) fonts that use the predefined encodings MacRomanEncoding, MacExpertEncoding or WinAnsiEncoding, or that use the predefined Identity-H or Identity-V CMaps;
              (*) Type 1 fonts whose character names are taken from the Adobe standard Latin character set or the set of named characters in the Symbol font, as defined in PDF Reference Appendix D;
              (*) Type 0 fonts whose descendant CIDFont uses the Adobe-GB1, Adobe-CNS1, Adobe-Japan1 orAdobe-Korea1 character collections.</description>
  <object>Glyph</object>
  <test>toUnicode != null</test>
  <check status="failed">
    <context>root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[173]/usedGlyphs[2](BAAAAA+LiberationSans 22 0)</context>
  </check>
  <check status="failed">
    <context>root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[102]/usedGlyphs[0](BAAAAA+LiberationSans 18 0)</context>
  </check>
  <check status="failed">
    <context>root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[173]/usedGlyphs[0](BAAAAA+LiberationSans 20 0)</context>
  </check>
  <check status="failed">
    <context>root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[102]/usedGlyphs[1](BAAAAA+LiberationSans 19 0)</context>
  </check>
  <check status="failed">
    <context>root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[68]/usedGlyphs[3](BAAAAA+LiberationSans 16 0)</context>
  </check>
  <check status="failed">
    <context>root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[68]/usedGlyphs[4](BAAAAA+LiberationSans 17 0)</context>
  </check>
  <check status="failed">
    <context>root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[173]/usedGlyphs[1](BAAAAA+LiberationSans 21 0)</context>
  </check>
  <check status="failed">
    <context>root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[68]/usedGlyphs[2](BAAAAA+LiberationSans 15 0)</context>
  </check>
  <check status="failed">
    <context>root/document[0]/pages[0](1 0 obj PDPage)/contentStream[0](2 0 obj PDContentStream)/operators[68]/usedGlyphs[1](BAAAAA+LiberationSans 14 0)</context>
  </check>
</rule>

Expected Results:
Ccc.pdf is an example of a file which passes:

<?xml version="1.0" encoding="utf-8"?>
<report>
  <buildInformation>
    <releaseDetails id="core" version="1.6.2" buildDate="2017-06-05T20:07:00+02:00"></releaseDetails>
    <releaseDetails id="validation-model" version="1.6.2" buildDate="2017-06-12T10:32:00+02:00"></releaseDetails>
    <releaseDetails id="gui" version="1.6.3" buildDate="2017-06-12T10:44:00+02:00"></releaseDetails>
  </buildInformation>
  <jobs>
    <job>
      <item size="26665">
        <name>/home/paolog/Ccc.pdf</name>
      </item>
      <validationReport profileName="PDF/A-1A validation profile" statement="PDF file is compliant with Validation Profile requirements." isCompliant="true">
        <details passedRules="107" failedRules="0" passedChecks="844" failedChecks="0"></details>
      </validationReport>
      <duration start="1508943351038" finish="1508943351871">00:00:00.833</duration>
    </job>
  </jobs>
  <batchSummary totalJobs="1" failedToParse="0" encrypted="0">
    <validationReports compliant="1" nonCompliant="0" failedJobs="0">1</validationReports>
    <featureReports failedJobs="0">0</featureReports>
    <repairReports failedJobs="0">0</repairReports>
    <duration start="1508943350954" finish="1508943351917">00:00:00.963</duration>
  </batchSummary>
</report>

The only change from Bbb2.ods is that I replaced the content of cell A1!


Reproducible: Always

User Profile Reset: No, I don't believe it's relevant.

Additional Info:
https://wp.libpf.com/?p=1158
http://verapdf.org/software/


User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0
Comment 1 paolog 2017-10-25 15:00:31 UTC
Created attachment 137287 [details]
LibreOffice test files and the exported (would be) PDF/A-1a files
Comment 2 Julien Nabet 2017-10-25 15:28:31 UTC
5.2 is EOL.
Could you try a recent LO version? The last stable one is 5.3.6, and there's also the brand-new 5.4.2.
Comment 3 paolog 2017-10-28 09:18:18 UTC
Hi, I have just tested on Windows with LO 5.4.2.2, and I also updated Verapdf to version 1.8.2

It turns out the Bbb2 and Ccc test cases pass, but Aaa yields the same error. 

I attach the new PDF file and the full XML output from verapdf:

/opt/verapdf/verapdf -f 1a Aaa_5.4.2.2.pdf
Comment 4 paolog 2017-10-28 09:19:39 UTC
Created attachment 137335 [details]
PDF/A-1a exported on Windows with LO 5.4.2.2
Comment 5 paolog 2017-10-28 09:20:20 UTC
Created attachment 137336 [details]
Verapdf 1a validation report for Aaa_5.4.2.2.pdf
Comment 6 Buovjaga 2017-11-09 15:24:40 UTC
Confirmed Aaa fails with http://demo.verapdf.org/

Arch Linux 64-bit, KDE Plasma 5
Version: 6.0.0.0.alpha1+
Build ID: 4b5751dd0b08d5fe55f89513ea1062f059c493c7
CPU threads: 8; OS: Linux 4.13; UI render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on November 8th 2017
Comment 7 Julien Nabet 2017-11-16 20:24:45 UTC
On pc Debian x86-64 with master sources updated today, I can't reproduce this.
I did this:
- retrieved the zip file and extracted it
- opened Aaa.ods
- chose File > Export as PDF...
- checked PDF/A-1a
- used http://demo.verapdf.org/ to check the file
It passed.
Did I miss something?
Comment 8 paolog 2017-11-16 20:44:06 UTC
Which Debian release?

Currently stretch has libreoffice 5.2.7, stretch-backports and buster (testing) have 5.4.2, while sid (unstable) has 5.4.3.

Also, on demo.verapdf.org, did you select the PDF/A-1a profile?
Comment 9 Julien Nabet 2017-11-16 20:59:09 UTC
I had indeed missed the veraPDF setting; now I can reproduce the problem.
Sorry for the confusion.
Comment 10 Julien Nabet 2017-11-23 15:10:20 UTC
Cor: just for information, I had submitted a patch which made the validation ok but since I didn't know the impact in general, I abandoned it.
https://gerrit.libreoffice.org/#/c/44920/
Perhaps it could help, at least it can be useful as code pointer.

If you know an LO dev expert in pdf part, don't hesitate.
Comment 11 Cor Nouws 2017-11-23 17:08:15 UTC
Thanks Julien.

(In reply to Julien Nabet from comment #10)

> If you know an LO dev expert in pdf part, don't hesitate.

jenkins says OK, and you see it works, so why not push?
Our users & QA will discover if it breaks anything.
Comment 12 Julien Nabet 2017-11-23 17:56:20 UTC
(In reply to Cor Nouws from comment #11)
> jenkins says OK, and you see it works, so why not push?
> Our users & QA will discover if it breaks anything.

Because of Miklos' comments on the gerrit patch.
Jenkins says it's ok for the build and the associated tests, but I don't think there are hundreds of PDFs tested.
This patch makes this case ok but, since neither Miklos nor I know PDF well enough, it could trigger lots of regressions.
I wouldn't like to create a lot of extra work for QA; there's already enough, given the number of existing cases in this bugtracker.

The next step could be to propose the patch on the dev mailing list, but since I guessed at this patch more than I really understood how the whole thing worked, I suppose I'd be bashed :-)
Comment 13 QA Administrators 2018-11-24 03:44:23 UTC Comment hidden (obsolete)
Comment 14 Xavier Van Wijmeersch 2018-12-09 21:30:43 UTC
Downloaded the latest version of verapdf, 1.13.24.

The only one that does not pass the test is Aaa.pdf.
The other two pass the test with no error messages.

Version: 6.3.0.0.alpha0+
Build ID: d6e6745683c80b66349eb82581a862fcc5961575
CPU threads: 8; OS: Linux 4.19; UI render: default; VCL: kde4; 
Locale: nl-BE (en_US.UTF-8); UI-Language: en-US
Calc: threaded

Aaaa.pdf generated with master build

/opt/verapdf/verapdf -f 1a /root/Downloads/test_files/Aaaa.pdf
<?xml version="1.0" encoding="utf-8"?>
<report>
  <buildInformation>
    <releaseDetails id="core" version="1.13.7" buildDate="2018-08-22T12:03:00+02:00"></releaseDetails>
    <releaseDetails id="validation-model" version="1.13.20" buildDate="2018-08-31T08:35:00+02:00"></releaseDetails>
    <releaseDetails id="gui" version="1.13.24" buildDate="2018-08-31T08:49:00+02:00"></releaseDetails>
  </buildInformation>
  <jobs>
    <job>
      <item size="24347">
        <name>/root/Downloads/test_files/Aaaa.pdf</name>
      </item>
      <validationReport profileName="PDF/A-1A validation profile" statement="PDF file is not compliant with Validation Profile requirements." isCompliant="false">
        <details passedRules="106" failedRules="1" passedChecks="611" failedChecks="1">
          <rule specification="ISO 19005-1:2005" clause="6.8.3" testNumber="1" status="failed" passedChecks="0" failedChecks="1">
            <description>The logical structure of the conforming file shall be described by a structure hierarchy rooted in the StructTreeRoot entry of the document catalog dictionary, as described in PDF Reference 9.6</description>
            <object>PDDocument</object>
            <test>StructTreeRoot_size == 1</test>
            <check status="failed">
              <context>root/document[0]</context>
            </check>
          </rule>
        </details>
      </validationReport>
      <duration start="1544494065287" finish="1544494067694">00:00:02.407</duration>
    </job>
  </jobs>
  <batchSummary totalJobs="1" failedToParse="0" encrypted="0">
    <validationReports compliant="0" nonCompliant="1" failedJobs="0">1</validationReports>
    <featureReports failedJobs="0">0</featureReports>
    <repairReports failedJobs="0">0</repairReports>
    <duration start="1544494065050" finish="1544494067808">00:00:02.758</duration>
  </batchSummary>
</report>
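
Reports like the one above can be summarized mechanically instead of read by eye. The following is a minimal Python sketch, not part of veraPDF: the function name summarize_report is made up for illustration, the sample is a trimmed copy of the report above, and the code assumes the element and attribute names seen in the reports quoted in this bug (job, item/name, validationReport, rule, isCompliant, status, clause, test).

```python
import xml.etree.ElementTree as ET

# Trimmed sample in the shape of the veraPDF report quoted above.
SAMPLE_REPORT = """<?xml version="1.0" encoding="utf-8"?>
<report>
  <jobs>
    <job>
      <item size="24347">
        <name>/root/Downloads/test_files/Aaaa.pdf</name>
      </item>
      <validationReport profileName="PDF/A-1A validation profile" isCompliant="false">
        <details passedRules="106" failedRules="1" passedChecks="611" failedChecks="1">
          <rule specification="ISO 19005-1:2005" clause="6.8.3" testNumber="1" status="failed" passedChecks="0" failedChecks="1">
            <test>StructTreeRoot_size == 1</test>
          </rule>
        </details>
      </validationReport>
    </job>
  </jobs>
</report>
"""


def summarize_report(xml_text):
    """Return one (file name, is_compliant, failed_rules) tuple per job.

    failed_rules is a list of (clause, test expression) pairs.
    """
    # Parse from bytes so the XML encoding declaration is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    results = []
    for job in root.iter("job"):
        name = job.findtext("item/name")
        report = job.find("validationReport")
        compliant = report is not None and report.get("isCompliant") == "true"
        failed = []
        if report is not None:
            for rule in report.iter("rule"):
                if rule.get("status") == "failed":
                    failed.append((rule.get("clause"), rule.findtext("test")))
        results.append((name, compliant, failed))
    return results
```

For the sample, summarize_report(SAMPLE_REPORT) yields a single non-compliant job whose only failed rule is clause 6.8.3 (StructTreeRoot_size == 1), matching the report above.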
Comment 15 Commit Notification 2019-03-16 22:42:19 UTC
Thorsten Behrens committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/6907cbd0f8e198a0f1810b1a07f552a47c9da660%5E%21

tdf#113448 fix type1 font subsetter

It will be available in 6.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 16 Commit Notification 2019-03-17 19:21:58 UTC
Thorsten Behrens committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/4859d0b6cee9477ab65e86923e7c0a0b88022d8e%5E%21

tdf#113448 fix PDF forms export

It will be available in 6.3.0.

Comment 17 Xisco Faulí 2019-04-17 15:06:03 UTC
A polite ping to Thorsten Behrens:
Is this bug fixed? If so, could you please close it as RESOLVED FIXED? Otherwise, could you please explain what's missing?
Thanks
Comment 18 Commit Notification 2019-07-02 17:18:27 UTC
Thorsten Behrens committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/6ec26ba3aa195eac62fb8803137070d23a69491c%5E%21

tdf#113448 don't export any font for radio buttons

It will be available in 6.4.0.

Comment 19 Commit Notification 2019-07-02 17:21:06 UTC
Thorsten Behrens committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/35f71c648c45769d4cc75f8b422bcdb020916a73%5E%21

tdf#113448 Export font used for checkbox mark

It will be available in 6.4.0.

Comment 20 Thorsten Behrens (CIB) 2019-07-02 17:22:56 UTC
With the above two commits, all validation errors _I_ am aware of (via verapdf) seem solved, so setting this to fixed now.
Comment 21 Commit Notification 2019-07-03 11:47:31 UTC
Thorsten Behrens committed a patch related to this issue.
It has been pushed to "libreoffice-6-3":

https://git.libreoffice.org/core/+/c0c4152c6077839d0e6c65dac64fde894d6aafbe%5E%21

tdf#113448 Export font used for checkbox mark

It will be available in 6.3.0.1.

Comment 22 Commit Notification 2019-07-03 12:50:01 UTC
Thorsten Behrens committed a patch related to this issue.
It has been pushed to "libreoffice-6-3":

https://git.libreoffice.org/core/+/76b5dca9dc0ff60f8f62cbecdee68f8f3b287ceb%5E%21

tdf#113448 don't export any font for radio buttons

It will be available in 6.3.0.1.
