Bug 155228 - Syntax error in PDF for Tab key of page object
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Michael Stahl (allotropia)
URL:
Whiteboard: target:7.6.0 target:7.5.4
Keywords: regression
Duplicates: 151826
Depends on:
Blocks: PDF-Export
Reported: 2023-05-10 08:42 UTC by peter.wyatt
Modified: 2024-08-22 04:50 UTC (History)
CC List: 4 users

See Also:
Crash report or crash signature:


Description peter.wyatt 2023-05-10 08:42:43 UTC
LibreOffice has a syntax error in the PDFs it generates. The /Tabs key of a PDF page object should be a PDF name, not a PDF string. Refer to the PDF specification: ISO 32000-2, "Table 31 - Entries in a page object".

The problem is caused by the code in core/vcl/source/gdi/pdfwriter_impl.cxx at line 753:
   aLine.append( "/Tabs(S)\n" );
should be:
   aLine.append( "/Tabs/S\n" );

https://github.com/LibreOffice/core/blob/c84b37c0bbab3b386b22b87be52f965839b44a49/vcl/source/gdi/pdfwriter_impl.cxx
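
To make the name-versus-string distinction concrete, here is a minimal standalone C++ sketch. It is not the LibreOffice code: `classifyValue` and `makeTabsEntry` are hypothetical helpers invented for illustration, and `std::string` stands in for whatever buffer type `pdfwriter_impl.cxx` actually uses. The only facts taken from the report are that a value starting with "(" is a PDF string, a value starting with "/" is a PDF name, and that /Tabs needs the latter.

```cpp
#include <cassert>
#include <string>

// Kinds of serialized PDF value distinguished in this sketch.
enum class PdfValueKind { Name, LiteralString, Other };

// Classify a serialized PDF value by its first character:
// '/' introduces a name object, '(' introduces a literal string.
PdfValueKind classifyValue(const std::string& value)
{
    if (value.empty())
        return PdfValueKind::Other;
    if (value.front() == '/')
        return PdfValueKind::Name;
    if (value.front() == '(')
        return PdfValueKind::LiteralString;
    return PdfValueKind::Other;
}

// Hypothetical helper: build the /Tabs entry of a page dictionary,
// writing the value as a name. Only the three tab orders quoted in
// this report (R, C, S) are accepted here; that limit is an
// assumption of the sketch.
std::string makeTabsEntry(char order)
{
    assert(order == 'R' || order == 'C' || order == 'S');
    std::string entry = "/Tabs/";
    entry += order;
    entry += '\n';
    return entry;
}
```

With these helpers, `classifyValue("(S)")` reports a literal string (the buggy output), `classifyValue("/S")` reports a name, and `makeTabsEntry('S')` yields the corrected `/Tabs/S` line.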
Comment 1 Julien Nabet 2023-05-11 16:28:52 UTC
peter.wyatt: just out of curiosity, is the spec "ISO 32000-2" available for free and legally somewhere?
I could only find:
https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf
where "Entries in a page object" is table 30 and indicates:
"(Optional; PDF 1.5) A name specifying the tab order that shall be
used for annotations on the page. The possible values shall be R
(row order), C (column order), and S (structure order). See 12.5,
"Annotations" for details."
Unfortunately, searching for "/Tabs" to find an example turns up nothing.

However, after some research, I mostly see "/Tabs /S" (so with a space between "/Tabs" and "/S"); I don't know whether that matters or not.

Michael: noticing https://cgit.freedesktop.org/libreoffice/core/commit/?id=fa3f04bdd4f73a1b3be70dfb709c44638ef7e3d9
tdf#148934 PDF/UA export: add Contents entry to Link annotations

thought you might have some opinion here.
Comment 2 peter.wyatt 2023-05-11 22:31:42 UTC
Yes, the latest ISO 32000-2:2020 specification is available at no-cost for everyone: see https://pdfa.org/sponsored-standards/. This also includes errata and some new cryptographic extensions.

Please stop using the ISO 32000-1 spec from way back in 2008! It is very very old, out-of-date, not entirely vendor neutral, and has many known issues that have since been corrected.

Regarding needing a SPACE between the key "/Tabs" and the value "/S": technically PDF does not need this as "/" is an explicit token delimiter (see sub-clause 7.2.3 Character set). Adding one won't hurt, but it is unnecessary and will make the resultant PDF file just slightly larger (1 byte for each page). All PDF parsers support the non-whitespace syntax.
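
The delimiter point above can be sketched with a tiny name scanner in C++. This is an illustration only, not a PDF parser: `readNames` is a hypothetical function that pulls out name tokens and skips everything else, and it ignores strings, comments, and "#" escapes inside names. The white-space and delimiter sets are the ones defined by the PDF syntax.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// White-space characters in PDF syntax: NUL, HT, LF, FF, CR, SP.
bool isPdfWhitespace(char c)
{
    return c == '\0' || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

// Delimiter characters in PDF syntax.
bool isPdfDelimiter(char c)
{
    switch (c)
    {
        case '(': case ')': case '<': case '>': case '[': case ']':
        case '{': case '}': case '/': case '%':
            return true;
        default:
            return false;
    }
}

// Collect the name tokens found in `input`; anything that is not a
// name is skipped. A name starts at '/' and runs until the next
// white-space or delimiter character, so a following '/' ends it.
std::vector<std::string> readNames(const std::string& input)
{
    std::vector<std::string> names;
    std::size_t i = 0;
    while (i < input.size())
    {
        if (input[i] != '/')
        {
            ++i;
            continue;
        }
        const std::size_t start = i++; // keep the leading '/'
        while (i < input.size() && !isPdfWhitespace(input[i]) && !isPdfDelimiter(input[i]))
            ++i;
        names.push_back(input.substr(start, i - start));
    }
    return names;
}
```

Under this sketch, `readNames("/Tabs/S\n")` and `readNames("/Tabs /S\n")` both return the two names `/Tabs` and `/S`, while `readNames("/Tabs(S)\n")` returns only `/Tabs`, because the value there is a string rather than a name.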
Comment 3 peter.wyatt 2023-05-11 22:33:34 UTC
I should add that the convention in all PDF specs is not to prefix PDF name objects with the leading "/", so just search for "Tabs" (case sensitive, whole words).
Comment 4 Commit Notification 2023-05-12 11:46:21 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/0025190152a35e18c7847e91ad171df339657910

tdf#155228 vcl: PDF export: /Tabs needs PDF name, not string

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 5 Commit Notification 2023-05-15 07:52:42 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-7-5":

https://git.libreoffice.org/core/commit/d93c05579ec9759eca3203ad3608c8f55e1f12b6

tdf#155228 vcl: PDF export: /Tabs needs PDF name, not string

It will be available in 7.5.4.

Comment 6 Julien Nabet 2023-05-20 13:27:32 UTC
(In reply to peter.wyatt from comment #2)
> Yes, the latest ISO 32000-2:2020 specification is available at no-cost for
> everyone: see https://pdfa.org/sponsored-standards/. This also includes
> errata and some new cryptographic extensions.
I just keep going in circles through the links when trying to download the spec.
At one point I got a form asking for personal contact details, which I didn't want to fill in.
I would have expected a plain link that starts the download when clicked.
Comment 7 peter.wyatt 2023-05-21 00:03:00 UTC
If you read the EULA you will understand that all ISO standards are licensed by ISO under per-user conditions. This is "sponsored access" meaning that someone else is generously paying on your behalf - but that cannot/does not change ISO's licensing and copyright.
Comment 8 Julien Nabet 2023-05-21 07:15:55 UTC
(In reply to peter.wyatt from comment #7)
> If you read the EULA you will understand that all ISO standards are licensed
> by ISO under per-user conditions. This is "sponsored access" meaning that
> someone else is generously paying on your behalf - but that cannot/does not
> change ISO's licensing and copyright.

Ok, quite weird for an "open" format.
Comment 9 Gabor Kelemen (allotropia) 2023-07-04 20:30:10 UTC
*** Bug 151826 has been marked as a duplicate of this bug. ***
Comment 10 Commit Notification 2024-08-05 15:33:25 UTC Comment hidden (off-topic)
Comment 11 Michael Stahl (allotropia) 2024-08-05 15:59:56 UTC
oops, previous commit was for bug 140289