Bug 135192 - Accessibility of PDF export: "Export as > Tagged PDF" does not export correct tags for tables
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: 6.4.5.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Michael Stahl (allotropia)
URL:
Whiteboard: target:7.5.0 target:7.4.4
Keywords: accessibility
Duplicates: 143436
Depends on:
Blocks: PDF-Export PDF-Accessibility
Reported: 2020-07-27 12:01 UTC by zainab.ali
Modified: 2023-05-24 15:24 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
The LibreOffice file containing a table (13.41 KB, application/vnd.oasis.opendocument.presentation)
2020-07-27 12:03 UTC, zainab.ali
The tagged PDF export (12.74 KB, application/pdf)
2020-07-27 12:04 UTC, zainab.ali
The example file exported to PDF in PAC tool (98.58 KB, image/png)
2023-01-18 17:09 UTC, Gabor Kelemen (allotropia)

Description zainab.ali 2020-07-27 12:01:49 UTC
Description:
The Tagged PDF export for LibreOffice Impress does not contain marked up tables.

Steps to Reproduce:
1. Open the attached presentation. There is a table on the first slide.
2. Export as a Tagged PDF using File > Export as ... > Tagged PDF.
3. Open the tagged PDF in a PDF explorer such as Adobe Acrobat. The tags for the table are marked up as 'P' (paragraph) tags.


Actual Results:
The table contents are marked up as 'P' (paragraph) tags.


Expected Results:
The table is marked up with a 'Table' tag.



Reproducible: Always


User Profile Reset: Yes



Additional Info:
This is an accessibility concern. Users with screen readers cannot identify tables in the tagged PDF export.
Comment 1 zainab.ali 2020-07-27 12:03:18 UTC
Created attachment 163618
The LibreOffice file containing a table
Comment 2 zainab.ali 2020-07-27 12:04:52 UTC
Created attachment 163619
The tagged PDF export
Comment 3 V Stuart Foote 2020-07-27 18:26:56 UTC
For bug 45636, the project has implemented support for PDF/UA (ISO 14289), available in the 7.0.0rc2 release and in current master/7.1.0 daily builds.

Please retest.
Comment 4 Timur 2020-07-27 19:49:55 UTC
Please explain whether these tags can be viewed in an open source or free tool.
Comment 5 zainab.ali 2020-07-29 08:53:36 UTC
Thank you for following up.

They can be seen using the Apache PDFBox Debugger:
https://pdfbox.apache.org/1.8/commandline.html#pdfdebugger

This is a Java library and must be set up as part of a Java/JVM project. Setup instructions can be found on the following page:
https://pdfbox.apache.org/1.8/dependencies.html

Once opened, you can examine the PDF structure tree to see the accessibility metadata.
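
For a quick look without a Java setup, the structure types can also be tallied with a short script. The following is a rough sketch only, using the Python standard library. It assumes the structure element dictionaries are stored uncompressed in the file (elements packed into compressed object streams would be missed), and `count_structure_types` and the sample bytes are hypothetical, made up for illustration. It is not part of PDFBox or LibreOffice.

```python
import re
from collections import Counter

# Body of each indirect object: "12 0 obj ... endobj".
OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj(.*?)endobj", re.DOTALL)
# Marker of a structure element dictionary.
STRUCT_ELEM_RE = re.compile(rb"/Type\s*/StructElem\b")
# Structure type name, e.g. "/S /Table" or "/S/P".
STRUCT_TYPE_RE = re.compile(rb"/S\s*/([A-Za-z0-9]+)")


def count_structure_types(pdf_bytes: bytes) -> Counter:
    """Tally the /S entries of all uncompressed /StructElem objects."""
    counts = Counter()
    for match in OBJ_RE.finditer(pdf_bytes):
        body = match.group(1)
        if STRUCT_ELEM_RE.search(body):
            type_match = STRUCT_TYPE_RE.search(body)
            if type_match:
                counts[type_match.group(1).decode("ascii")] += 1
    return counts


# Hand-written sample data (assumption): a document with two paragraphs
# and no table, similar in spirit to the export described in this report.
sample = (
    b"1 0 obj\n<</Type/StructTreeRoot/K 2 0 R>>\nendobj\n"
    b"2 0 obj\n<</Type/StructElem/S/Document/P 1 0 R/K[3 0 R 4 0 R]>>\nendobj\n"
    b"3 0 obj\n<</Type/StructElem/S/P/P 2 0 R/K 0>>\nendobj\n"
    b"4 0 obj\n<</Type/StructElem/S/P/P 2 0 R/K 1>>\nendobj\n"
)
print(count_structure_types(sample))
```

For a real file, the bytes could be read with `open(path, "rb").read()`. A count of zero for `Table` alongside a non-zero count for `P` would match the behaviour reported here, subject to the assumption above.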
Comment 6 V Stuart Foote 2020-07-29 12:28:47 UTC
@Christophe S. - could you comment on both the Tagged PDF and the PDF/UA handling? Specifically, what, if anything, is missing from the PDF/UA filter exports?
Comment 7 Xisco Faulí 2022-05-02 12:15:41 UTC Comment hidden (obsolete)
Comment 8 Simon Gaeremynck 2022-05-13 15:53:14 UTC
I can confirm this is still not fixed. The table is still exported as paragraph tags.

Tested using the following build:
Version: 7.4.0.0.alpha0+ / LibreOffice Community
Build ID: e5fb120a32d04e241b35a7e63894c744196f576b
CPU threads: 10; OS: Mac OS X 12.0.1; UI render: Skia/Metal; VCL: osx
Locale: en-GB (en_GB.UTF-8); UI: en-US
Calc: threaded
Comment 9 Christophe Strobbe 2022-05-14 18:57:06 UTC
I'm adding some details about the tagging that is required for accessibility. The tagging is much like in HTML:
- a Table tag for the table itself;
- a TR tag for each of the table rows;
- a TD tag for each table data cell;
- a TH tag for each table header cell.
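
The nesting described in the list above can be sketched as a small tree. This is an illustration only: `StructElem`, `table_tags` and `outline` are hypothetical names, not LibreOffice or PDF library types, and the header handling is a simple flag rather than anything the exporter actually does.

```python
from dataclasses import dataclass, field


@dataclass
class StructElem:
    """Hypothetical stand-in for a PDF structure element."""
    tag: str
    children: list["StructElem"] = field(default_factory=list)


def table_tags(n_rows: int, n_cols: int, header_row: bool = False) -> StructElem:
    """Build the expected tag tree: Table > TR > TD (or TH in a header row)."""
    table = StructElem("Table")
    for r in range(n_rows):
        cell_tag = "TH" if header_row and r == 0 else "TD"
        row = StructElem("TR", [StructElem(cell_tag) for _ in range(n_cols)])
        table.children.append(row)
    return table


def outline(elem: StructElem, depth: int = 0) -> list[str]:
    """Return one indented line per element, depth first."""
    lines = ["  " * depth + elem.tag]
    for child in elem.children:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(table_tags(2, 3, header_row=True))))
```

With `header_row=False` (the default), every cell is tagged TD.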

Note that Impress does not have a mechanism for users to verify whether the first row is marked up as a header row or that the first column is marked up as a column of row headers.
In Writer, authors can select a row (typically the top row) and activate the checkbox "Repeat heading" in the Table Properties dialog. This results in the addition of a <table:table-header-rows> element that wraps the <table:table-row> elements for the number of rows that the user wants to be repeated.
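
As a rough illustration of that markup, the sketch below counts header rows and body rows in a table fragment using the Python standard library. The fragment is hand-written and simplified (an assumption, not taken from a real Writer file), and `count_rows` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# ODF table namespace.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
NS = {"table": TABLE_NS}

# Simplified, hand-written fragment: one repeated heading row, two body rows.
FRAGMENT = f"""
<table:table xmlns:table="{TABLE_NS}" table:name="Table1">
  <table:table-column table:number-columns-repeated="2"/>
  <table:table-header-rows>
    <table:table-row><table:table-cell/><table:table-cell/></table:table-row>
  </table:table-header-rows>
  <table:table-row><table:table-cell/><table:table-cell/></table:table-row>
  <table:table-row><table:table-cell/><table:table-cell/></table:table-row>
</table:table>
"""


def count_rows(table_xml: str) -> tuple[int, int]:
    """Return (header_rows, body_rows) for a single table:table element."""
    table = ET.fromstring(table_xml.strip())
    # Rows wrapped in table:table-header-rows are the repeated heading rows.
    header = table.findall("table:table-header-rows/table:table-row", NS)
    # Direct table:table-row children are ordinary body rows.
    body = table.findall("table:table-row", NS)
    return len(header), len(body)


print(count_rows(FRAGMENT))
```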

However, the Table Properties dialog in Impress does not have a Text Flow tab where a header row can be defined. Due to this, we should probably export all table cells as TD cells (as opposed to exporting the first row as a row of TH cells). We can't assume that every table has a header row (in spite of ISO 14289 requiring headers in every PDF/UA-conforming table).
Comment 10 Timur 2022-06-06 13:11:56 UTC
*** Bug 143436 has been marked as a duplicate of this bug. ***
Comment 11 Commit Notification 2022-12-01 16:02:38 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/56ff8262d8ace8fd99326e290597cb901654ea11

tdf#135192 svx: PDF/UA export: implement tags for SdrTableObj

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Commit Notification 2022-12-01 16:02:48 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/0bc96b8805f2cfa2278729a9f3e56a350ddd69ad

tdf#135192 drawinglayer,svx: PDF/UA export: also tag TH for SdrTableObj

It will be available in 7.5.0.

Comment 13 Commit Notification 2022-12-01 16:02:57 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/4bfa3edaeea444d46f9470d415667fb8df54c32d

tdf#135192 svx: PDF/UA export: table tag primitives only if necessary

It will be available in 7.5.0.

Comment 14 Commit Notification 2022-12-01 16:06:08 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/81ef84648515965bf67afaced946227d0f63a71e

(related: tdf#135192) svx: PDF/UA export: tag background as Artifact

It will be available in 7.5.0.

Comment 15 Michael Stahl (allotropia) 2022-12-01 16:10:55 UTC
fixed on master

for the table headers, we guess that the first row should be a heading if the table template/style is applied with table:use-first-row-styles="true".
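
The heuristic can be illustrated with a short sketch. This is not the actual C++ change in svx/drawinglayer. It is a hypothetical Python rendering of the rule as described in this comment, with a hand-written ODF fragment and an assumed helper name `cell_tags`.

```python
import xml.etree.ElementTree as ET

# ODF table namespace and the qualified names used below.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
USE_FIRST_ROW = f"{{{TABLE_NS}}}use-first-row-styles"
ROW = f"{{{TABLE_NS}}}table-row"
CELL = f"{{{TABLE_NS}}}table-cell"


def cell_tags(table_xml: str) -> list[list[str]]:
    """Guess TH/TD per cell: the first row is a heading only if the
    table has table:use-first-row-styles="true"."""
    table = ET.fromstring(table_xml.strip())
    first_row_is_header = table.get(USE_FIRST_ROW) == "true"
    tags = []
    for index, row in enumerate(table.findall(ROW)):
        tag = "TH" if first_row_is_header and index == 0 else "TD"
        tags.append([tag for _ in row.findall(CELL)])
    return tags


# Hand-written sample (assumption): a 2x2 table with first-row styles enabled.
SAMPLE = f"""
<table:table xmlns:table="{TABLE_NS}"
             table:template-name="default"
             table:use-first-row-styles="true">
  <table:table-row><table:table-cell/><table:table-cell/></table:table-row>
  <table:table-row><table:table-cell/><table:table-cell/></table:table-row>
</table:table>
"""
print(cell_tags(SAMPLE))
```

Without the attribute, or with it set to "false", every cell in this sketch is tagged TD.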
Comment 16 Commit Notification 2022-12-05 08:51:10 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/b9a86028d516e862c744de8ed693a43b1296780c

tdf#135192 svx: PDF/UA export: implement tags for SdrTableObj

It will be available in 7.4.4.

Comment 17 Commit Notification 2022-12-06 16:49:30 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/72bba53dea57b9b2bdb7aa756a27cb311684a107

tdf#135192 drawinglayer,svx: PDF/UA export: also tag TH for SdrTableObj

It will be available in 7.4.4.

Comment 18 Commit Notification 2022-12-06 16:49:33 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/a22a4ac931b953ad776831858d639acc27f846f7

tdf#135192 svx: PDF/UA export: table tag primitives only if necessary

It will be available in 7.4.4.

Comment 19 Commit Notification 2022-12-07 19:50:12 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/7b506aaafa7a982d19ec1cba2909f3ccfe29b130

(related: tdf#135192) svx: PDF/UA export: tag background as Artifact

It will be available in 7.4.4.

Comment 20 Gabor Kelemen (allotropia) 2023-01-18 17:09:00 UTC
Created attachment 184760
The example file exported to PDF in PAC tool

Verified in
Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: f1830bff71847a9c17715cff52383956719847fe
CPU threads: 14; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: threaded

No more '"P" structure element used as root element' warning.

There is another new warning now:

"Table" structure element used as root element