Bug 34135 - Impress PDF export loses image Title and Description
Summary: Impress PDF export loses image Title and Description
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: 3.5.0 RC1 (earliest affected)
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: accessibility, filter:pdf
Depends on:
Blocks: a11y, Accessibility PDF-Export
Reported: 2011-02-10 06:54 UTC by Christophe Strobbe
Modified: 2021-08-11 09:01 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
Sample Impress ODP - rubber duck (81.57 KB, application/vnd.oasis.opendocument.presentation)
2016-09-13 16:24 UTC, V Stuart Foote
sample export to Tagged PDF -- tags lost for image (80.05 KB, application/pdf)
2016-09-13 16:27 UTC, V Stuart Foote

Description Christophe Strobbe 2011-02-10 06:54:43 UTC
Images in Impress lose their text alternatives when a presentation is exported as PDF. Images inserted in presentations can have a Title and Description ("text alternatives"). This is an accessibility requirement for users of screen readers (blind users) and for exporting to non-visual formats such as audio books (e.g. in DAISY format) and Braille.

To reproduce the issue, follow these steps:
1. Create a presentation in Impress; add a picture to a slide.
2. Right-click on the picture and select Description from the context menu.
3. Add a short description in the Title field and a longer description in the Description field.
4. Save the presentation; then export it as PDF. In the PDF Options dialog, check at least "Tagged PDF" (the PDF is mostly inaccessible without those tags).
5. Open the resulting PDF file in a program that allows you to check the accessibility of the PDF, e.g. Adobe Acrobat Pro (Adobe Reader can only check whether a PDF is tagged or not; it can't perform a full accessibility evaluation), the free PDF Accessibility Checker (PAC) by access-for-all.ch, or the online PDF checker at http://accessibility.egovmon.no/en/pdfcheck/. The accessibility report will say that an image is missing a text alternative.

(LibreOffice Writer does not lose text alternatives on images when exporting to PDF.)
Comment 1 Christophe Strobbe 2011-07-26 03:32:59 UTC
Additional information from "Open Document Format for Office Applications (OpenDocument) v1.1", the OASIS standard.

From section 9.2.20: Title and Description:
"The <svg:title> and <svg:desc> elements specify text-only description strings for graphical objects as specified in §5.4 of [SVG].
The <svg:title> element is used as a short accessible name.
(...)
The <svg:desc> element is used for the long description in support of accessibility.
(...)
See appendix E for guidelines how to use these elements."

So <svg:title> needs to be exported as text alternative for the image in PDF.
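
To make the quoted elements concrete, here is a minimal sketch of how the <svg:title> and <svg:desc> children of an image frame could be read from ODF markup. The sample XML is hand-written for illustration (it is not taken from a real content.xml), and the function name and file names are hypothetical; only the namespace URIs come from the OpenDocument specification.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as defined by OpenDocument.
NS = {
    "draw": "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0",
    "svg": "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0",
}

# Hand-written, simplified fragment (an assumption, not a real export).
SAMPLE = """<draw:page xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
           xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <draw:frame draw:name="Image 1">
    <draw:image xlink:href="Pictures/duck.png"/>
    <svg:title>Rubber duck</svg:title>
    <svg:desc>A yellow rubber duck floating in a bathtub.</svg:desc>
  </draw:frame>
  <draw:frame draw:name="Image 2">
    <draw:image xlink:href="Pictures/logo.png"/>
  </draw:frame>
</draw:page>"""


def image_text_alternatives(xml_text):
    """Return (frame name, title, description) for every frame holding an image.

    Title and description are None when the element is absent.
    """
    root = ET.fromstring(xml_text)
    results = []
    for frame in root.iter("{%s}frame" % NS["draw"]):
        # Skip frames that do not contain an image (e.g. text boxes).
        if frame.find("draw:image", NS) is None:
            continue
        name = frame.get("{%s}name" % NS["draw"])
        title = frame.findtext("svg:title", default=None, namespaces=NS)
        desc = frame.findtext("svg:desc", default=None, namespaces=NS)
        results.append((name, title, desc))
    return results


for name, title, desc in image_text_alternatives(SAMPLE):
    print(name, "| title:", title, "| desc:", desc)
```

An image frame for which both values come back as None has no text alternative to export, which is the case an accessibility checker would flag unless the image is purely decorative.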
Comment 2 Björn Michaelsen 2011-12-23 11:52:37 UTC Comment hidden (obsolete)
Comment 3 Christophe Strobbe 2012-01-27 03:40:30 UTC
I confirm that this bug is still relevant to LibreOffice 3.5.0 RC1. Regardless of whether the PDF export option "Tagged PDF" is checked or unchecked, Impress does not export the text alternatives on images. This was tested with images that have only a title, images that have only a description and images that have both a title and a description. Changing the status from NEEDINFO to NEW.
Comment 4 Christophe Strobbe 2013-08-06 20:48:28 UTC
This issue has also been reported in the Apache OpenOffice Bugzilla: https://issues.apache.org/ooo/show_bug.cgi?id=122965
Comment 5 Christophe Strobbe 2013-08-08 09:32:28 UTC
Some background info from OASIS Open Document Format Version 1.2 Part 1:

10.3.17 <svg:title>
The <svg:title> element specifies a name for a graphic object. (...)

10.3.18 <svg:desc>
The <svg:desc> element specifies a prose description of a graphic object that may be used to
support accessibility. See appendix D. (...)

From Appendix D, section D.1
"When transforming from another document format to OpenDocument the short names, like HTML's alt text on the <img> elements shall be mapped to the <svg:title> element."

So <svg:title> is the text alternative that should always be present if the image is not purely decorative. (See also section D.1.1 in Appendix D: "Authors should not assign names to objects having no semantic value.")
Comment 6 Christophe Strobbe 2013-08-13 15:10:49 UTC
I rechecked this issue in LibreOffice 4.1.0.4 on Windows 7 Professional (32 bits) and it is still present.
Comment 7 Manuel Razzari 2014-04-20 15:53:10 UTC
Given the historical importance of alt text for screen reader users [1], I'd say this is currently the #1 accessibility issue in PDF docs exported from Impress. 

I've verified an Impress PDF using the NVDA and JAWS screen readers, and using the "PDF Accessibility Checker" tool mentioned earlier in this ticket, and the problem is hugely noticeable.

* One problem is the missing text alternative, which means that many "meaningful" images in a presentation can't be properly conveyed to screen readers. [2]

* A second problem is that decorative images, such as headers / footers / logos / background can't be marked as "purely decorative". So the screen reader will read a lot of "image image image", which it should simply be able to ignore. [3]
(I reckon this may be worthy of a separate bug.)

Here's the help page [4] that confirms that the existing UI for "object title entry" is meant specifically for this.

[1] See "missing alt text" in http://webaim.org/projects/screenreadersurvey4/#problems
[2] http://www.w3.org/TR/2014/NOTE-WCAG20-TECHS-20140311/PDF1
[3] http://www.w3.org/TR/2014/NOTE-WCAG20-TECHS-20140311/PDF4
[4] https://help.libreoffice.org/Simpress/cui/ui/objecttitledescdialog/object_title_entry
Comment 8 V Stuart Foote 2014-04-20 16:38:44 UTC
Manuel, *,
(In reply to comment #7)
> Given the historical importance of alt text for screen reader users [1], I'd
> say this is currently the #1 accessibility issue in PDF docs exported from
> Impress.

As a practical matter, the PDF export filter(s) require substantial rework across all LO (and AOO) components for handling tagged PDF. Since the filter has to be completely refactored, the more appropriate scope of work is probably the implementation of a PDF/UA (ISO 14289-1:2012) compliant filter (tracked as bug 45636).
Comment 9 Manuel Razzari 2014-04-20 23:46:32 UTC
Stuart, 

I assume a "complete refactoring" will take a substantial amount of time. 
And what is the status of bug 45636? Doesn't look like there's much traction on that front...

What I'm trying to say is that fixing these 2 alt text issues would represent a non-trivial, perceivable benefit for disabled users who try to read PDFs exported from Impress; and for content creators who want/need to take them into account. 

That's why I insist on approaching this as a "minor bug fix" rather than a "let's fix a11y for good".
Comment 10 Yousuf Philips (jay) (retired) 2016-09-13 11:48:17 UTC
Works fine for me, and I see the alt text in Acrobat Reader.

Version: 5.3.0.0.alpha0+
Build ID: 78404fe5549fded2eaf0c5ea6e1ca66039e995af
CPU Threads: 2; OS Version: Linux 3.19; UI Render: default; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2016-09-11_09:14:01
Locale: en-US (en_US.UTF-8); Calc: group
Comment 11 V Stuart Foote 2016-09-13 16:24:13 UTC
Created attachment 127302 [details]
Sample Impress ODP - rubber duck

@Jay,

Sorry but this is not resolved.

On Windows 10 Pro 64-bit en-US with
Version: 5.2.1.2 (x64)
Build ID: 31dd62db80d4e60af04904455ec9c9219178d620
CPU Threads: 8; OS Version: Windows 6.19; UI Render: GL; 
Locale: en-US (en_US); Calc: group

attaching a sample 1 slide ODP and its LibreOffice conversion to PDF. A check of PDF in Acrobat Pro reports image container type as "figure", and no Tags. No Title, no Alternate text. It also fails with no language assigned (bug 67866).
Comment 12 V Stuart Foote 2016-09-13 16:27:28 UTC
Created attachment 127306 [details]
sample export to Tagged PDF -- tags lost for image
Comment 13 Yousuf Philips (jay) (retired) 2016-09-14 01:57:55 UTC
@Stuart: Yes, it seems I made a mistake, as I tested it in Writer rather than Impress. I also just noticed that in Writer the alt text only appears in the PDF if I enable "Tagged PDF". So it is likely that the "Tagged PDF" option isn't working correctly in Impress.
Comment 14 paulystefan 2019-04-01 17:25:53 UTC
Also present in 6.1.5.2 and 6.2.1.1.

The image's description and tags are not available in the PDF.
Comment 15 Katarina Behrens (Inactive) 2019-05-06 12:37:16 UTC
Peeps please try again w/ daily (~6.3). Armin and I did some work here, the core of it is in https://cgit.freedesktop.org/libreoffice/core/commit/?id=2840352ba56a212d191cc16e08378c87672d7b73 + multiple minor fixes 

This ticket is a bit tl;dr but if this is really only about exporting img title and alt text to tagged PDF, then I can say this works
Comment 16 paulystefan 2019-08-23 22:39:40 UTC
I see no change in LO 6.3.0.4.
Comment 17 Timur 2019-12-04 16:19:16 UTC
Page http://accessibility.egovmon.no/en/pdfcheck/ is not available.
I used PAC 3 from https://commonlook.com/accessibility-software/pdf-validator/. 
I also found https://www.access-for-all.ch/en/pdf-lab/pdf-accessibility-checker-pac.html but didn't get it.

Using Screen reader Preview in PAC3, in "Alt" I don't see image tags in older LO 5.2 and see them in master 6.5+.
So I hope this is resolved, and I am marking it WorksForMe.

If someone finds this wrong, please explain and feel free to set New again.
Comment 18 Christophe Strobbe 2021-08-10 15:05:15 UTC
Since I reported the bug, I thought I should recheck it.
I created some slides in LibreOffice Impress 7.1.4.2 on Windows 10 and exported them as PDF using two sets of options: once with just "Tagged PDF" (but no PDF/UA) and once with both "Tagged PDF" and "PDF/UA" enabled. This did not result in a different treatment of text alternatives for images. I found the following:

- When an image has only a Title but no Description in Impress, the Title gets exported as a text alternative. (The length of the Title does not seem to make a difference, i.e. also short Titles get exported as text alternatives.)
- When an image has only a Description, the Description gets exported as a text alternative.
- When an image has both Title and Description, the Title and the Description get combined into a single text alternative (i.e. concatenated with ' - ' between them). I don't know what to advise in this scenario: either mapping Impress's Title to PDF's title (in Adobe Acrobat) or dropping Impress's Title altogether if there is a Description.
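
The three observations above can be summarised as a small sketch. This models only the behaviour I observed in the exported PDFs; it is not LibreOffice's actual implementation, and the function name is hypothetical.

```python
def observed_alt_text(title, description):
    """Model of the text alternative observed in PDFs exported from Impress 7.1.4.2.

    Assumption: empty or missing fields are simply skipped, and when both
    fields are present they are joined with ' - '.
    """
    parts = [part for part in (title, description) if part]
    return " - ".join(parts) if parts else None


print(observed_alt_text("Rubber duck", None))
print(observed_alt_text(None, "A yellow rubber duck in a bathtub."))
print(observed_alt_text("Rubber duck", "A yellow rubber duck in a bathtub."))
```

The two alternatives mentioned in the last bullet would replace the join with either a separate title/alt pair or with the Description alone.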

---

In response to Manuel Razzari's comment: Neither ODF 1.2 nor ODF 1.3 (released in April 2021) provides a feature that allows you to explicitly mark an image as decorative. See my comment on bug 143311: https://bugs.documentfoundation.org/show_bug.cgi?id=143311#c7
Comment 19 Timur 2021-08-11 09:01:04 UTC
Christophe, thanks for the recheck.
I understand that you agree with the WFM bug status. If you think something should be done, feel free to explain and set New.