Bug 132683 - No alt text on image converted to pdf
Summary: No alt text on image converted to pdf
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
(earliest affected) release
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
Keywords: accessibility, bibisectRequest, regression
Depends on:
Reported: 2020-05-04 14:55 UTC by Rhys Young
Modified: 2020-05-21 14:31 UTC

See Also:
Crash report or crash signature:

DOCX with alt text (62.19 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-05-04 14:55 UTC, Rhys Young
pdf no alt text (58.85 KB, application/pdf)
2020-05-04 14:56 UTC, Rhys Young
DOCX exported to PDF from MSO (68.12 KB, application/pdf)
2020-05-20 10:30 UTC, Timur
DOCX exported to PDF from LO 6.0 beta 2 (52.19 KB, application/pdf)
2020-05-21 06:54 UTC, Timur

Description Rhys Young 2020-05-04 14:55:12 UTC
After converting a DOCX file to PDF, the image no longer has its alt text attached.

Steps to Reproduce:
1. Convert file to PDF using 'soffice --headless --nolockcheck --nodefault --nofirststartwizard --nologo --norestore --convert-to pdf --outdir /tmp /tmp/test.docx'
2. Open PDF using a viewer
3. Observe the pdf has no alt text attached to the image
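
For repeated testing, step 1 could be scripted. The following is a minimal sketch, assuming `soffice` is on the PATH; the helper names `build_convert_command`, `expected_pdf_path` and `convert_to_pdf` are hypothetical, and the flags are exactly those from the command above.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, outdir, binary="soffice"):
    """Return the argument list for the headless DOCX-to-PDF conversion in step 1."""
    return [
        binary,
        "--headless",
        "--nolockcheck",
        "--nodefault",
        "--nofirststartwizard",
        "--nologo",
        "--norestore",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(source),
    ]


def expected_pdf_path(source, outdir):
    """The converter names its output after the source file, with a .pdf suffix."""
    return Path(outdir) / (Path(source).stem + ".pdf")


def convert_to_pdf(source, outdir):
    """Run the conversion (requires LibreOffice) and return the expected output path."""
    subprocess.run(build_convert_command(source, outdir), check=True)
    return expected_pdf_path(source, outdir)
```

Calling `convert_to_pdf("/tmp/test.docx", "/tmp")` would reproduce step 1 and return `/tmp/test.pdf` for inspection in steps 2 and 3.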

Actual Results:
The image does not have alt text.

Expected Results:
The image should have alt text.

Reproducible: Always

User Profile Reset: No

Additional Info:
Comment 1 Rhys Young 2020-05-04 14:55:36 UTC
Created attachment 160341 [details]
DOCX with alt text
Comment 2 Rhys Young 2020-05-04 14:56:50 UTC
Created attachment 160342 [details]
pdf no alt text
Comment 3 Rhys Young 2020-05-04 14:57:13 UTC
LibreOffice seems to import the alt text from Word as Description text instead of Alternative text.
Comment 4 Timur 2020-05-20 09:10:44 UTC
There are 2 issues here: 

1. Fileopen DOCX opens Alt Text from MSO as Description in LO, not as Alternative.
Help doesn't explain Description and for "Alternative text" says "Enter the text to display in a web browser when the selected item is unavailable."

2. Export as PDF doesn't export alt text from either field.
This must be tested with a PDF reader that supports alt text.
While it is not clear how Description should be exported, Alternative doesn't work.
This is a regression: LO 5.0 didn't export "Alternative text", LO 6.0 exported it properly (typed in, not taken from the DOCX, as explained in 1.), and LO 6.1 again doesn't.
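
As a cross-check that does not depend on a particular PDF reader, the exported file could be scanned for `/Alt` entries, the key PDF uses for alternate descriptions on structure elements. This is only a heuristic sketch: the function name `find_alt_entries` is hypothetical, it handles only uncompressed objects and Flate-compressed streams, it ignores nested parentheses in strings, and a hit in raw binary data could be a false positive. An empty result therefore suggests, but does not prove, that no alt text was exported.

```python
import re
import zlib

# "/Alt" followed by a literal string "(...)" or a hex string "<...>".
# Nested parentheses inside a literal string are not handled.
ALT_PATTERN = re.compile(rb"/Alt\s*(\((?:[^()\\]|\\.)*\)|<[0-9A-Fa-f\s]*>)")
# Raw bytes between the "stream" and "endstream" keywords.
STREAM_PATTERN = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def find_alt_entries(pdf_bytes):
    """Return the raw /Alt values found in the file body and in Flate streams."""
    chunks = [pdf_bytes]
    for match in STREAM_PATTERN.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the end-of-line bytes before "endstream"
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # not Flate data (for example a JPEG image stream)
    found = []
    for chunk in chunks:
        found.extend(m.group(1) for m in ALT_PATTERN.finditer(chunk))
    return found
```

A possible use is `find_alt_entries(open("/tmp/test.pdf", "rb").read())`, compared against the same call on the PDF exported from MSO.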

Let's start with 2., so I am removing "docx" from the title.
There are other bugs with alt text interop issues for other objects and formats.
During testing with master 7.0+, the image itself was not exported on one occasion when Alternative was set in Image - Options; I can't say why.

Note: headless mode is not needed, and it is wrong to report it unless the bug happens only in headless mode, which is not the case here.
Comment 5 Timur 2020-05-20 10:30:49 UTC
Created attachment 161034 [details]
DOCX exported to PDF from MSO

In PDF from MSO, Alt Text appears in Adobe Reader in Windows.
Comment 6 Timur 2020-05-20 10:49:18 UTC
LO 6.0 beta exported "Alternative text" properly and LO 6.0.7 again doesn't, per a test on Windows.

I tried bibisecting 6.0 on Linux but couldn't see the alt text, only the missing-image bug, which was fixed with:

b1008b030246939187e5c30ba750d6abb397161d is the first fixed commit
commit b1008b030246939187e5c30ba750d6abb397161d
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Thu Jun 22 02:10:02 2017 +0200

    source sha:77da7b934d782153be9271605691ceee6c66233a
    source sha:77da7b934d782153be9271605691ceee6c66233a
    source sha:48da675a67a2bfd2eadfd6d4c6dba0dee74b5326
    source sha:9b68ce7b0f2326ec540717ec5c8207825403774e
    source sha:d2e4aeb929b346acd0d1a2eaeee7237b89b99474
    source sha:08792a4b332d907c72d1fc7301133f5b306ec8dd
    source sha:d7824bf16898d8cb776420e0c2bff82e6df61b86
    source sha:f05d0d05829dd51cb9d8071ac97cc219779ee40a
    source sha:266bcae306a1dd6e0d9df80ba30ade7311385c28
    source sha:08316e5edfc36ed75a4e8dc5b6aa7eea3af4eea9
    source sha:136ce64b18283acf9db5d130f8ac9108591dd4ee
    source sha:b29bae1064c9f980cc50a667e8b96c5e370326d7
    Previous source sha:c0ce1ec3736be861a2ed58827fadb25269ab0117

I hope the bibisect can be done on Windows.
Comment 7 Buovjaga 2020-05-20 15:07:47 UTC
(In reply to Timur from comment #6)
> LO 6.0 beta used to export properly "Alternative text" and LO 6.0.7 doesn't
> again, per test in Windows.

Can you give me the exact commit when it worked in 6.0?

I tried Win 6.0 repo and was unable to find a commit where it worked. Tried oldest, master, then git checkout HEAD~500 from master, twice.
Comment 8 Timur 2020-05-21 06:54:57 UTC
Created attachment 161059 [details]
DOCX exported to PDF from LO 6.0 beta 2

Here is where it works, as shown in the attached.

Build ID: 13edaaa12f25de343fce136064e27da66c1c4fa4
CPU threads: 8; OS: Windows 6.1; UI render: GL; 
Locale: bs-BA (bs_BA); Calc: CL

Please note that you must type in the alt text or have an ODT saved with it (for headless); the original DOCX will not work.
Comment 9 Timur 2020-05-21 07:21:33 UTC Comment hidden (obsolete)
Comment 10 Buovjaga 2020-05-21 13:33:04 UTC
(In reply to Timur from comment #9)
> I don't know if there's a better way to translate
> 13edaaa12f25de343fce136064e27da66c1c4fa4 to bibisect commit, but I took 
> source and found ae1bb1166afa8ea6abdb656cbd9a7e6075db9313. 
> Linux doesn't export AltText headless or GUI, or this is something even more
> strange.

Ok, taking 3rd commit from the top and doing

git log --all --grep='2e368c5946ba1e608ff263e5892b10d02c90275b' 

in win 6.0 repo gave me the bibisect commit hash c1ac2cc1993f3955491bb8eb99e2b9146aaec4be
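
The same lookup could be scripted when several source hashes need mapping. This is a minimal sketch, assuming `git` is installed and the bibisect repository records the source hash in its commit messages, as the log in comment 6 suggests; the function names are hypothetical.

```python
import subprocess


def build_lookup_command(source_sha):
    """Argument list for finding bibisect commits whose message mentions source_sha."""
    return ["git", "log", "--all", "--format=%H", f"--grep={source_sha}"]


def find_bibisect_commits(repo_dir, source_sha):
    """Run the lookup inside repo_dir and return the matching commit hashes."""
    result = subprocess.run(
        build_lookup_command(source_sha),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    # --format=%H prints one full hash per line
    return result.stdout.split()
```

An empty list would mean that no bibisect commit message mentions the given source hash.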

I still don't see the alt text in the pdf.

I created an ODT from the DOCX, right-clicked the image - Properties - Options, and added text to the Alternative (Text Only) field.
After exporting PDF, I opened it in Acrobat Reader and hovered my mouse over the image. Nothing was shown. With your attachment 161034 [details] I can see the text in Acrobat Reader.

Can someone please tell me the valid steps to test this? I don't care which PDF reader or whether it is Win or Linux; I have a shared folder between Linux and a Win VM. Just tell me the steps to confirm the alt text was saved.
Comment 11 Timur 2020-05-21 14:07:33 UTC
Well, that's it. Unless there's another bug in headless; if you used that, please try the GUI to be sure.
Otherwise, we may regrettably mark NotBibisectable.
Comment 12 Buovjaga 2020-05-21 14:31:23 UTC
(In reply to Timur from comment #11)
> Well, that's it. Except if there's another bug in headless, in you used that
> please try GUI to be sure. 
> Otherwise, we may regrettably mark NotBibisectable.

GUI always. Let's allow someone else to try as well.