Bug 132683 - No alt text on image converted to pdf
Summary: No alt text on image converted to pdf
Description Rhys Young 2020-05-04 14:55:12 UTC
After converting a DOCX file to PDF, the image no longer has its alt text attached.

Steps to Reproduce:
1. Convert file to PDF using 'soffice --headless --nolockcheck --nodefault --nofirststartwizard --nologo --norestore --convert-to pdf --outdir /tmp /tmp/test.docx'
2. Open PDF using a viewer
3. Observe the pdf has no alt text attached to the image

Actual Results:
The image does not have alt text.

Expected Results:
The image should have alt text.

Reproducible: Always

User Profile Reset: No

Additional Info:
Comment 1 Rhys Young 2020-05-04 14:55:36 UTC
Created attachment 160341 [details]
DOCX with alt text
Comment 2 Rhys Young 2020-05-04 14:56:50 UTC
Created attachment 160342 [details]
pdf no alt text
Comment 3 Rhys Young 2020-05-04 14:57:13 UTC
LibreOffice seems to import the alt text from Word as description text instead of alt text.
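
For reference, a DOCX stores a picture's alt text in the descr attribute of its wp:docPr element (with an optional title attribute beside it). Below is a minimal Python sketch for checking what the source file actually contains; the function name docx_image_texts is made up for illustration, and it only looks at word/document.xml, not headers or footers.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML drawing namespace that holds <wp:docPr>.
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def docx_image_texts(docx):
    """Return (name, title, descr) for every wp:docPr in word/document.xml.

    `docx` may be a path or a file-like object. Hypothetical helper,
    not part of LibreOffice or of any library.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    return [
        (el.get("name"), el.get("title"), el.get("descr"))
        for el in root.iter(f"{{{WP_NS}}}docPr")
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python docx_alt.py /tmp/test.docx
    for name, title, descr in docx_image_texts(sys.argv[1]):
        print(f"name={name!r} title={title!r} descr={descr!r}")
```

If descr is set and title is empty, the text Word shows as Alt Text is what LibreOffice would be reading into Description.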
Comment 4 Timur 2020-05-20 09:10:44 UTC
There are 2 issues here: 

1. File open: Alt Text set in MSO is imported from DOCX as Description in LO, not as Alternative.
The Help doesn't explain Description, and for "Alternative text" it says "Enter the text to display in a web browser when the selected item is unavailable."

2. Export as PDF doesn't export alt text from either field.
This must be tested with a PDF reader that supports alt text.
While it is not clear how Description should be exported, Alternative doesn't work.
This is a regression: LO 5.0 didn't export it, LO 6.0 exported "Alternative text" properly (typed in, not imported from DOCX, as explained in 1.), and LO 6.1 again doesn't.

Let's start with 2., so I am removing docx from the title.
There are other bugs about alt text interop issues for other objects and formats.
During testing with master 7.0+, the image once wasn't exported at all when Alternative was set in Image - Options; I can't say why.

Note: headless is not needed, and it is wrong to report it unless the bug happens only in headless mode, which is not the case here.
Comment 5 Timur 2020-05-20 10:30:49 UTC
Created attachment 161034 [details]
DOCX exported to PDF from MSO

In PDF from MSO, Alt Text appears in Adobe Reader in Windows.
Comment 6 Timur 2020-05-20 10:49:18 UTC
LO 6.0 beta used to export properly "Alternative text" and LO 6.0.7 doesn't again, per test in Windows.

I tried to bibisect 6.0 on Linux, but I couldn't see the alt text, just the missing-image bug that was fixed with: 

I hope bibisect may be done in Windows.
Comment 7 Buovjaga 2020-05-20 15:07:47 UTC
(In reply to Timur from comment #6)
> LO 6.0 beta used to export properly "Alternative text" and LO 6.0.7 doesn't
> again, per test in Windows.

Can you give me the exact commit when it worked in 6.0?

I tried Win 6.0 repo and was unable to find a commit where it worked. Tried oldest, master, then git checkout HEAD~500 from master, twice.
Comment 8 Timur 2020-05-21 06:54:57 UTC
Created attachment 161059 [details]
DOCX exported to PDF from LO 6.0 beta 2

Here is where it works, as shown in the attached.

Build ID: 13edaaa12f25de343fce136064e27da66c1c4fa4
CPU threads: 8; OS: Windows 6.1; UI render: GL; 
Locale: bs-BA (bs_BA); Calc: CL

Please note that you must type in the alt text or use an ODT saved with it (for headless); the original DOCX will not work.
Comment 10 Buovjaga 2020-05-21 13:33:04 UTC
(In reply to Timur from comment #9)
> I don't know if there's a better way to translate
> 13edaaa12f25de343fce136064e27da66c1c4fa4 to bibisect commit, but I took 
> source and found ae1bb1166afa8ea6abdb656cbd9a7e6075db9313. 
> Linux doesn't export AltText headless or GUI, or this is something even more
> strange.

Ok, taking 3rd commit from the top and doing

git log --all --grep='2e368c5946ba1e608ff263e5892b10d02c90275b' 

in win 6.0 repo gave me the bibisect commit hash c1ac2cc1993f3955491bb8eb99e2b9146aaec4be

I still don't see the alt text in the pdf.

I created an ODT from the DOCX, right-clicked image - Properties - Options, added stuff to the Alternative (Text Only field).
After exporting PDF, I opened it in Acrobat Reader and hovered my mouse over the image. Nothing was shown. With your attachment 161034 [details] I can see the text in Acrobat Reader.
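
To rule out the ODT itself as the problem, one can check that the text really was saved before export. In ODF, a draw:frame may carry svg:title and svg:desc child elements; I am assuming here that the "Alternative (Text Only)" field is written as svg:title and Description as svg:desc. A rough Python sketch (odt_frame_texts is a made-up helper name):

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# ODF namespaces for drawing frames and the SVG-compatible title/desc elements.
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
SVG_NS = "urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"


def odt_frame_texts(odt):
    """Return (name, title, desc) for every draw:frame in content.xml.

    Assumption: svg:title corresponds to "Alternative (Text Only)" and
    svg:desc to "Description" in the image Options dialog.
    """
    with zipfile.ZipFile(odt) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    result = []
    for frame in root.iter(f"{{{DRAW_NS}}}frame"):
        result.append((
            frame.get(f"{{{DRAW_NS}}}name"),
            frame.findtext(f"{{{SVG_NS}}}title"),  # None when the element is absent
            frame.findtext(f"{{{SVG_NS}}}desc"),
        ))
    return result


if __name__ == "__main__" and len(sys.argv) > 1:
    for name, title, desc in odt_frame_texts(sys.argv[1]):
        print(f"name={name!r} title={title!r} desc={desc!r}")
```

If the title comes back as None here, the missing alt text is not a PDF export problem.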

Can someone please tell me the valid steps to test this?? I don't care which PDF reader, Win/Linux, I have a shared folder between Linux and Win VM, just tell me the steps to confirm the alt text was saved.
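
One viewer-independent check, offered as a rough sketch rather than a definitive test: in a tagged PDF, alternate text is stored under the /Alt key of a structure element, so the file can be scanned for that key. The Python below is a naive byte scan (find_alt_entries is a made-up name). It only sees /Alt entries that are not inside compressed object streams, so an empty result does not prove the text is missing, while a hit does show that it was written.

```python
import re
import sys

# /Alt followed by a literal string "(...)"; nested unescaped parentheses are not handled.
ALT_LITERAL = re.compile(rb"/Alt\s*\(((?:\\.|[^\\()])*)\)")
# /Alt followed by a hex string "<...>".
ALT_HEX = re.compile(rb"/Alt\s*<([0-9A-Fa-f\s]+)>")


def _decode(raw):
    """Decode a PDF text string: UTF-16BE if it starts with a BOM, else Latin-1."""
    if raw.startswith(b"\xfe\xff"):
        return raw[2:].decode("utf-16-be", errors="replace")
    return raw.decode("latin-1")


def find_alt_entries(pdf_bytes):
    """Return the /Alt strings found in the uncompressed parts of a PDF.

    Literal strings are returned with escape sequences left as-is.
    """
    found = [_decode(m.group(1)) for m in ALT_LITERAL.finditer(pdf_bytes)]
    for m in ALT_HEX.finditer(pdf_bytes):
        digits = b"".join(m.group(1).split())
        if len(digits) % 2:
            digits += b"0"  # an odd final digit is padded with 0
        found.append(_decode(bytes.fromhex(digits.decode("ascii"))))
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        for text in find_alt_entries(fh.read()):
            print(repr(text))
```

Running it on both the MSO-exported PDF and the LO-exported PDF would give a side-by-side comparison without relying on hover behaviour in a particular reader.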
Comment 11 Timur 2020-05-21 14:07:33 UTC
Well, that's it. Except if there's another bug in headless; if you used that, please try GUI to be sure. 
Otherwise, we may regrettably mark NotBibisectable.
Comment 12 Buovjaga 2020-05-21 14:31:23 UTC
(In reply to Timur from comment #11)
> Well, that's it. Except if there's another bug in headless, if you used that
> please try GUI to be sure. 
> Otherwise, we may regrettably mark NotBibisectable.

GUI always. Let's allow someone else to try as well.