Bug 157112

Summary: FILESAVE PDF Document containing emoji results in invalid PDF
Product: LibreOffice
Reporter: Gabor Kelemen (allotropia) <kelemeng>
Component: Printing and PDF export
Assignee: ⁨خالد حسني⁩ <khaled>
Status: VERIFIED FIXED
Severity: normal
CC: aron.budea, khaled
Priority: medium
Keywords: bibisected, bisected, regression
Version: 7.5.0.3 release   
Hardware: All   
OS: All   
See Also: https://bugs.documentfoundation.org/show_bug.cgi?id=157816
https://bugs.documentfoundation.org/show_bug.cgi?id=159689
Whiteboard: target:24.2.0 target:7.6.2
Crash report or crash signature:
Regression By:
Bug Depends on:    
Bug Blocks: 143999    
Attachments: Example file from Writer
The example file exported as PDF from current master
The example file exported with 7.5-bibisect before the bibisected commit
Screenshot of the PAC exception
Example file exported with no PDF export options enabled
Attempt at fixing document

Description Gabor Kelemen (allotropia) 2023-09-06 09:11:16 UTC
Created attachment 189377 [details]
Example file from Writer

Attached document contains an emoji character.

When the document is exported to PDF, the PAC tool throws an unhandled "Index was out of range" exception on the result, which suggests the PDF is invalid in some way.

VeraPDF also complains a lot about this file.

1. Open attached file
2. Save as PDF with PDF/UA enabled.
3. Check the resulting file in PAC 2021 tool
-> See attached screenshot with error message.


Seems to have started in 7.5 with:

https://git.libreoffice.org/core/+/506d969193822f396bb2203718124e3516ad75d1

author	Khaled Hosny <khaled@aliftype.com>	Sat Oct 29 11:12:23 2022 +0200
committer	خالد حسني <khaled@aliftype.com>	Sat Oct 29 12:04:05 2022 +0200

vcl: check the correct face for color glyphs
Comment 1 Gabor Kelemen (allotropia) 2023-09-06 09:19:25 UTC
Created attachment 189379 [details]
The example file exported as PDF from current master

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: cc7d6211bc01e5ec84dbad542605d2e93dea925c
CPU threads: 15; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: threaded
Comment 2 Gabor Kelemen (allotropia) 2023-09-06 09:20:27 UTC
Created attachment 189380 [details]
The example file exported with 7.5-bibisect before the bibisected commit
Comment 3 Gabor Kelemen (allotropia) 2023-09-06 09:20:55 UTC
Created attachment 189381 [details]
Screenshot of the PAC exception
Comment 4 Gabor Kelemen (allotropia) 2023-09-06 09:49:02 UTC
Created attachment 189384 [details]
Example file exported with no PDF export options enabled

This also happens without enabling any of UA, PDF Archival, Tagged PDF or even any Structure option in the PDF Export dialog.
Comment 5 ⁨خالد حسني⁩ 2023-09-07 04:50:15 UTC
The “good” PDF has color emoji rendered black and white, while the “bad” one has them rendered in color, so the referenced commit is doing what it was supposed to do.

It looks like this tool does not like the Type 3 font we are now embedding to get glyphs rendered in color, but the error message does not give much to act on.
Comment 6 ⁨خالد حسني⁩ 2023-09-07 04:56:46 UTC
Created attachment 189402 [details]
Attempt at fixing document

Does this version of the file work?
Comment 7 ⁨خالد حسني⁩ 2023-09-07 05:08:44 UTC
I downloaded the PAC tool and indeed the new PDF does not have the error.
Comment 8 Commit Notification 2023-09-07 07:02:07 UTC
Khaled Hosny committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/d93f3243d51438e2492ca6f450ae3f1f63b617b1

tdf#157112: fix off-by-one error in /LastChar of PDF Type 3 fonts

It will be available in 24.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
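
For readers unfamiliar with the field named in the commit title: in a PDF simple font dictionary (Type 3 included), /Widths must hold exactly LastChar - FirstChar + 1 entries. The following is a minimal Python sketch of that rule and of the kind of off-by-one that breaks it. It is not the LibreOffice code; the function names and the sample widths are hypothetical, and the "bad" value is only an assumed illustration of how such an error can arise.

```python
def last_char(first_char, widths):
    """Return the /LastChar value consistent with /FirstChar and /Widths."""
    if not widths:
        raise ValueError("a font without glyph widths has no valid /LastChar")
    # /Widths covers the codes first_char..last_char inclusive.
    return first_char + len(widths) - 1


def is_consistent(first_char, last_char_value, widths):
    """Check the PDF rule: len(/Widths) == LastChar - FirstChar + 1."""
    return len(widths) == last_char_value - first_char + 1


# Hypothetical Type 3 font subset with three glyphs, codes starting at 0.
widths = [0.0, 1275.0, 1275.0]
good = last_char(0, widths)  # index of the last glyph code
bad = 0 + len(widths)        # off-by-one: uses the count instead of the last index

print(good, is_consistent(0, good, widths))
print(bad, is_consistent(0, bad, widths))
```

A consumer that trusts /LastChar and indexes /Widths up to that code would read one entry past the end of the array in the "bad" case, which would be consistent with the "Index was out of range" exception reported by PAC.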
Comment 9 Gabor Kelemen (allotropia) 2023-09-08 12:31:09 UTC
Verified in 

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: beaea2e992912b4747d790070b26371f557b1f57
CPU threads: 15; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (hu_HU); UI: en-US
Calc: threaded

No more error message in PAC tool.
Thanks Khaled for fixing this!
Comment 10 Commit Notification 2023-09-14 16:34:30 UTC
Khaled Hosny committed a patch related to this issue.
It has been pushed to "libreoffice-7-6":

https://git.libreoffice.org/core/commit/8497b2f5837bcd7a047d0bd2de842d4b2ef1101b

tdf#157112: fix off-by-one error in /LastChar of PDF Type 3 fonts

It will be available in 7.6.2.
