Bug 164983 - PDF Import: English letters rendered as black squares
Summary: PDF Import: English letters rendered as black squares
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Draw
Version: 24.8.3.2 release (earliest affected)
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: PDF-Import-Draw
Reported: 2025-02-01 10:19 UTC by Yao Fanqing
Modified: 2025-02-22 21:32 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
Open this PDF, English letters are displayed as black squares (6.76 MB, application/pdf)
2025-02-01 10:24 UTC, Yao Fanqing
Side-by-side screenshot: Attachment 198915 in LO Draw 25.8 nightly vs Atril PDF viewer (81.16 KB, image/png)
2025-02-22 21:32 UTC, Eyal Rozenberg

Description Yao Fanqing 2025-02-01 10:19:03 UTC
Description:
When the PDF is opened with LibreOffice, the English letters are displayed as black squares, while the symbols and numbers render normally. When the same PDF is opened with Microsoft Edge, everything displays normally.


Steps to Reproduce:
1. Open the PDF with LibreOffice: the English letters are displayed as black squares, while the symbols and numbers are normal.
2. Open the same PDF with Microsoft Edge: everything is normal.

Actual Results:
The English letters are displayed as black squares.

Expected Results:
English letters are displayed correctly, as they are in Microsoft Edge.


Reproducible: Sometimes


User Profile Reset: No

Additional Info:
 no
Comment 1 Yao Fanqing 2025-02-01 10:24:09 UTC
Created attachment 198915
Open this PDF, English letters are displayed as black squares
Comment 2 Charles Williams 2025-02-01 12:50:07 UTC
Bug also manifests on macOS:

Version: 24.8.4.2 (AARCH64) / LibreOffice Community
Build ID: bb3cfa12c7b1bf994ecc5649a80400d06cd71002
CPU threads: 8; OS: macOS 15.3; UI render: default; VCL: osx
Locale: en-GB (en_GB.UTF-8); UI: en-US
Calc: threaded

The example file displays fine with Acrobat Reader.app and Preview.app
Comment 3 V Stuart Foote 2025-02-01 15:19:02 UTC
In addition to Edge, Firefox, Chrome, and Okular on Windows all read the PDF well.

But I would think the poppler-based import filter should do a somewhat better job of extracting the text runs.

So this is likely a poppler -> cairo issue in the handling (extraction and color) of the subset fonts.

And for what it is worth, LibreOffice's pdfium-based Insert as image, applied to each page one at a time (after first splitting the PDF apart externally with pdftk burst), handles all the embedded fonts from the PDF with no apparent issues.

Though then, as expected, doing a Shape -> Break on the image leaves the text blocks badly munged.
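
For anyone wanting to repeat the page-splitting step, here is a minimal sketch of how the pdftk burst call could be scripted. It assumes pdftk is installed and on PATH; the function names, the output directory, and the page_%04d.pdf naming pattern are illustrative choices, not anything taken from this report.

```python
import shutil
import subprocess
from pathlib import Path


def build_burst_command(pdf_path, out_dir):
    """Return the pdftk argument list that splits pdf_path into one PDF per page.

    The page_%04d.pdf output pattern is a hypothetical naming choice.
    """
    pattern = str(Path(out_dir) / "page_%04d.pdf")
    return ["pdftk", str(pdf_path), "burst", "output", pattern]


def burst(pdf_path, out_dir):
    """Run pdftk burst and return the sorted single-page PDFs it produced."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_burst_command(pdf_path, out_dir), check=True)
    return sorted(Path(out_dir).glob("page_*.pdf"))
```

Each resulting single-page file can then be inserted into Draw as an image, one at a time, as described above.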


=-testing-=

Version: 24.8.4.2 (X86_64) / LibreOffice Community
Build ID: bb3cfa12c7b1bf994ecc5649a80400d06cd71002
CPU threads: 8; OS: Windows 10 X86_64 (10.0 build 19045); UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL threaded

Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 7a9e303d0ffad7b83beccfe1918f962d2de04a37
CPU threads: 8; OS: Windows 10 X86_64 (build 19045); UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL threaded
Comment 4 Eyal Rozenberg 2025-02-22 21:32:12 UTC
Created attachment 199394
Side-by-side screenshot: Attachment 198915 in LO Draw 25.8 nightly vs Atril PDF viewer

Opened the PDF from attachment 198915 in a LO Draw nightly from Feb 20th 2025:

Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: d1f97a537b576454b2d93406d372cc4ed36d0b32
CPU threads: 4; OS: Linux 6.6; UI render: default; VCL: gtk3
Locale: en-IL (en_IL); UI: en-US
Calc: CL threaded

and in Atril (a Linux PDF viewer, used by default in the Cinnamon DE). The bug manifests clearly, but what I see is not black squares: either the text does not appear at all, or we get rectangles with a diagonal black-to-white gradient fill.