Bug 166676 - FILEOPEN PDF: missing letters, Ok with pdfium handling
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Draw
Version (earliest affected): unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:pdf
Depends on:
Blocks: PDF-Import-Draw
Reported: 2025-05-21 18:48 UTC by Piotr Kocia
Modified: 2025-05-21 22:23 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
The problematic file (276.78 KB, application/pdf)
2025-05-21 18:49 UTC, Piotr Kocia

Description Piotr Kocia 2025-05-21 18:48:57 UTC
Description:
The attached file is incorrectly opened by Draw - random letters are missing from the text. I do not know how the PDF file was produced, but I suppose it has been exported from Microsoft Word.

Steps to Reproduce:
Open the attached file in LO Draw.

Actual Results:
Letters (most notably 'c' and 'z') are missing from the text.

Expected Results:
All letters are imported correctly.


Reproducible: Always


User Profile Reset: No

Additional Info:
Arch Linux release

Version: 25.2.3.2 (X86_64) / LibreOffice Community
Build ID: 520(Build:2)
CPU threads: 12; OS: Linux 6.14; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
25.2.3-2
Calc: threaded
Comment 1 Piotr Kocia 2025-05-21 18:49:49 UTC
Created attachment 200899
The problematic file
Comment 2 BogdanB 2025-05-21 19:11:55 UTC
Confirmed with
Version: 25.2.3.1 (X86_64) / LibreOffice Community
Build ID: d8d1af5f77df955194e52baabe19324532ac8e8b
CPU threads: 16; OS: Linux 6.11; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

For example, the line
A.1. Jak można określić pojęie własnośi intelektualnej (PDF)
is
A.1. Jak można określić poję ie własnośi intelektualnej (DRAW)
Comment 3 Piotr Kocia 2025-05-21 20:42:02 UTC
I should have provided an example.

The line you mentioned is rendered by zathura (with the mupdf backend) and by xournal++ as

A.1. Jak można określić pojęcie własności intelektualnej

while in Draw it is as you mentioned

A.1. Jak można określić poję ie własnośi intelektualnej
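
To make the comparison explicit, here is a minimal sketch that diffs the two renderings quoted above and lists the characters that were lost. It uses only Python's standard difflib; the helper name dropped_chars is hypothetical and not part of LibreOffice or of any PDF library.

```python
import difflib


def dropped_chars(expected: str, actual: str) -> list[str]:
    """List characters of `expected` that are missing or replaced in `actual`."""
    matcher = difflib.SequenceMatcher(None, expected, actual, autojunk=False)
    missing = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "delete": the character vanished entirely;
        # "replace": something else (here a space) took its place.
        if tag in ("delete", "replace"):
            missing.extend(expected[i1:i2])
    return missing


# Strings copied from this comment: mupdf rendering vs. Draw import.
expected = "A.1. Jak można określić pojęcie własności intelektualnej"
actual = "A.1. Jak można określić poję ie własnośi intelektualnej"

print(dropped_chars(expected, actual))
```

For this line, both lost characters are 'c': one replaced by a space in "pojęcie" and one removed outright in "własności".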
Comment 4 V Stuart Foote 2025-05-21 22:23:57 UTC
(In reply to Piotr Kocia from comment #3)

Inserting the PDF as an image with the pdfium-based filter handles the "pojęcie" string correctly.

So this is an issue of dropped characters in the poppler/cairo import filter.