Bug 169948 - Rendering bug in some pdf documents
Summary: Rendering bug in some pdf documents
Status: RESOLVED DUPLICATE of bug 165396
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Draw
Version: 25.2.7.2 release (earliest affected)
Hardware: x86-64 (AMD64) macOS (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: PDF-Import-Draw
Reported: 2025-12-12 02:47 UTC by wlmcderm
Modified: 2025-12-12 20:40 UTC

See Also:
Crash report or crash signature:


Attachments
Sample pdf document (515.05 KB, application/pdf)
2025-12-12 02:47 UTC, wlmcderm
Sample pdf document 2 (273.97 KB, application/pdf)
2025-12-12 02:48 UTC, wlmcderm
Screenshot before font substitution (54.67 KB, image/jpeg)
2025-12-12 19:10 UTC, wlmcderm
Screenshot after font substitution (52.71 KB, image/jpeg)
2025-12-12 19:11 UTC, wlmcderm
Selection of one block of text in the affected line (54.39 KB, image/jpeg)
2025-12-12 19:11 UTC, wlmcderm
Selection of the other block of text in the affected line (54.13 KB, image/jpeg)
2025-12-12 19:12 UTC, wlmcderm

Description wlmcderm 2025-12-12 02:47:09 UTC
Description:
Horizontal text layout is incorrect in some imported PDFs: text overlaps when it shouldn't. Not all PDFs are affected, and not all text in the example PDFs is affected. The text layout of the example documents is correct in macOS Preview and in Firefox.

Steps to Reproduce:
1. Open one of the sample PDFs.
2. Examine the layout of the text.
3. Observe that the horizontal layout of the text is incorrect: some text overlaps.

Actual Results:
Text is rendered with overlapping letters.

Expected Results:
Text is rendered without overlapping letters.


Reproducible: Always


User Profile Reset: No

Additional Info:
This bug affects LibreOffice_25.2.7, LibreOffice_25.8.3.2, and LibreOfficeDev_26.2.0.0.beta1. I haven't tried earlier versions.
Comment 1 wlmcderm 2025-12-12 02:47:56 UTC
Created attachment 204590 [details]
Sample pdf document
Comment 2 wlmcderm 2025-12-12 02:48:33 UTC
Created attachment 204591 [details]
Sample pdf document 2
Comment 3 V Stuart Foote 2025-12-12 13:07:12 UTC
These layout problems happen when the font used in the PDF is not available on the system and font fallback occurs. Fonts embedded as subsets in a PDF cannot reliably be used by LibreOffice (bug 101220).

If you need the text spans from a PDF for some reason, the Draw (or Impress, or Writer) import filters, which are based on the poppler and cairo projects, will convert them from the PDF into drawing text box shapes.

So, for both attached test documents, use the LibreOffice Tools -> Options -> Fonts dialog and set both the GaramondThree and AGaramondPro fonts to be replaced with plain Garamond. That reduces the overlaps to a reasonable amount.

Unfortunately, identifying the embedded fonts that need replacement is an extra step: open the PDF through the filter, then check the font reported in the Properties panel for a selection of text. The replacement does persist in the user profile, though, so it also applies to PDFs imported later.

If you need pixel-perfect fidelity, break the PDF apart and insert each page as an image (this uses a different, pdfium-based filter path).

*** This bug has been marked as a duplicate of bug 165396 ***
Comment 4 wlmcderm 2025-12-12 19:09:43 UTC
Hi Stuart! Thanks for the info. I can confirm that manually adding the font substitution tables you suggested significantly ameliorates the problem.

For anyone reading, on a Mac, the steps are: LibreOffice -> Preferences -> Fonts -> (check “Apply replacement table”, enter values for “Font” and “Replace with”, then check “Always”)

I've attached screenshots from before (Screenshot.jpg) and after (Screenshot1-post-font-sub.jpg) font substitution.

However, I wonder if font substitution is the whole story. I'll also attach screenshots (Selection-1.jpg and Selection-2.jpg) showing that the affected line is imported as two different blocks of text. The horizontal placement of the two text blocks is causing the overlap in Screenshot1. If all the text in that line of the paragraph had been placed in the same block, presumably it would have been more legible even without manually substituting the font.
Comment 5 wlmcderm 2025-12-12 19:10:32 UTC
Created attachment 204600 [details]
Screenshot before font substitution
Comment 6 wlmcderm 2025-12-12 19:11:00 UTC
Created attachment 204601 [details]
Screenshot after font substitution
Comment 7 wlmcderm 2025-12-12 19:11:33 UTC
Created attachment 204602 [details]
Selection of one block of text in the affected line
Comment 8 wlmcderm 2025-12-12 19:12:15 UTC
Created attachment 204603 [details]
Selection of the other block of text in the affected line
Comment 9 V Stuart Foote 2025-12-12 20:32:01 UTC
(In reply to wlmcderm from comment #4)

> However, I wonder if font substitution is the whole story. I'll also attach
> screenshots (Selection-1.jpg and Selection-2.jpg) showing that the affected
> line is imported as two different blocks of text. The horizontal placement
> of the two text blocks is causing the overlap in Screenshot1. If all the
> text in that line of the paragraph had been placed in the same block,
> presumably it would have been more legible even without manually
> substituting the font.

This is a manifestation of the internal structure of a published PDF. The text elements are laid down with no syntactic detail and no "sense" of their relation to other text elements, just their finished presentation on the document page. Each text element is laid down between BT and ET operators, and the text strings are positioned accurately between those operators with horizontal positioning measures.
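
To make that structure concrete, here is a minimal Python sketch that pulls text objects out of a content stream fragment. The fragment, the font names F1/F2, the coordinates, and the helper name text_objects are all made up for illustration; real content streams are usually compressed and use many more operators than this naive regex approach handles.

```python
import re

# Hypothetical, uncompressed content stream fragment. The numbers and font
# resource names are invented; they are not taken from the attached PDFs.
SAMPLE_STREAM = b"""
BT /F1 11 Tf 72 700 Td (The quick brown) Tj ET
BT /F2 11 Tf 158.4 700 Td (fox jumps) Tj ET
"""

# A text object is everything between a BT and the next ET operator.
TEXT_OBJECT = re.compile(rb"\bBT\b(.*?)\bET\b", re.DOTALL)


def text_objects(stream: bytes):
    """Return font, size, position and string for each simple text object."""
    found = []
    for body in TEXT_OBJECT.findall(stream):
        font = re.search(rb"/(\S+)\s+([\d.]+)\s+Tf", body)          # font + size
        pos = re.search(rb"(-?[\d.]+)\s+(-?[\d.]+)\s+Td", body)     # x y offset
        shown = re.search(rb"\((.*?)\)\s*Tj", body)                 # shown string
        if font and pos and shown:
            found.append({
                "font": font.group(1).decode("ascii"),
                "size": float(font.group(2)),
                "x": float(pos.group(1)),
                "y": float(pos.group(2)),
                "text": shown.group(1).decode("latin-1"),
            })
    return found


if __name__ == "__main__":
    for obj in text_objects(SAMPLE_STREAM):
        print(obj)
```

Note that each text object carries only its own start position and font. Nothing in the stream says the second object continues the line begun by the first, which is why the import produces separate text blocks.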

Glyphs of the font(s) used are recorded in the PDF, although only as a subset, and because the poppler-based filter cannot read those glyphs they must be substituted. We can explicitly substitute the font with the 'Replacement Table' as noted, or simply trust the poppler <--> cairo fallback and object creation; either way the embedded glyphs are not used.

So the remaining overlap occurs because the metrics of the BT/ET text elements differ from those of the glyphs in the replacement font. The end of the first element extends over the beginning of the next. It can also go the other way, leaving gaps rather than overlaps between adjacent text elements.
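
The arithmetic behind this can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function names, the uniform per-glyph advance (real fonts have per-glyph widths plus kerning), and the 0.45 em and 0.50 em figures, which are not measured from GaramondThree, AGaramondPro, or Garamond.

```python
def run_end(start_x, text, advance_em, font_size):
    """Right edge of a text run in points, assuming one uniform advance per glyph."""
    return start_x + len(text) * advance_em * font_size


def gap_to_next(next_start_x, start_x, text, advance_em, font_size):
    """Distance to the next text element: positive is a gap, negative an overlap."""
    return next_start_x - run_end(start_x, text, advance_em, font_size)


if __name__ == "__main__":
    text, size, start = "The quick brown ", 11.0, 72.0

    # The PDF producer placed the next element where the ORIGINAL font ended.
    next_start = run_end(start, text, 0.45, size)

    # Hypothetical replacement fonts: one wider, one narrower than the original.
    print("wider font:   ", gap_to_next(next_start, start, text, 0.50, size))
    print("narrower font:", gap_to_next(next_start, start, text, 0.40, size))
```

With these made-up numbers the wider replacement overruns the next element by roughly 8.8 pt, and the narrower one leaves a gap of about the same size. The start positions from the PDF stay fixed; only the rendered width of each run changes.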

The alternative to "Opening" the PDF is the pdfium-based Insert filter, which always reads the internal layout of the PDF and the embedded subset fonts directly. So if you need fidelity, break the PDF into its pages externally, and then insert each one as an image. Image resolution can be controlled by setting the environment variable PDFIMPORT_RESOLUTION_DPI; the default is 96. 300 or 450 works well for full-page rendering when placed onto an ODF document page. YMMV depending on need. There are also enhancement requests to improve the insert process (e.g. page range selection, resolution, rotation, etc.).
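
One possible way to set that variable for a single session is to launch LibreOffice from a small Python wrapper. This is only a sketch under assumptions: that PDFIMPORT_RESOLUTION_DPI is read from the process environment at startup, and that soffice lives at the usual macOS install path shown below (adjust for your system). The helper names pdf_insert_env and launch_draw are invented for this example.

```python
import os
import subprocess

# Assumed default install location on macOS; not verified for every setup.
SOFFICE = "/Applications/LibreOffice.app/Contents/MacOS/soffice"


def pdf_insert_env(dpi, base_env=None):
    """Copy an environment and add PDFIMPORT_RESOLUTION_DPI to the copy."""
    if int(dpi) <= 0:
        raise ValueError("dpi must be a positive integer")
    env = dict(os.environ if base_env is None else base_env)
    env["PDFIMPORT_RESOLUTION_DPI"] = str(int(dpi))
    return env


def launch_draw(dpi=300, soffice=SOFFICE):
    """Start Draw with the raised resolution; pages are then inserted by hand."""
    return subprocess.Popen([soffice, "--draw"], env=pdf_insert_env(dpi))
```

Calling launch_draw(300) starts Draw with the variable set only for that process, so the system-wide environment is left alone. Inserting each page as an image still happens manually in the running application.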
Comment 10 V Stuart Foote 2025-12-12 20:40:10 UTC
Oh, I should also mention that any single filter-imported PDF may have multiple fonts defined for its text elements. Even a single glyph can be assigned a different font in its own BT/ET element. Where overlaps remain, select the overlapping text and check the Sidebar Properties deck to identify any additional fonts that may need to be substituted.

Also, remember that the font replacement table does not remove the original PDF's fonts assigned to the text elements; it only substitutes them when rendering to the LibreOffice document canvas.

It is kind of convoluted, and you would need to copy and paste into a new clean document page to recreate the PDF in an "editable" form.

LibreOffice is not a PDF editor, and PDFs are non-editable, final published documents.