Bug 158173 - Draw messes up the formatting of PDFs when trying to edit them
Summary: Draw messes up the formatting of PDFs when trying to edit them
Status: RESOLVED DUPLICATE of bug 49705
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Draw
Version (earliest affected): 7.6.2.1 release
Hardware: All Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-10 22:02 UTC by iaminov01
Modified: 2024-03-20 13:32 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Description iaminov01 2023-11-10 22:02:01 UTC
Description:
Several formatting problems appear, mainly in how much space certain text takes up. For example, text is spread out so far that it runs off the page, or, if the text is inside a table, it spills outside the table lines.

Steps to Reproduce:
1. Open a PDF in Draw. (This does not happen with all PDFs, and I cannot find a pattern in which ones are affected. Please email me if you want a screenshot of the issue.)


Actual Results:
Text leaves the borders of the table and/or the page. 

Expected Results:
The PDF looks the same in Draw as it does in PDF readers such as Adobe Acrobat.


Reproducible: Sometimes


User Profile Reset: No

Additional Info:
[Information automatically included from LibreOffice]
Locale: en-US
Module: DrawingDocument
[Information guessed from browser]
OS: Linux (All)
OS is 64bit: yes
Comment 1 Stéphane Guillou (stragu) 2023-11-11 07:34:43 UTC
Thanks for the report.
Please provide an example PDF file, preferably that focuses on one specific issue, so it is more likely to get fixed.
Much appreciated!
Comment 2 V Stuart Foote 2023-11-11 12:29:24 UTC
Aside from providing a specific example needing attention, please note: PDF is a final display/printing publishing format not structured to be editable.

LibreOffice is not a PDF editor. Rather it performs a filter "import" (poppler/cairo based) where we parse the PDF content and render as new content on an ODF document canvas (Draw by default, but alternatively to Writer or Impress).
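
As a rough illustration of choosing the import target described above, the sketch below builds a command line that asks LibreOffice to open a PDF through a specific import filter. The `--infilter` option is part of the soffice command line; the filter names in the table and the helper name `pdf_open_command` are assumptions for illustration and should be checked against your installation.

```python
import shlex

# Assumed names of the PDF import filters per module; verify locally.
PDF_IMPORT_FILTERS = {
    "draw": "draw_pdf_import",
    "writer": "writer_pdf_import",
    "impress": "impress_pdf_import",
}


def pdf_open_command(pdf_path, module="draw", soffice="soffice"):
    """Build (but do not run) a soffice command that opens pdf_path
    through the PDF import filter of the chosen module."""
    if module not in PDF_IMPORT_FILTERS:
        raise ValueError(f"unknown module: {module!r}")
    return [soffice, f"--infilter={PDF_IMPORT_FILTERS[module]}", pdf_path]


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand.
    print(shlex.join(pdf_open_command("example.pdf", module="writer")))
```

Whichever module is chosen, the result is still a lossy conversion into ODF objects, so the fidelity caveats in this comment apply equally.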

During its lossy import each "object" (text run or graphic) from the PDF is parsed and converted into an appropriate LO draw object and placed. The filters do a reasonable job converting the content, but in no sense is the original PDF being edited. And the resulting draw object will probably differ from its source in PDF. There is of course potential to improve fidelity of LO import filter(s) for *some* PDF elements. 

LO provides an alternative "Insert as image" import filter based on the Chrome projects pdfium libs that will render one PDF page at a time in high-fidelity as a single inserted raster image on ODF document canvas. With this filter path ODF draw objects are not created from the PDF runs.

You can also convert your PDF externally from LO, and import PDF pages in a different format, e.g. use pdftocairo -svg (another poppler/cairo based project). Then insert the SVG content to an LO document. That filter path also works (helpful for vector graphics) but can have similar issues with fidelity of draw objects created from the SVG.
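
A minimal sketch of that external conversion, assuming poppler's `pdftocairo` is installed and on the PATH. It converts one page at a time using the `-f` (first page) and `-l` (last page) options; the function names and file names here are hypothetical.

```python
import subprocess


def pdftocairo_svg_command(pdf_path, page, svg_path):
    """Build the pdftocairo argument list that renders a single page to SVG."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return ["pdftocairo", "-svg",
            "-f", str(page), "-l", str(page),  # restrict output to one page
            pdf_path, svg_path]


def convert_page(pdf_path, page, svg_path):
    """Run pdftocairo; raises CalledProcessError if the conversion fails."""
    subprocess.run(pdftocairo_svg_command(pdf_path, page, svg_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names; the resulting SVG can then be inserted into Draw.
    convert_page("example.pdf", 1, "example-page1.svg")
```

The resulting SVG is inserted into an LO document as usual, with the same caveat about fidelity of the draw objects created from it.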

Bottom line: don't expect to use a published PDF as an editable document, because it isn't one. If you can, it is best to obtain the original document. If you can't, just realize that the LO provided PDF import filter(s) will probably lose fidelity to the published original during creation of ODF drawing objects.

It is a double whammy, as the poppler/cairo based PDF export filters are used to convert ODF drawing objects when creating a PDF export from ODF. A "print to PDF" (i.e. gs/ps based) can create a better PDF output.
Comment 3 V Stuart Foote 2023-11-11 12:37:25 UTC
s/Chrome projects/Chromium project's
Comment 4 Stéphane Guillou (stragu) 2024-03-20 13:32:24 UTC
As we don't have a sample document, I am marking as duplicate of bug 49705 based on the description.

*** This bug has been marked as a duplicate of bug 49705 ***