Bug 151554 - Draw/Impress PDF import messes up line justification
Summary: Draw/Impress PDF import messes up line justification
Status: RESOLVED DUPLICATE of bug 49705
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Draw
Version: 7.5.0.0 alpha0+ (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: PDF-Import-Draw
Reported: 2022-10-15 21:41 UTC by Eyal Rozenberg
Modified: 2022-11-03 12:41 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
PDF document to import for observing the bug (13.09 KB, application/pdf)
2022-10-15 21:41 UTC, Eyal Rozenberg
Screenshot of attachment 183069 imported into Draw (147.54 KB, image/png)
2022-10-15 21:43 UTC, Eyal Rozenberg
Result of saving the imported attachment 183069 as ODG (30.11 KB, application/vnd.oasis.opendocument.graphics)
2022-10-15 21:50 UTC, Eyal Rozenberg
Result of saving the imported attachment 183069 as ODG (28.63 KB, application/vnd.oasis.opendocument.graphics)
2022-10-15 21:51 UTC, Eyal Rozenberg

Description Eyal Rozenberg 2022-10-15 21:41:46 UTC
Created attachment 183069 [details]
PDF document to import for observing the bug

When opening a (Writer-created) PDF document with justified text in Draw, the resulting draw objects are aligned to the left only, and sometimes exceed the page's text boundaries. Their horizontal positioning does not correspond to their position in the PDF.

To reproduce, open the attached PDF file in Draw (not in Writer!)


Expected result: The words in the imported document are placed identically, or near-identically, on the page as they are when viewing the PDF itself.

Actual result: Text is aligned to the left only, and is not bounded within the same rectangle as in the PDF (i.e. does not respect the page margins), on some lines. A screenshot will be attached soon.

Version: 7.5.0.0.alpha0+ / LibreOffice Community
Build ID: a09c5c69e3b5fbf448cae1d6c476f39067e40023
CPU threads: 4; OS: Linux 5.19; UI render: default; VCL: gtk3
Locale: en-IL (en_IL); UI: en-US


Note: 
This is a sibling of bug 151552, but for the Draw/Impress PDF import filter. The bugs are filed separately because the behavior is somewhat different: in Writer, I don't see any text outside the text area for the same test document.
Comment 1 Eyal Rozenberg 2022-10-15 21:43:02 UTC
Created attachment 183070 [details]
Screenshot of attachment 183069 [details] imported into Draw
Comment 2 Eyal Rozenberg 2022-10-15 21:50:00 UTC
Created attachment 183071 [details]
Result of saving the imported attachment 183069 [details] as ODG

The resulting ODG file. Note in particular:

* The multiple space characters within the line exceeding the text boundaries.
* The text box width is not equal to the line width (i.e. margin to margin).
* The text boxes' text justification is not set to Justified.
* The text boxes do not have the "Full width" option set.
* The text boxes have the "fit width to text" option set.
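
For anyone who wants to check these points without opening the file in Draw, here is a minimal sketch that reads the content.xml of an ODG and reports the paragraph alignment values, the number of graphic styles with auto-grow width, and the explicit space runs. The helper names (summarize, summarize_odg) and the SAMPLE document are hypothetical, made up for illustration; I am also assuming that "fit width to text" is stored as draw:auto-grow-width and that the extra spaces show up as text:s elements, which I have not verified against the attached file.

```python
import xml.etree.ElementTree as ET
import zipfile

# Standard ODF namespace URIs.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
    "draw": "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def q(prefix, name):
    """Build a namespace-qualified name in ElementTree's {uri}name form."""
    return "{%s}%s" % (NS[prefix], name)


def summarize(content_xml):
    """Summarize alignment, auto-grow width and space runs in content.xml."""
    root = ET.fromstring(content_xml)
    aligns = set()
    for props in root.iter(q("style", "paragraph-properties")):
        value = props.get(q("fo", "text-align"))
        if value is not None:
            aligns.add(value)
    auto_grow = 0
    for props in root.iter(q("style", "graphic-properties")):
        # Assumption: this attribute backs the "fit width to text" option.
        if props.get(q("draw", "auto-grow-width")) == "true":
            auto_grow += 1
    # text:s stands for a run of spaces; text:c is the count (default 1).
    runs = [int(s.get(q("text", "c"), "1")) for s in root.iter(q("text", "s"))]
    return {
        "text_align": sorted(aligns),
        "auto_grow_width_styles": auto_grow,
        "space_runs": runs,
    }


def summarize_odg(path):
    """Read content.xml out of an ODG (a zip archive) and summarize it."""
    with zipfile.ZipFile(path) as archive:
        return summarize(archive.read("content.xml"))


# Hypothetical miniature document, NOT taken from the attachment.
SAMPLE = """<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
    xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">
  <office:automatic-styles>
    <style:style style:name="gr1" style:family="graphic">
      <style:graphic-properties draw:auto-grow-width="true"/>
    </style:style>
    <style:style style:name="P1" style:family="paragraph">
      <style:paragraph-properties fo:text-align="start"/>
    </style:style>
  </office:automatic-styles>
  <office:body>
    <text:p>Et tortor<text:s text:c="2"/>consequat</text:p>
  </office:body>
</office:document-content>"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To run it on a real file, call summarize_odg() with the path of the saved ODG; a justified import would presumably show "justify" among the text_align values and no multi-space runs.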
Comment 3 Eyal Rozenberg 2022-10-15 21:51:22 UTC
Created attachment 183072 [details]
Result of saving the imported attachment 183069 [details] as ODG

Whoops, attached the wrong ODG file.
Comment 4 V Stuart Foote 2022-10-16 07:10:22 UTC
Spacing of the text runs is something that cannot be efficiently extracted from the PDF. IMHO NAB and => WF

LibreOffice is not a PDF editor. When a user chooses to filter import a source PDF to LO, they *must* understand that the content of the PDF is being extracted and its constituent elements rendered as drawing shapes on a document canvas: Draw by default, or optionally Impress or Writer.

It is time for UX and ESC to flatly state what the project will do regarding PDF source materials, up to and including *removal* of the PDF import filters to eliminate the misguided perception that LibreOffice is a PDF editor.
Comment 5 Heiko Tietze 2022-11-02 14:28:06 UTC
With left-aligned content everything looks fine, but in the case of justified text the filter converts the spacing into three white spaces. Any idea if we can improve this, Miklos?

Actually, the whole justified attribute is ignored. Adding two letters to "tortor" fits the line into the document margin. It looks as if we try to simulate something that Draw cannot handle yet: paragraph styles.

Opening the PDF in Inkscape works for me, the lines are aligned pixel-perfect and the text is spaced properly.

And last but not least, looking through the META ticket brings up the duplicate.

For testing: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Morbi enim nunc faucibus a pellentesque sit amet porttitor. Egestas purus viverra accumsan in nisl nisi. Congue quisque egestas diam in arcu cursus. Velit laoreet id donec ultrices tincidunt arcu non sodales neque. Ante metus dictum at tempor commodo. Et tortor consequat id porta nibh venenatis cras sed. At consectetur lorem donec massa. Id consectetur purus ut faucibus pulvinar elementum integer. Convallis posuere morbi leo urna molestie at. Enim ut sem viverra aliquet. Sagittis aliquam malesuada bibendum arcu vitae elementum curabitur vitae nunc. In aliquam sem fringilla ut morbi. Orci sagittis eu volutpat odio. In massa tempor nec feugiat nisl pretium fusce id.

*** This bug has been marked as a duplicate of bug 49705 ***
Comment 6 Miklos Vajna 2022-11-03 12:41:24 UTC
> Any idea if we can improve this, Miklos?

Sorry, not really; poppler is not an area I'm familiar with.