Bug 101611 - pdfimport filter does not honor page cropping (masking) as set in a PDF document, resulting pages in LO document are oversize (comment 4)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: (earliest affected) 5.1.4.2 release
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:pdf
Depends on:
Blocks: PDF-Import-Draw
Reported: 2016-08-19 11:56 UTC by E.Mi
Modified: 2023-03-19 03:25 UTC

See Also:
Crash report or crash signature:


Attachments
pdf (1.86 MB, application/pdf)
2016-08-19 11:56 UTC, E.Mi
Side by side (347.91 KB, image/jpeg)
2016-08-19 11:57 UTC, E.Mi

Description E.Mi 2016-08-19 11:56:30 UTC
Created attachment 126906
pdf
Comment 1 E.Mi 2016-08-19 11:57:40 UTC
Created attachment 126907
Side by side
Comment 2 V Stuart Foote 2016-08-21 02:59:38 UTC
This is clearly a duplicate of bug 101220 -- in this case the embedded TimesNewRomanPS font family is not being extracted and used in the pdfimport filter.

The fallback font selected on Linux does not match the font metrics (Windows does a bit better). While that fallback could potentially be improved, the correct way to resolve this is to extract the embedded font(s) and render the document with better fidelity using the source fonts.

@ekari, please stop making these duplicate bug submissions for PDFs exhibiting poor fidelity on font substitution; they are all the same issue.

*** This bug has been marked as a duplicate of bug 101220 ***
Comment 3 E.Mi 2016-08-21 07:53:56 UTC
I was referring to the blue flowchart, which is bigger than in the original, and to the red flowchart, which shows a strange image instead of a bell.
Comment 4 V Stuart Foote 2016-08-21 12:23:11 UTC
The bubble object is oversize because on import the page is not being cropped and formatted as specified in the PDF.

The sample document's base page is 9.92" x 6.99". Crop values of 0.583" top and bottom, 0.833" left, and 0.819" right are applied to the document, specifying an intended size of 8.26" x 5.82".

The LibreOffice pdfimport filter mishandles that. It only sees the 9.92" x 6.99" base page size, and then applies margins of 0.20" left & right and 0.39" top & bottom. The crop margins that should mask/resize the page are not handled and are lost.
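
The crop arithmetic above can be checked with a minimal sketch. This is only an illustration of the numbers quoted in this comment, not code from the pdfimport filter; the names `Crop` and `cropped_size` are hypothetical, and all values are in inches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crop:
    """Crop margins in inches (hypothetical helper, not a LibreOffice type)."""
    top: float
    bottom: float
    left: float
    right: float


def cropped_size(width: float, height: float, crop: Crop) -> tuple[float, float]:
    """Return (width, height) of the page after the crop margins are removed."""
    return (width - crop.left - crop.right,
            height - crop.top - crop.bottom)


# Values quoted in this comment for the sample PDF.
base_w, base_h = 9.92, 6.99
crop = Crop(top=0.583, bottom=0.583, left=0.833, right=0.819)

w, h = cropped_size(base_w, base_h, crop)
print(f"intended page size: {w:.3f} x {h:.3f} in")
```

The result, roughly 8.268 x 5.824 inches, matches the intended 8.26" x 5.82" size once truncated to two decimals, whereas the imported page keeps the full 9.92" x 6.99".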

This is similar to bug 86211, which covers the general case that clipping is not implemented. This case is more specific: the import filter does not recognize the page cropping that should be applied. As a result, the page in the LibreOffice document is larger than the size described in the PDF, and additional margin space is then added on top of that.

Testing a recent master build on Windows 10 Pro 64-bit en-US, which includes the upgrade to current poppler (ver 0.46) done for bug 101460, shows that the upgrade does not affect this aspect of the import filter.

Version: 5.3.0.0.alpha0+ (x64)
Build ID: 932804559e845fb8ec6ac3a3b49308136a7e81e6
CPU Threads: 8; OS Version: Windows 6.19; UI Render: GL; 
TinderBox: Win-x86_64@62-TDF, Branch:MASTER, Time: 2016-08-20_21:42:18
Locale: en-US (en_US); Calc: CL

Restating the issue and setting status to NEW.

Separately, the bell glyph (a PUA U+F041 symbol) comes from the subset-embedded MSOutlook symbol font; that is covered by bug 101220, as are the other font layout issues.
Comment 5 V Stuart Foote 2016-08-21 12:52:30 UTC
@ekari, thanks for reporting this valid issue. But please do not attach complete documents as examples. It does not help the QA process. It is much better to extract a page or two, and provide *annotated* screen clips--especially if unable to extract pages from the example PDF.

And, *please* be mindful of copyright. The commercial documents you submit are clearly not covered by a Creative Commons license. Extracting a page or two meets "fair use" tests for copyright; attaching the whole document is questionable.
Comment 6 E.Mi 2016-08-21 19:14:47 UTC
@V Stuart Foote What software do you recommend for extracting a page without altering the contents? I tried GIMP and it removed the embedded fonts and other stuff.
Comment 7 V Stuart Foote 2016-08-21 20:55:17 UTC
I hold licenses for and use both iceni Infix PDF editor (v 6.50) and Adobe Acrobat (v 9.5.5). Both companies are moving to subscription-based licensing for more current releases. Either of them makes unadulterated page extractions of content, though each tweaks some of the metadata. There are other choices as well.

On Linux, a number of products will allow you to "extract" pages without structural changes to the content, but again with some metadata tweaks.

Master PDF Editor
https://code-industry.net/masterpdfeditor/

PDF Studio
http://www.qoppa.com/pdfstudio/
Comment 8 QA Administrators 2017-09-01 11:21:04 UTC Comment hidden (obsolete)
Comment 9 V Stuart Foote 2019-03-18 16:35:25 UTC
The issue of the pdfimport filter not correctly cropping PDF pages remains in a recent master/6.3.0alpha0+ build.

Version: 6.3.0.0.alpha0+
Build ID: 5fe551931d49a64ca4ea793a5016c098e41e84cd
CPU threads: 8; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: CL
Comment 10 QA Administrators 2021-03-18 04:16:53 UTC Comment hidden (obsolete)
Comment 11 QA Administrators 2023-03-19 03:25:29 UTC
Dear E.Mi,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug