Bug 106581 - Pdfium based insertion of a PDF as image results in unwanted solid background
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
Duplicates: 108726
Depends on:
Blocks: PDF-Import-Draw PDF-Insert
Reported: 2017-03-16 19:44 UTC by sergio.callegari
Modified: 2020-10-28 23:17 UTC
CC List: 6 users

See Also:
Crash report or crash signature:

drawing illustrating the issue (130.65 KB, application/vnd.oasis.opendocument.graphics)
2017-03-16 19:44 UTC, sergio.callegari
Dummy text pdf for testing (14.83 KB, application/pdf)
2017-03-23 11:54 UTC, Buovjaga
Demo of inserting an image with no background when the page has a background color (250.32 KB, image/png)
2020-06-07 08:17 UTC, sergio.callegari

Description sergio.callegari 2017-03-16 19:44:05 UTC
Created attachment 131946
drawing illustrating the issue

LibreOffice 5.3 makes it possible to insert a PDF as an image in a document.
Unfortunately, the result is very inconsistent with the original PDF, showing the same problems that arise when opening (rather than inserting) the PDF with LibO Draw.

This is somewhat to be expected given that the "filter" appears to be the same, according to http://vmiklos.hu/blog/lo-insert-pdf-image.html, which states: "As you can see, the PDF import-as-graphic filter isn’t too complicated, it completely reuses the existing "import PDF into Draw" filter, it simply copies the first page of the resulting document model as a metafile" (regardless of the comments in bug 104648, which state that the two are unrelated: "Not at all. They use two completely different import filters meeting different requirements.").

IMHO this is a missed opportunity. It would have been much more useful to rely on a PDF rendering tool (such as poppler) when "inserting", and to do the conversion to a native metafile only when "opening" or "inserting + breaking". This would have provided 100% fidelity in the "insert" case.

The attachment contains a demo of the poor insertion. Page 1 is the PDF insertion. Page 2 is the same PDF converted to SVG with Inkscape and inserted as SVG.
Comment 1 Buovjaga 2017-03-23 11:52:54 UTC
What about this, then: http://vmiklos.hu/blog/pdfium.html
Did you check with 5.4? If there is still an issue, please attach the PDF.
Comment 2 Buovjaga 2017-03-23 11:54:28 UTC
Created attachment 132097
Dummy text pdf for testing

Well I do repro.

Win 7 Pro 64-bit Version:
Build ID: 1670cc25bc2771e87f7956a4b0dd634abaa4128b
CPU threads: 4; OS: Windows 6.1; UI render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-03-22_23:28:42
Locale: fi-FI (fi_FI); Calc: CL
Comment 3 Buovjaga 2017-03-23 12:46:03 UTC
Note: this is a limitation of pdfium: it currently produces a bitmap result.
Comment 4 Miklos Vajna 2017-03-23 12:48:01 UTC
I imagine the request here is to have correct output (what pdfium gives on 5.4) and vector-based output (what importing into Draw gives on 5.3) at the same time. Probably what can give us that is skia, but that'll take some time: the skia output in pdfium is currently experimental, and so far there is no skia integration in LO either.

But as an enhancement idea, makes sense. :-)
Comment 5 sergio.callegari 2017-03-23 13:33:36 UTC
Yes, the idea I'm proposing is to have

1) pdfium behaviour when you insert a pdf as an image: perfect rendering as a bitmap on screen and as the original pdf when exporting to pdf or printing.

2) current behaviour when you open a pdf in draw.

3) current behavior when you insert a pdf as an image, but then you ask LibO to "break" it into its components.

In other words, the proposal is "perfect rendering" when there is no need to edit and acceptance of some rendering compromise when editing is needed.
Comment 6 sergio.callegari 2017-03-23 13:36:15 UTC
If this is the 5.4 behavior (haven't tested it yet), I think it is a great step forward. And if, further on, skia can help with getting an editable behaviour with far fewer artifacts, even better. Thanks for the good work!
Comment 7 Buovjaga 2017-03-23 14:37:37 UTC
I only now checked on 5.3 and saw that the "poor quality" is not about rendering quality, but the lines not being justified.
Comment 8 sergio.callegari 2017-03-23 15:54:55 UTC
Not only "not justified". The character positioning can vary widely (and probably the font too). This means that a technical drawing, or a graph, where the relative positioning of the text and the other elements is fundamental, can get completely messed up. In some cases, where you need to import a large PDF image, you can get to the point where part of the text goes off the page.
Comment 9 sergio.callegari 2017-08-25 13:32:43 UTC
The 5.4 behavior is "almost" there.

1) The pdf "import" via pdfium looks fine, but has a very big problem in that it creates a white opaque background even for pdf images that have no background. This means that 5.4 cannot be used to import pdf images when they need to appear over a colored background or over another image or pattern.

2) When importing in 5.4 the break option is gone. I think that it should be restored keeping the old 5.3 codepath for it.
Comment 10 sergio.callegari 2020-05-31 10:26:30 UTC
It is now evident that these are really two independent issues.

- The unwanted solid background is still there as of LibO 7.0 beta 1.

- The ability to break the image to make it editable was restored in LibO 6, but is broken again in 7 beta 1. This now has its own bug 133547.
Comment 11 sergio.callegari 2020-06-07 08:17:49 UTC
Created attachment 161711
Demo of inserting an image with no background when the page has a background color

See the "white box" effect when inserting an image with no background
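
As a rough illustration of the "white box" effect (a minimal sketch, not LibreOffice or pdfium code): with standard source-over blending, an empty area of the rendered PDF that was pre-filled with opaque white hides whatever lies underneath, while the same area with alpha 0 lets the page background show through. The function name, the pixel values and the background colour below are assumptions chosen for the example.

```python
def composite_over(src_rgba, dst_rgb):
    """Source-over blend of one straight-alpha RGBA pixel onto an opaque RGB pixel."""
    r, g, b, a = src_rgba
    alpha = a / 255.0
    return tuple(
        round(s * alpha + d * (1.0 - alpha))
        for s, d in zip((r, g, b), dst_rgb)
    )

# Hypothetical slide/page background colour (dark blue).
PAGE_BACKGROUND = (0, 51, 102)

# The same empty area of the PDF, rasterised in two different ways.
OPAQUE_WHITE = (255, 255, 255, 255)  # bitmap pre-filled with solid white
TRANSPARENT = (255, 255, 255, 0)     # bitmap pre-filled with alpha = 0

white_box = composite_over(OPAQUE_WHITE, PAGE_BACKGROUND)
see_through = composite_over(TRANSPARENT, PAGE_BACKGROUND)

print("opaque white fill ->", white_box)    # background is hidden
print("transparent fill  ->", see_through)  # background shows through
```

Only the empty areas differ between the two cases: fully opaque drawing content (for example a black glyph pixel) covers the background either way.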
Comment 12 V Stuart Foote 2020-09-08 16:50:35 UTC
Luboš, Miklos, *

A bit fuzzy as to the lash-ups, but regarding comment 4, are we getting to a point where we can start thinking about PDFium on a Skia backend for vector content?

Otherwise, could we start handling the PDFium raster output at higher resolution (bug 115811) and with functional transparency--does Skia get us there sooner?
Comment 13 Miklos Vajna 2020-09-09 09:16:56 UTC
See <https://bugs.documentfoundation.org/show_bug.cgi?id=115811#c14>, just rendering at higher resolution is not a great idea.

For vector output, there are two problems:

1) Skia is the runtime-default on Windows, it's disabled on Linux/macOS by default; so in case pdfium wants to do skia API calls unconditionally, we're not there yet.

2) pdfium itself defaults to agg to produce pixel output, its skia backend is still experimental, see <https://pdfium.googlesource.com/pdfium/+/refs/heads/master/README.md#selecting-build-configuration>.

Both are possible to solve long-term, but don't expect an instant fix.
Comment 14 sergio.callegari 2020-09-09 09:32:56 UTC
Is making a bitmap with an alpha channel possible with the current infrastructure?
If so, that would be important in the short term, otherwise it is almost impossible to use PDF images in presentations (that often have a colored background).
Comment 15 Miklos Vajna 2020-09-09 11:23:23 UTC
I don't think it's impossible with the bitmap-based approach we currently have. Likely it's a matter of debugging what the problem is and fixing it.
Comment 16 V Stuart Foote 2020-10-28 23:17:19 UTC
For bug 133547, seems the break is functioning correctly now for an inserted PDF image.  Specific resulting LO Draw objects get scattered over canvas (anchoring issues by object type), but the core 'break' seems functional.

So, the issue here remains the lack of an alpha channel in the inserted image, rendered to canvas as BMP raster meta, and the fixed default of 96 ppi for that raster.

It would also be nice if the pdfium-based insertion optionally supported vector output directly, and if 'break' to Draw objects could gain improved fidelity to the original layout.
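
For a sense of scale regarding the fixed 96 ppi, here is a small sketch of the arithmetic. It assumes a roughly A4 page (595 x 842 pt), 4 bytes per RGBA pixel and rounding up to whole pixels; the helper names are made up for this example and are not LibreOffice or pdfium API. PDF user space uses 72 points per inch, so the pixel size is points * ppi / 72.

```python
POINTS_PER_INCH = 72  # PDF user space: 1 point = 1/72 inch

def raster_size(width_pt, height_pt, ppi):
    """Pixel dimensions of a page rendered at `ppi`, rounded up (integer math)."""
    def to_px(points):
        return (points * ppi + POINTS_PER_INCH - 1) // POINTS_PER_INCH
    return to_px(width_pt), to_px(height_pt)

def rgba_bytes(width_px, height_px):
    """Uncompressed size of a 32-bit RGBA raster."""
    return width_px * height_px * 4

A4_PT = (595, 842)  # assumed page size in points, approximately A4

for ppi in (96, 300):
    w, h = raster_size(*A4_PT, ppi)
    print(f"{ppi} ppi: {w} x {h} px, {rgba_bytes(w, h)} bytes uncompressed")
```

Under these assumptions the 96 ppi raster is 794 x 1123 px (about 3.6 MB uncompressed), while 300 ppi gives 2480 x 3509 px (about 34.8 MB).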
Comment 17 V Stuart Foote 2020-10-28 23:17:28 UTC
*** Bug 108726 has been marked as a duplicate of this bug. ***