Bug 154255 - Support embedded PDF viewing and zooming as vectorial instead of rasterized image
Summary: Support embedded PDF viewing and zooming as vectorial instead of rasterized i...
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 7.5.0.3 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: needsDevAdvice
Depends on:
Blocks: PDF-Insert
Reported: 2023-03-18 11:17 UTC by medmedin2014
Modified: 2023-06-13 17:49 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Description medmedin2014 2023-03-18 11:17:03 UTC
When we import a technical PDF drawing that contains small details, the embedded PDF is replaced by a rasterized image inside Writer. The problem is that we don't want to have to export to PDF just to view those small details; we would like to view them inside Writer. We know that the PDFIMPORT_RESOLUTION_DPI environment variable can be set to a higher value to obtain a higher-resolution rasterized substitute, but it is not practical at all to tell every user to set a specific DPI on each machine, either at launch time or in the LO options dialog.

It would be awesome to be able to view embedded PDFs as vector images, as we already can with the SVG format, because the PDF format keeps the original embedded fonts and colors and produces high-quality output without any distortion.
Comment 1 V Stuart Foote 2023-03-19 17:41:29 UTC
Of course this is not opened via import, but rather inserted via the pdfium filter. The filter converts to a bitmap raster image, not to vector.

PDFIMPORT_RESOLUTION_DPI is an OS/DE environment variable that can be set for each workstation, not necessarily per user, e.g. on managed Windows systems via an SCCM registry entry, or distributed as a regedit script.

So facilitating zoom to view finer details can already be done via the environment variable, preset system-wide or at launch.
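
As a rough illustration of the "at launch" option, the sketch below sets PDFIMPORT_RESOLUTION_DPI for a single LibreOffice process from Python. It assumes that `soffice` is on the PATH and that the variable accepts a plain number; the helper names are hypothetical and not part of LibreOffice.

```python
import os
import subprocess


def env_with_pdf_dpi(dpi, base_env=None):
    """Return a copy of the environment with PDFIMPORT_RESOLUTION_DPI set."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the variable is read as a plain number of dots per inch.
    env["PDFIMPORT_RESOLUTION_DPI"] = str(int(dpi))
    return env


def launch_writer(dpi, executable="soffice"):
    """Start Writer with the variable set for this process only."""
    # Assumption: "soffice" resolves on PATH; adjust for the local install.
    return subprocess.Popen([executable, "--writer"], env=env_with_pdf_dpi(dpi))


if __name__ == "__main__":
    # Only show the value that would be passed; nothing is launched here.
    print(env_with_pdf_dpi(300, {})["PDFIMPORT_RESOLUTION_DPI"])
```

Calling `launch_writer(300)` would affect only the process it starts, which is why a system-wide preset is the more practical route on managed machines.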

We have bug 115811 open to allow setting the resolution dynamically from the UI, possibly even, as c#6 there puts it, "one idea is to make this dynamic. That is, re-render the page in higher DPI as the user zooms in".
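
The "re-render as the user zooms in" idea can be sketched as a simple mapping from zoom level to render resolution. This is only an illustration of the concept, not how LibreOffice or pdfium behaves: the 96 DPI base, the 600 DPI cap and both function names are assumptions.

```python
def dpi_for_zoom(zoom_percent, base_dpi=96.0, max_dpi=600.0):
    """Resolution needed so the raster stays sharp at the given zoom level."""
    if zoom_percent <= 0:
        raise ValueError("zoom_percent must be positive")
    # Scale the base resolution linearly with zoom, capped to bound memory use.
    return min(max_dpi, base_dpi * zoom_percent / 100.0)


def needs_rerender(current_dpi, zoom_percent):
    """True when the cached raster is coarser than the zoom level calls for."""
    return dpi_for_zoom(zoom_percent) > current_dpi


if __name__ == "__main__":
    for zoom in (100, 200, 400, 1000):
        print(zoom, dpi_for_zoom(zoom), needs_rerender(96.0, zoom))
```

The cap reflects the trade-off discussed in this report: higher resolution means a sharper view but a larger bitmap to hold in memory.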

So it is not at all clear that rendering the PDF to the VCL document canvas as vector would ever make sense for Writer. It already is vector on the Draw canvas when you 'break' the inserted PDF into disassembled draw objects, but that is too chaotic to work with efficiently (regrouping into meaningful text and layout) in a text document.

Not saying it can't be done (i.e. working directly to a Skia vector canvas), but if one really needs the PDF as vector on the Writer canvas, then converting individual PDF pages to SVG *external* to LibreOffice and then inserting them as SVG would be the more supportable workflow.
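
A minimal sketch of that external conversion step follows. The comment does not name a tool; poppler's `pdftocairo` is used here purely as an assumption, with its `-svg`, `-f` and `-l` options selecting a single page. The function names and file names are hypothetical.

```python
import subprocess


def pdf_page_to_svg_cmd(pdf_path, page, svg_path):
    """Build a pdftocairo command that writes one PDF page as SVG."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # -f and -l set the first and last page; both equal means one page only.
    return ["pdftocairo", "-svg", "-f", str(page), "-l", str(page),
            pdf_path, svg_path]


def convert_page(pdf_path, page, svg_path):
    """Run the conversion; requires pdftocairo to be installed."""
    subprocess.run(pdf_page_to_svg_cmd(pdf_path, page, svg_path), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so no tool is required here.
    print(" ".join(pdf_page_to_svg_cmd("drawing.pdf", 3, "drawing-p3.svg")))
```

The resulting SVG file would then be inserted into Writer like any other SVG image.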

The project is obliged to handle SVG inserts correctly (SVG is part of ODF 1.3). There is no such standard for handling PDF source images, just what the pdfium project libs provide, which at the moment for LibreOffice is rendering into bitmap.
Comment 2 Miklos Vajna 2023-03-20 07:37:31 UTC
> Not saying it can't be done

My guess would be that similar to how we rasterize EMF/WMF only at the last minute, we could do the same with PDF as well. It's just not yet done.
Comment 3 Heiko Tietze 2023-03-20 08:00:44 UTC
I agree with Stuart that splitting the PDF into drawing objects is not efficient on Writer (though working well on Draw; and you can insert the PDF via Draw object).

And I struggle a bit with the use case. Your Writer document is WYSIWYG, you shouldn't expect different zoom level for details. So the question boils down whether we can dynamically adjust the DPI on loading. And Miklos replied with a clear "maybe".
Comment 4 medmedin2014 2023-03-20 10:42:02 UTC
(In reply to Heiko Tietze from comment #3)
> I agree with Stuart that splitting the PDF into drawing objects is not
> efficient on Writer (though working well on Draw; and you can insert the PDF
> via Draw object).
> 
> And I struggle a bit with the use case. Your Writer document is WYSIWYG, you
> shouldn't expect different zoom level for details. So the question boils
> down whether we can dynamically adjust the DPI on loading. And Miklos
> replied with a clear "maybe".

A PDF is simply a vector drawing and is totally different from a raster image. It embeds information that should be readable at any zoom level, just as we can zoom SVG or text. Draw is far too buggy and never succeeds in opening a PDF without distortion; it's not yet up to the task.

I'm not talking about splitting or converting the whole PDF to vectors and editing them. I simply asked whether it is possible to view the content of embedded PDFs in high quality while zooming, the way we can view SVG content. Is it not possible to use a lightweight PDF viewer like PDF.js to render the high-quality view while zooming inside Writer?
Comment 5 Ole Tange 2023-06-13 17:02:33 UTC
(In reply to Heiko Tietze from comment #3)

> And I struggle a bit with the use case. Your Writer document is WYSIWYG, you
> shouldn't expect different zoom level for details.

Every lawyer has a use case, and probably many more people.

We need to import hundreds of pages of PDF files into a single file.

All that will be added is a page number for the total document, and some cross reference links into each page, so you can easily go from an index to a given page in the PDF.

The document becomes unwieldy if each PDF file is converted to PNG at a resolution that is readable (i.e. > 200 DPI). Try that with 500 pages, which is my latest task.
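
To put a rough number on that, the sketch below estimates the raw pixel data for rasterized pages. It assumes A4 pages (210 x 297 mm) and 24-bit RGB, neither of which is stated in the comment, and it ignores PNG compression, so it is an upper-bound style estimate rather than a file size.

```python
MM_PER_INCH = 25.4


def page_pixels(width_mm, height_mm, dpi):
    """Pixel dimensions of a page rendered at the given DPI."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def uncompressed_bytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Raw bitmap size in bytes (3 bytes per pixel = 24-bit RGB, assumed)."""
    width_px, height_px = page_pixels(width_mm, height_mm, dpi)
    return width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    per_page = uncompressed_bytes(210, 297, 200)  # one A4 page at 200 DPI
    total = per_page * 500                        # the 500-page case
    print(page_pixels(210, 297, 200))
    print(f"{per_page / 1e6:.1f} MB per page, {total / 1e9:.1f} GB for 500 pages")
```

Under these assumptions an A4 page at 200 DPI is 1654 x 2339 pixels, about 11.6 MB of raw pixels, or roughly 5.8 GB for 500 pages before compression. PNG compression reduces the stored size by an amount that depends on the page content.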

I would love if LibreOffice could be an alternative to PDFarranger https://github.com/pdfarranger/pdfarranger
So I could merge PDF-files without loss of quality and add small details like a total page number.

It is perfectly fine if I cannot "enter into" a PDF page, so from a user perspective the pages could be treated similar to a write protected SVG image.

But I would really love if I could re-arrange pages like I can with PDFarranger.
Comment 6 V Stuart Foote 2023-06-13 17:49:32 UTC
(In reply to Ole Tange from comment #5)

> 
> But I would really love if I could re-arrange pages like I can with
> PDFarranger.

LibreOffice is *NOT* a PDF editor, and it is only a viewer in the sense that we parse content from the PDF: either into the pdfium-based rasters (no Skia vector canvas yet), or through the clunky conversion of content to ODF draw objects.

The original PDF is retained, but the rendering to the LO document canvas is not bulk PDF internals. Currently the pdfium-based parser works one page at a time, so you have to split the pages out external to LibreOffice, e.g. with PDFtk, and add them in the sequence you need. That is scriptable, but bug 114234 is open to improve multi-page PDF handling.
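
As one possible way to script the splitting step, the sketch below builds a PDFtk `burst` command, which writes each page of a PDF to its own file, and lists the resulting files in a chosen order. The file-name pattern and the helper names are assumptions for illustration; inserting the pages into Writer is not covered.

```python
import subprocess

PAGE_PATTERN = "page_%04d.pdf"  # hypothetical output naming, printf-style


def burst_cmd(pdf_path, pattern=PAGE_PATTERN):
    """Build the PDFtk command that splits a PDF into single-page files."""
    return ["pdftk", pdf_path, "burst", "output", pattern]


def pages_in_order(order, pattern=PAGE_PATTERN):
    """File names of the burst pages, in the sequence they should be added."""
    return [pattern % page for page in order]


def split_pdf(pdf_path):
    """Run the split; requires pdftk to be installed."""
    subprocess.run(burst_cmd(pdf_path), check=True)


if __name__ == "__main__":
    # Print rather than execute, so the example runs without pdftk.
    print(" ".join(burst_cmd("bundle.pdf")))
    print(pages_in_order([3, 1, 2]))
```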