Bug 150777 - PDF export converts 1bpp images to 8bpp RGB
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version (earliest affected): 7.5.0.0 alpha0+
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: PDF-Export
Reported: 2022-09-04 15:00 UTC by Jussi Pakkanen
Modified: 2022-10-17 15:08 UTC

See Also:
Crash report or crash signature:


Attachments
A Writer document with an embedded 1 bit image. (12.91 KB, application/vnd.oasis.opendocument.text)
2022-09-04 15:00 UTC, Jussi Pakkanen
A 1 bit monochrome PNG. (1.79 KB, image/png)
2022-09-04 15:01 UTC, Jussi Pakkanen
The exported PDF which has an rgb image instead of monochrome (13.17 KB, application/pdf)
2022-09-04 15:02 UTC, Jussi Pakkanen

Description Jussi Pakkanen 2022-09-04 15:00:49 UTC
Created attachment 182200
A Writer document with an embedded 1 bit image.

Steps to reproduce:

- create a 1 bit bw PNG image
- create a Writer document and embed the image
- export to PDF
- inspect the generated PDF

The image should be written as a 1 bit BW image, but LibreOffice converts it to 8 bit RGB. This is what gets written in the PDF file:

<</Type/XObject/Subtype/Image/Width 640/Height 480/BitsPerComponent 8/Length 5 0 R
/Filter/FlateDecode/ColorSpace/DeviceRGB

The output is an image with 8 bits per component in the RGB color space (24 bits per pixel) when it should be a 1 bit per pixel image in the grayscale color space.
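
To put the difference in numbers, here is a rough sketch of the uncompressed sample data size for both encodings of the 640x480 image from the dictionary above. The function name sampleDataBytes is made up for this illustration and is not LibreOffice code. It assumes the usual PDF layout where every image row is padded to a whole number of bytes. The real stream sizes will be smaller, since the data is compressed before it is written.

```cpp
#include <cstdint>

// Hypothetical helper: uncompressed size in bytes of PDF image sample data.
// Assumes each row is padded up to a whole number of bytes.
constexpr std::uint64_t sampleDataBytes(std::uint64_t width, std::uint64_t height,
                                        unsigned components, unsigned bitsPerComponent)
{
    // Bits in one row, rounded up to a byte boundary, times the number of rows.
    return ((width * components * bitsPerComponent + 7) / 8) * height;
}

// What gets written now: DeviceRGB, 3 components, 8 bits per component.
static_assert(sampleDataBytes(640, 480, 3, 8) == 921600, "8 bit RGB size");

// What is expected: DeviceGray, 1 component, 1 bit per component.
static_assert(sampleDataBytes(640, 480, 1, 1) == 38400, "1 bit gray size");
```

Before compression the RGB form is 24 times larger than the 1 bit form.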

This only affects monochrome images. Grayscale images seem to be exported properly in grayscale.

Looking at the code in pdfwriter_impl.cxx there is code to properly handle 1 bit images, which should get written out with CCITT compression.

On line 8859 the code detects the bit depth with this:

nBitsPerComponent = vcl::pixelFormatBitCount(ePixelFormat);

so it seems that this function returns 8 instead of 1 for this monochrome image at least.

Files showing the issue have been attached.
Comment 1 Jussi Pakkanen 2022-09-04 15:01:45 UTC
Created attachment 182201
A 1 bit monochrome PNG.
Comment 2 Jussi Pakkanen 2022-09-04 15:02:30 UTC
Created attachment 182202
The exported PDF which has an rgb image instead of monochrome
Comment 3 V Stuart Foote 2022-09-04 18:54:30 UTC
Confirmed on Windows

Version: 7.5.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: dc92a4d973086ce8a6a5f75ba0f4d4c9ca05537a
CPU threads: 8; OS: Windows 10.0 Build 19044; UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded
Comment 4 Paris Oplopoios 2022-09-05 08:09:21 UTC
I will look into this issue later today
Comment 5 Tomaz Vajngerl 2022-09-05 09:13:03 UTC
(In reply to Paris Oplopoios from comment #4)
> I will look into this issue later today

No need - it's not related to PNG export anyway.
Comment 6 Tomaz Vajngerl 2022-09-05 09:36:09 UTC
This issue is "by design" - at least partially. I'm not sure it can be addressed, in the short term at least.

The new libpng based PNG reader doesn't read into 1-bit and other palette based Bitmap formats anymore. This is generally to minimize the number of esoteric Bitmap formats we need to support and to improve the performance, as the back-ends don't support palette surfaces in general anymore.

So what happens in this case is that we read the 1-bit PNG into a Bitmap, but as we don't want a 1-bit Bitmap, we just enable the switches in libpng to convert from 1 to 8 bit and from palette to RGB. The result of that is that we read in an 8-bit RGB Bitmap. Then at PDF export we can't just reuse and store the PNG, which is still associated (I think), because PDF uses its own lossless format, so we instead look at and write the content of the Bitmap - which will be RGB 8-bit.

What we can do here is to check the original PNG file or store some "origin" information into the Bitmap class, and then convert the Bitmap to a palette based 1-bit Bitmap (which shouldn't cause any loss of information).
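
A minimal sketch of what such a lossless conversion could look like, written against plain buffers rather than the VCL Bitmap class. The types Rgb and OneBitImage and the function toOneBit are hypothetical names for this illustration only. The conversion succeeds only when the image has at most two distinct colors, so no information is lost. The bit order (most significant bit first) and the padding of rows to whole bytes are assumptions of this sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for this sketch, not the VCL Bitmap API.
struct Rgb { std::uint8_t r, g, b; };

inline bool sameColor(const Rgb& a, const Rgb& b)
{
    return a.r == b.r && a.g == b.g && a.b == b.b;
}

struct OneBitImage
{
    std::size_t width = 0;
    std::size_t height = 0;
    std::array<Rgb, 2> palette{};   // up to two colors, in order of first use
    std::size_t paletteSize = 0;
    std::vector<std::uint8_t> rows; // packed MSB first, rows padded to bytes
};

// Returns true and fills 'out' if the image uses at most two colors.
// Returns false (conversion would be lossy) as soon as a third color appears.
bool toOneBit(const std::vector<Rgb>& pixels, std::size_t width,
              std::size_t height, OneBitImage& out)
{
    if (pixels.size() != width * height)
        return false;

    out = OneBitImage{};
    out.width = width;
    out.height = height;
    const std::size_t rowBytes = (width + 7) / 8;
    out.rows.assign(rowBytes * height, 0);

    for (std::size_t y = 0; y < height; ++y)
    {
        for (std::size_t x = 0; x < width; ++x)
        {
            const Rgb& p = pixels[y * width + x];

            // Look the color up in the palette, adding it if there is room.
            std::size_t index = 0;
            while (index < out.paletteSize && !sameColor(out.palette[index], p))
                ++index;
            if (index == out.paletteSize)
            {
                if (out.paletteSize == 2)
                    return false; // third distinct color
                out.palette[out.paletteSize++] = p;
            }

            // Palette index 1 sets the bit, index 0 leaves it cleared.
            if (index == 1)
                out.rows[y * rowBytes + x / 8] |=
                    static_cast<std::uint8_t>(0x80u >> (x % 8));
        }
    }
    return true;
}
```

The palette here is built in order of first appearance, so which color maps to bit 0 depends on the image content. A real implementation would likely want a fixed convention.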
Comment 7 V Stuart Foote 2022-09-05 12:31:13 UTC
(In reply to Tomaz Vajngerl from comment #6)
> What we can do here is to check the original PNG file or store some "origin"
> information into the Bitmap class, and then convert the Bitmap to a palette
> based 1-bit Bitmap (which shouldn't cause any loss of information).

We do keep the original 1bpp image, placing it in the ODF archive's Pictures directory.

For now though, the resulting PDF is simply larger because of embedding an 8bpp bitmap into a stream. But do we need to generate PDF with 1bpp images?

Other than size of the PDF what is the justification? I guess for "fidelity" to the original--but then PDF is a presentation format not an exchange format. The 8bpp grayscale will look better than 1bpp. 

And in this sample we're working with the full image on document canvas--what happens when the original image has been masked, clipped or otherwise edited (i.e. image filtered). An 8bpp or 24bpp image is going to be required then, right?

So unless this is really easy, then reasonable to => WF
Comment 8 Jussi Pakkanen 2022-09-05 13:14:21 UTC
> I guess for "fidelity" to the original--but then PDF is a presentation format not an exchange format. The 8bpp grayscale will look better than 1bpp.

This is actually quite important. When sending things to print it is important to preserve 1 bit images as such. They are printed "directly" whereas grayscale and RGB images are interpolated causing the edges between black and white to become blurry.

At the very least LO should convert the 1 bit image to grayscale, not RGB.

I agree that in general indexed images are not worth supporting. However, 1 bit images are an exception here, because they are still in widespread use today.
Comment 9 Jussi Pakkanen 2022-09-05 13:27:43 UTC
A practical example. Suppose you have a document that has a line art illustration with cross hatching and the like. You _really_ want to print that so that all edges remain sharp, otherwise it will look mushy and ugly. The only way to guarantee this is to put a 1-bit image in the PDF. If it is in grayscale or RGB you are at the mercy of the RIP. If it has code that detects RGB and grayscale images that "happen to be" monochrome you might get crisp looking output. But it's a lottery at that point.
Comment 10 Tomaz Vajngerl 2022-09-05 13:45:12 UTC
(In reply to V Stuart Foote from comment #7)
> We do keep the original 1bpp image, placing it in the ODF archive's Pictures
> directory.

Yes, the original is available, but you don't put PNG inside PDF - so reading it again would again produce an 8-bit RGB Bitmap.
 
> For now though, the resulting PDF is simply larger because of embedding an 8bpp
> bitmap into a stream. But do we need to generate PDF with 1bpp images? 

No - I don't think so.

> Other than size of the PDF what is the justification? I guess for "fidelity"
> to the original--but then PDF is a presentation format not an exchange
> format. The 8bpp grayscale will look better than 1bpp. 

It will look exactly the same - no matter if it's 1bpp or 8bpp.
 
> And in this sample we're working with the full image on document
> canvas--what happens when the original image has been masked, clipped or
> otherwise edited (i.e. image filtered). An 8bpp or 24bpp image is going to
> be required then, right?

If the Bitmap is changed in any way (resized for example), it's converted to 24-bit RGB. If it's just metadata (crop for example) then it depends - not sure what happens with PDF. 
 
> So unless this is really easy, then reasonable to => WF

Well, it's an issue, but low priority.
Comment 11 Tomaz Vajngerl 2022-09-05 13:59:42 UTC
(In reply to Jussi Pakkanen from comment #8) 
> At the very least LO should convert the 1 bit image to grayscale, not RGB.

This should be possible by fudging the PNG importer[1] to create such a Bitmap. The problem is 1-bit doesn't necessarily mean black and white... it can be for example red and green, so you need to detect that it is actually black and white (or that both are a gray value).
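
The detection described above could be sketched roughly as follows. PaletteEntry, isGrayPalette and isBlackAndWhite are hypothetical names for this illustration and are not taken from PngImageReader.cxx. The sketch only looks at the two palette entries of a 1-bit image and ignores transparency.

```cpp
#include <cstdint>

// Hypothetical palette entry for this sketch.
struct PaletteEntry { std::uint8_t r, g, b; };

// A color is gray when all three channels are equal.
constexpr bool isGray(PaletteEntry c)
{
    return c.r == c.g && c.g == c.b;
}

// True when both entries of a 1-bit palette are gray values,
// so the image could be stored as grayscale without loss.
constexpr bool isGrayPalette(PaletteEntry a, PaletteEntry b)
{
    return isGray(a) && isGray(b);
}

constexpr bool isBlack(PaletteEntry c) { return c.r == 0 && c.g == 0 && c.b == 0; }
constexpr bool isWhite(PaletteEntry c) { return c.r == 255 && c.g == 255 && c.b == 255; }

// True only for pure black and pure white, in either order.
constexpr bool isBlackAndWhite(PaletteEntry a, PaletteEntry b)
{
    return (isBlack(a) && isWhite(b)) || (isWhite(a) && isBlack(b));
}
```

A red and green 1-bit image fails both checks and would have to stay RGB, while a black and white one passes both.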

[1] vcl/source/filter/png/PngImageReader.cxx