Bug 96892 - Low precision in PDF export of SVG images leads to visible artifacts at high zoom levels
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: 4.0 all versions (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:6.1.0 target:6.3.0 target:6.2.1
Keywords: filter:pdf
Duplicates: 57021 109025 115446
Depends on:
Blocks: PDF-Export
 
Reported: 2016-01-04 19:25 UTC by Martin
Modified: 2022-08-20 16:12 UTC
CC List: 10 users

See Also:
Crash report or crash signature:


Attachments
test file (299.59 KB, application/vnd.oasis.opendocument.text)
2016-01-04 19:25 UTC, Martin
screenshot of PDF at 1600% zoom (476.67 KB, image/png)
2016-01-04 19:31 UTC, Martin
pdf test files without and with patch. (59.83 KB, application/zip)
2016-01-04 20:10 UTC, Julien Nabet
tab-delimited file comparing coordinates without and with patch (5.16 KB, text/plain)
2016-01-05 17:48 UTC, Martin
PDF_VectorQuality_Test.odt (19.27 KB, application/vnd.oasis.opendocument.text)
2017-12-02 13:24 UTC, Kai Struck
Notepad++ edit of uncompressed PDFs, left old precision, right new precision (68.43 KB, image/png)
2018-02-07 18:51 UTC, V Stuart Foote

Description Martin 2016-01-04 19:25:17 UTC
Created attachment 121717
test file

The attached test.odt contains two small images, an SVG on the left and a very similar PNG on the right. It also contains a line of 10pt size text for scale.

Steps to reproduce:

1. Open test.odt in Writer

2. File -> Export as PDF

3. View the resulting PDF at 1600% zoom or higher in LibreOffice Draw, Acrobat Reader or Inkscape

4. Lines in the SVG image (the one on the left) are noticeably wiggly, but should in fact all be perfectly straight


Programs affected: Writer and Draw (haven't tested Calc)

Versions affected: I have noticed similar effects in all previous versions of LibreOffice since vector export of SVG images was first introduced. My current version is as follows:

LibreOffice version: 5.0.3.2
Build ID: 1:5.0.3~rc2-0ubuntu1~precise2


Possible cause: 

When you look at the resulting PDF in Draw or Inkscape, at zooms around 3000% you can clearly see that all control points are placed on a grid. This is not the case in the original SVG image.

When you look at the uncompressed PDF, which can be obtained using either pdfunzip or via

$ pdftk test.pdf output uncomp.pdf uncompress

you will see that the data are all rounded up (truncated?) to one decimal place. I would argue that this is insufficient - e.g. my current version of Inkscape saves three decimal places by default. Apparently so does Adobe Distiller - see the discussion here:

https://bugs.freedesktop.org/show_bug.cgi?id=23364


Relevant code:

I suspect that this could be due to lines 512 and 513 of vcl/source/gdi/pdfwriter_impl.cxx in core:

static const sal_Int32 nLog10Divisor = 1;
static const double fDivisor = 10.0;

The first constant is then used in PDFWriterImpl::PDFPage::appendPixelPoint() as the precision argument when calling appendDouble. 

If this is the cause, I would argue that the constants should be increased to 3 and 1000.0, respectively.
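
The effect of the two constants can be illustrated with a minimal standalone sketch. `toFixed` is a hypothetical helper written for this report, not LibreOffice code, and it assumes plain round-to-nearest; the real export path may truncate instead.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not LibreOffice code): scale a coordinate by
// 10^nLog10Divisor and round it to the nearest integer, i.e. snap it to a
// grid of 1/10, 1/1000, ... of a unit.
std::int64_t toFixed(double fValue, int nLog10Divisor)
{
    assert(nLog10Divisor >= 0);
    const double fDivisor = std::pow(10.0, nLog10Divisor);
    return std::llround(fValue * fDivisor);
}
```

With one decimal place, 64.112 and 64.148 both land on 641 (printed as 64.1), so two distinct control points collapse onto the same grid node; with three places they stay apart as 64112 and 64148. A worst-case error of 0.05 pt at 1600% zoom is magnified to 0.8 pt, roughly one screen pixel at 96 dpi.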
Comment 1 Martin 2016-01-04 19:31:07 UTC
Created attachment 121718
screenshot of PDF at 1600% zoom
Comment 2 Martin 2016-01-04 19:57:42 UTC
I have also tested the file on Windows 8 and LibreOffice 4.2.8.2, the result is the same - at 1600% zoom the lines in the PDF are wiggly rather than straight.
Comment 3 Julien Nabet 2016-01-04 20:10:11 UTC
Created attachment 121719
pdf test files without and with patch.

Thank you Martin for your detailed analysis!

I gave your patch a try, but as you can see in the attachment, the result seems worse :-(

However, would you be interested in contributing to LO development?
You could submit your patch for review and improve it from there.
See https://wiki.documentfoundation.org/Development for more information.
Comment 4 Julien Nabet 2016-01-04 20:10:58 UTC
On pc Debian x86-64 with master sources updated yesterday, I confirm that the lines aren't straight at high zoom (e.g. 1600%).
Comment 5 Martin 2016-01-05 16:19:11 UTC
Thank you for the files, Julien.

Long story short (see the next post for details), your files eventually led me to what I think is a bug in appendFixedInt() in core/vcl/source/gdi/pdfwriter_impl.cxx :


    sal_Int32 nFactor = 1, nDiv = nPrecision;
    while( nDiv-- )
        nFactor *= 10;

    sal_Int32 nInt      = nValue / nFactor;
    rBuffer.append( nInt );
    if( nFactor > 1 )
    {
        sal_Int32 nDecimal  = nValue % nFactor;
        if( nDecimal )
        {
            rBuffer.append( '.' );
            // omit trailing zeros
            while( (nDecimal % 10) == 0 )
                nDecimal /= 10;
            rBuffer.append( nDecimal );
        }
    }


The problem with this code is that e.g. for nValue = 105, nPrecision = 2 it gives nDecimal = 5, so it outputs "1.5" rather than the correct "1.05"!
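
A minimal standalone sketch of the failure and of one possible repair follows. Both functions are hypothetical stand-ins: they use `std::string` rather than the real buffer type, handle only non-negative values, and the padded variant is just one way to keep the leading zeros, not necessarily the fix that was eventually committed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the snippet above (non-negative values only).
std::string formatFixedBuggy(int nValue, int nPrecision)
{
    assert(nValue >= 0 && nPrecision >= 0);
    int nFactor = 1;
    for (int i = 0; i < nPrecision; ++i)
        nFactor *= 10;

    std::string aOut = std::to_string(nValue / nFactor);
    int nDecimal = nValue % nFactor;
    if (nDecimal)
    {
        aOut += '.';
        while (nDecimal % 10 == 0)         // omit trailing zeros
            nDecimal /= 10;
        aOut += std::to_string(nDecimal);  // leading zeros are lost here
    }
    return aOut;
}

// One possible repair: pad the fraction to nPrecision digits before
// stripping trailing zeros.
std::string formatFixedPadded(int nValue, int nPrecision)
{
    assert(nValue >= 0 && nPrecision >= 0);
    int nFactor = 1;
    for (int i = 0; i < nPrecision; ++i)
        nFactor *= 10;

    std::string aOut = std::to_string(nValue / nFactor);
    const int nDecimal = nValue % nFactor;
    if (nDecimal)
    {
        std::string aFrac = std::to_string(nDecimal);
        const std::size_t nPad = static_cast<std::size_t>(nPrecision) - aFrac.size();
        aFrac = std::string(nPad, '0') + aFrac;  // restore leading zeros
        while (aFrac.back() == '0')              // omit trailing zeros
            aFrac.pop_back();
        aOut += '.';
        aOut += aFrac;
    }
    return aOut;
}
```

For `nValue = 105, nPrecision = 2` the first function returns "1.5" and the second "1.05". With `nPrecision = 1` the fraction is a single digit, so no leading zero can be lost and both agree.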

The next function, appendDouble(), does essentially the same thing, but does it correctly (as far as I can see!).

This looks like very old code, the bulk of it is from 2004 with some changes from 2005.

As far as I can see, the buggy appendFixedInt() is then used by PDFWriterImpl::PDFPage::appendPoint(), which in turn is used by just about everything - drawLine, drawRectangle, drawJPGBitmap etc.

I think they hit the bug early on and kept fDivisor at 10.0 ever since, because nPrecision = 1 is the only case where the code works correctly.

More details to follow in the next post.
Comment 6 Martin 2016-01-05 17:48:25 UTC
Created attachment 121733
tab-delimited file comparing coordinates without and with patch
Comment 7 Martin 2016-01-05 17:52:57 UTC
Here's what I did to find the bug, and some things I noticed along the way.

First I took Julien's files and uncompressed the streams using pdftk. As expected, some numbers are now given to three places, e.g. at the very beginning:

stream
0.1 w
q 0 0.1 595.2 841.8 re

became:

stream
0.1 w
q 0 0.138 595.275 841.861 re


But the rest is less encouraging, e.g. the first long data line:

63.8 756.6 64.1 756.1 64.2 755.9 c 64.3 755.7 64.4 755.5 64.5 755.3 c

became:

63.8 756.6 64.5 756.15 64.15 755.95 c 64.3 755.75 64.4 755.55 64.5 755.3 c

It looks like the "with patch" numbers are only rounded to the nearest 1/20. More on that later. You can also see that the third number changed quite dramatically, 64.1 -> 64.5 -- and this is what led me to the bug.

I extracted all the numbers for one path and compared "without patch" and "with patch". The results are attached. I found that |difference| > 0.05 only happened in the following cases:

# without patch         with patch      difference
59.1    59.5    0.40
60.1    60.5    0.40
61.1    61.5    0.40
62.1    62.5    0.40
63.1    63.5    0.40
64.1    64.5    0.40
65.1    65.5    0.40
66.1    66.5    0.40
67.1    67.5    0.40
754     754.5   0.50
755     755.5   0.50
756     756.5   0.50
757     757.5   0.50
758     758.5   0.50
759     759.5   0.50
760     760.5   0.50
761     761.5   0.50
762     762.5   0.50

Once you've seen the bug, the explanation is obvious.
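
To spell the explanation out, here is a small sketch that models how a reader parses the buggy output. `parsedBuggyMilli` is a hypothetical helper working in thousandths of a unit, and the "true" inputs 59.05 and 754.05 are an inference from the table, not values read from the PDFs.

```cpp
#include <cassert>

// Hypothetical model: given a true value in thousandths, return the value
// (also in thousandths) that a PDF reader sees after the buggy formatter
// has dropped the leading zeros of the fraction.
int parsedBuggyMilli(int nValueMilli)
{
    assert(nValueMilli >= 0);
    const int nInt = nValueMilli / 1000;
    int nDecimal = nValueMilli % 1000;
    if (nDecimal == 0)
        return nInt * 1000;
    while (nDecimal % 10 == 0)  // trailing zeros are omitted on output
        nDecimal /= 10;
    while (nDecimal < 100)      // unpadded digits are read as leading digits
        nDecimal *= 10;
    return nInt * 1000 + nDecimal;
}
```

Under that assumption 59.05 is printed as "59.5", which is 0.40 above the old one-decimal value 59.1, while 754.05 is printed as "754.5", which is 0.50 above the old value 754. A fraction without a leading zero, such as 58.65, passes through unchanged.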

0.40 vs 0.50 - the 5x and 6x values are x-coordinates, the 7xx values are y-coordinates. In general the x-coordinates decreased by 0.05 and the y-coordinates increased by 0.05:

58.6    58.6    0.00
58.7    58.65   -0.05
58.9    58.85   -0.05
58.9    58.9    0.00
59.1    59.1    0.00
:
753.4   753.45  0.05
753.5   753.5   0.00
753.7   753.7   0.00
753.9   753.9   0.00
753.9   753.95  0.05
754     754     0.00

I believe that the difference comes from this line in appendPoint():

nValue      = pointToPixel(getHeight()) - aPoint.Y();

or a similar line elsewhere.
Comment 8 Martin 2016-01-05 18:37:18 UTC
Now going back to the original issue - why aren't the "with patch" numbers being printed out to three decimal places?

One possible explanation is that they are, but the trailing zeros are omitted - and that there are always trailing zeros!

There's code for omitting trailing zeros in appendFixedInt() (sign-posted by the comment) and the equivalent in appendDouble() is the "&& nFrac" in the final for-loop.

This would mean that the values sent for output have already been rounded up (or more likely truncated) to the nearest 1/20. This could be due to lcl_convert() used in appendPoint().

I'm currently trying to figure out what's going on with those calls, it looks quite complicated. Any pointers would be appreciated. I can sort of see why the numbers would be rounded to 1/10 (via nDPIX = nDPIY = 720 plus the fact that 1 inch = 72 pt), but can't find the factor of 2.
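
The arithmetic behind the 1/10 pt grid can be written out as a small sketch. The 720 and 72 figures come from the paragraph above. The 1440 units-per-inch case (twips, 1/20 pt) is an assumption added here as one candidate for the missing factor of 2; nothing in this report confirms that it is the actual cause.

```cpp
#include <cassert>

constexpr int kPointsPerInch = 72;

// Hypothetical helper: how many device or document units make up one point
// for a given resolution in units per inch.
int unitsPerPoint(int nUnitsPerInch)
{
    assert(nUnitsPerInch > 0 && nUnitsPerInch % kPointsPerInch == 0);
    return nUnitsPerInch / kPointsPerInch;
}
```

`unitsPerPoint(720)` is 10, which gives the 0.1 pt grid, and `unitsPerPoint(1440)` is 20, which would give a 0.05 pt grid like the one seen in the "with patch" output.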

If I manage to figure this out, I will try to write a patch addressing all of this.
Comment 9 Julien Nabet 2016-01-05 18:54:09 UTC
Martin: which environment are you on? If you're on Linux, that's the environment where building LO is easiest. I can help you a bit if needed.
Your investigation would be easier if you had the sources.
Comment 10 Julien Nabet 2016-01-05 20:12:49 UTC
(In reply to Julien Nabet from comment #9)
> Martin: on which env are you? If on Linux, you must know that's the env
> where building LO is the most easy. I may help you a bit if needed.
> Indeed, your investigation would be easier if you had the sources.

Sorry for the question about env, you already indicated it. The rest is still relevant anyway :-)
Comment 11 Julien Nabet 2016-01-05 20:15:03 UTC
Noticing you didn't mention it, I thought you might be interested in using http://opengrok.libreoffice.org/ to search in LO code.
Comment 12 Kai Struck 2017-01-03 11:58:29 UTC
This bug seems the same as
https://bugs.documentfoundation.org/show_bug.cgi?id=57021

Martin was trying to solve it here. Where did you go?
Comment 13 Kai Struck 2017-12-02 13:19:25 UTC
(In reply to Martin from comment #5)
> Thank you for the files, Julien.
> 
> Long story short (see the next post for details), your files eventually led
> me to what I think is a bug in appendFixedInt() in
> core/vcl/source/gdi/pdfwriter_impl.cxx :
> 
> 
>     sal_Int32 nFactor = 1, nDiv = nPrecision;
>     while( nDiv-- )
>         nFactor *= 10;
> 
>     sal_Int32 nInt      = nValue / nFactor;
>     rBuffer.append( nInt );
>     if( nFactor > 1 )
>     {
>         sal_Int32 nDecimal  = nValue % nFactor;
>         if( nDecimal )
>         {
>             rBuffer.append( '.' );
>             // omit trailing zeros
>             while( (nDecimal % 10) == 0 )
>                 nDecimal /= 10;
>             rBuffer.append( nDecimal );
>         }
>     }
> 
> 
> The problem with this code is that e.g. for nValue = 105, nPrecision = 2 it
> gives nDecimal = 5, so it outputs "1.5" rather than the correct "1.05"!

I think this nailed it. The code should append 05 but of course only appends 5.
Meanwhile the code has been simplified to only support 1 decimal according to:
http://opengrok.libreoffice.org/

It's a pity because this prevents LO from exporting finely detailed vector PDFs. I attached a test file.
Comment 14 Kai Struck 2017-12-02 13:24:18 UTC
Created attachment 138178
PDF_VectorQuality_Test.odt

This is a Writer .odt with embedded SVG-graphics in different sizes to test the LO-PDF-Quality. You might have to use a PDF-Reader with high zoom-capabilities to be able to see the results.
Comment 15 V Stuart Foote 2018-02-04 22:27:22 UTC
*** Bug 109025 has been marked as a duplicate of this bug. ***
Comment 16 V Stuart Foote 2018-02-04 22:29:41 UTC
*** Bug 115446 has been marked as a duplicate of this bug. ***
Comment 17 V Stuart Foote 2018-02-04 22:30:47 UTC
*** Bug 57021 has been marked as a duplicate of this bug. ***
Comment 18 V Stuart Foote 2018-02-04 22:45:17 UTC
FYI, a good test case is attachment 134558 from duplicate bug 109025.

Caolán, Miklos -- you took care of the modulo glitch from the appendFixedInt() snippet in comment 5 with:

https://cgit.freedesktop.org/libreoffice/core/commit/?id=cd5cc12d4330d68d0a233a82eda30e983ce202a4

and

https://cgit.freedesktop.org/libreoffice/core/commit/?id=5f6065f980756fdb81c7018bedbb7f54e2b8214a

but it seems we _do_ need the greater precision to handle EMF and SVG vectors on export.
Comment 19 Caolán McNamara 2018-02-05 10:28:33 UTC
https://gerrit.libreoffice.org/#/c/49227/ would (I believe) fix appendFixedInt miscalculation and bump to 3 digits precision.
Comment 20 Commit Notification 2018-02-06 09:28:19 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=2113de51158a6e6c14931109bb9a4e27303c0eab

tdf#96892 higher precision pdf fixed ints

It will be available in 6.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 21 Caolán McNamara 2018-02-06 09:31:41 UTC
You can test and see if that's sufficient.
Comment 22 V Stuart Foote 2018-02-07 18:08:43 UTC
OK, on Windows 10 Ent 64-bit en-US with
Version: 6.1.0.0.alpha0+ (x64)
Build ID: b1069ea6f25daa268eb4358d5ea20094b46ef347
CPU threads: 8; OS: Windows 10.0; UI render: GL; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2018-02-06_23:49:25
Locale: en-US (en_US); Calc: group

Fidelity at high zoom looks much better now! Curves and lines have fewer misplaced vertices--not perfect, but now presentable.

Also, I uncompressed the PDF streams with pdftk. I produced a new PDF from the sample ODT alongside the old PDF. Comparing the two uncompressed files, we are now getting point/vertex coordinates with 3 decimal place precision, no rounding. Before they were 1 decimal place and seemed not to be truncated.

Only possible glitch I saw is in the /MediaBox entry of the header, it is now showing with 12 decimal place precision. That could be the pdftk uncompress, otherwise not sure if that would be an issue--just seems strange.

@Kai -- please do some testing and let us know.

@Julien, @Artem, any opinion on the handling now?

@Martin, you still with us?
Comment 23 V Stuart Foote 2018-02-07 18:17:20 UTC
(In reply to V Stuart Foote from comment #22)

> they were 1 decimal place and seemed not to be truncated.
> 
s/seemed.*/seemed not to be rounded, just truncated./
Comment 24 V Stuart Foote 2018-02-07 18:51:19 UTC
Created attachment 139672
Notepad++ edit of uncompressed PDFs, left old precision, right new precision

Export to PDF of Regina's attachment 128554 (see bug 103767 and bug 100986):

On left is PDF export from 6.0.0.3, pdftk uncompressed PDF open in Notepad++

On right is PDF export from 6.1.0.0 (2018-02-07, b1069ea6f25daa268eb4358d5ea20094b46ef347), pdftk uncompressed PDF open in Notepad++

Vertices now have 3 decimal place precision, and some of the odd truncation/rounding has been removed.
Comment 25 Caolán McNamara 2018-02-07 20:48:53 UTC
FWIW
export VCL_DEBUG_DISABLE_PDFCOMPRESSION=1
(or the Windows equivalent of setting the environment variable VCL_DEBUG_DISABLE_PDFCOMPRESSION to something non-zero)
and exporting to PDF should then write an uncompressed PDF directly from LibreOffice, so no other tool is needed to decompress it
Comment 26 V Stuart Foote 2018-02-08 04:50:57 UTC
(In reply to Caolán McNamara from comment #25)
> FWIW
> export VCL_DEBUG_DISABLE_PDFCOMPRESSION=1
> (or windows equivalent to set environmental variable
> VCL_DEBUG_DISABLE_PDFCOMPRESSION to something non 0)
> and exporting to pdf should export uncompressed pdf directly from
> LibreOffice to avoid having to use another tool to decompress it

Thanks, great hint! 

Set in Windows by adding a new String Value under the key:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Environment

And, the /MediaBox excess decimal precision is not present in the LO exported PDF, so an issue with PDFtk.
Comment 27 Kai Struck 2018-02-08 10:25:24 UTC
Much better results now with
master~2018-02-06_23.16.56_LibreOfficeDev_6.1.0.0.alpha0_Linux_x86-64_deb.tar.gz

on Linux Mint 18 Sarah 64-bit than with any LO4 or LO5 before.
Tested with 
http://bugs.documentfoundation.org/attachment.cgi?id=138178

(look at the grid)

and with musical scores with vector graphics as well as with Draw objects.
The resolution now allows acceptable output for common music score sizes.
But I think I would prefer a 5 decimal place precision if that's possible.
Comment 28 Caolán McNamara 2018-02-15 14:35:02 UTC
5 decimal places is possible. I'm not sure it will result in better quality; it might be just more digits without more precision if our internal precision is exhausted at that level. There's also the possibility that large numbers get multiplied so much that they get clipped by the upper integer bounds.
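
The integer-bounds concern can be quantified with a short sketch. It assumes the fixed-point value is held in a signed 32-bit integer (the snippet in comment 5 uses `sal_Int32`); `maxWholeUnits` is a hypothetical helper, and the unit is whatever the caller's coordinates are measured in.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: largest whole-unit coordinate that still fits in a
// signed 32-bit fixed-point integer carrying nPrecision decimal digits.
std::int32_t maxWholeUnits(int nPrecision)
{
    assert(nPrecision >= 0 && nPrecision <= 9);
    std::int32_t nFactor = 1;
    for (int i = 0; i < nPrecision; ++i)
        nFactor *= 10;
    return std::numeric_limits<std::int32_t>::max() / nFactor;
}
```

With 3 digits the ceiling is 2147483 whole units, with 5 digits it drops to 21474. If the unit were points, 21474 pt is roughly 298 inches, so ordinary page sizes would fit but the headroom shrinks by a factor of 100.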

I'll push it though, and mark this closed. If any problems show up, I'll revert that follow-up precision change.
Comment 29 Commit Notification 2018-02-15 14:35:27 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c4b23192b4ab1f3ea75df7e48da36b6b17de248b

tdf#96892 3 to 5 digit precision

It will be available in 6.1.0.

Comment 30 V Stuart Foote 2018-02-16 06:23:29 UTC
(In reply to Commit Notification from comment #29)
> 
> Affected users are encouraged to test the fix and report feedback.

I can't tell if this really is a problem, but looking at the text of uncompressed PDF files, in the coordinate pairs for the vertices the Y-value consistently has just 2, 1 or even no decimal places, while the X-value almost always has the full 5 decimal place precision.

Some of the Y-values for pairs do show full 5 digit precision--e.g. the drawRectangle bounding box. 

But seems like more of the Y-value vertices should get 5 digit precision.
Comment 31 Kai Struck 2018-02-17 12:58:08 UTC
Tested master~2018-02-17_01.09.34_LibreOfficeDev_6.1.0.0.alpha0_Linux_x86-64_deb.tar.gz
on Linux Mint 18 Sarah 64bit. 

Hmm, I can see no visual difference in fine vector art compared to the previous build with 3 decimal precision.
(master~2018-02-06_23.16.56_LibreOfficeDev_6.1.0.0.alpha0_Linux_x86-64_deb.tar.gz)


If I look at the coordinates in a PDF uncompressed with pdftk, the last 2 digits of the 5 decimals are ALWAYS the same through the whole PDF, e.g.
345.23407
365.86507
23.11107

In another PDF the last 2 digits are e.g. always "14"
345.23414
365.86514
23.11114

Somehow only y-values have 5 decimal digits whereas x-values mostly have 2 decimal digits.

So for now I'd also be fine with 3 decimal precision which really looks good.
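
One possible reading of the constant trailing digits, offered as an unconfirmed assumption: if the coordinates themselves only carry 3 real decimals and are then subtracted from a single finer-grained offset (for example a page height during a y-axis flip), every result inherits the offset's last two digits. The sketch below models this in units of 1e-5; `trailingTwoDigits` and any offset passed to it are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, all values in units of 1e-5: subtract a coordinate
// that lies on a 0.001 grid from a finer-grained offset and return the last
// two decimal digits of the result.
int trailingTwoDigits(std::int64_t nOffset, std::int64_t nCoarseValue)
{
    assert(nCoarseValue >= 0 && nCoarseValue % 100 == 0);  // only 3 real decimals
    assert(nOffset >= nCoarseValue);
    return static_cast<int>((nOffset - nCoarseValue) % 100);
}
```

With a made-up offset of 841.88907 (84188907 in these units), the coarse values 496.655, 476.024 and 818.778 yield 345.23407, 365.86507 and 23.11107, all ending in "07". In that case the extra two digits would carry no additional information, which fits the lack of visible improvement over 3 decimal places.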
Comment 32 Commit Notification 2018-02-23 20:05:45 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=36bd5662e73f6fbd64bf03fd42936a8d69ed397b

Revert "tdf#96892 3 to 5 digit precision"

It will be available in 6.1.0.

Comment 33 Kai Struck 2018-02-27 11:59:08 UTC
Tested master~2018-02-27_02.06.19_LibreOfficeDev_6.1.0.0.alpha0_Linux_x86_deb.tar.gz
on Linux Mint 18 Sarah 64bit. 

Looks good. Definitely better than LO 4.x and 5.x.
Comment 34 Commit Notification 2019-01-24 09:26:28 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/fddd956c0cf3b2c22a152bbb30554def1336b466%5E%21

tdf#96892 vcl: add unit test for misplaced soft-hyphen ...

It will be available in 6.3.0.

Comment 35 Commit Notification 2019-01-26 10:45:04 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-6-2":

https://git.libreoffice.org/core/+/897c6db9c88b9c60bec9be04026f0a0798f2207e%5E%21

tdf#96892 vcl: add unit test for misplaced soft-hyphen ...

It will be available in 6.2.1.
