Bug 112950 - PDF: Hebrew characters overlapping or very close together with David CLM font
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version (earliest affected): 5.4.2.2 release
Hardware: All Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisectRequest, regression
Depends on:
Blocks: PDF-Export RTL-Hebrew
Reported: 2017-10-06 19:47 UTC by Eyal Rozenberg
Modified: 2018-01-08 00:58 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
ODT document whose PDF export gets messed up (17.60 KB, application/vnd.oasis.opendocument.text)
2017-10-06 19:47 UTC, Eyal Rozenberg
PDF export result (21.57 KB, application/pdf)
2017-10-06 19:47 UTC, Eyal Rozenberg
3-page odt test doc (17.72 KB, application/vnd.oasis.opendocument.text)
2017-10-12 21:49 UTC, Yousuf Philips (jay) (retired)
5.3 PDF vs 5.4 PDF (52.32 KB, image/png)
2017-10-12 21:57 UTC, Yousuf Philips (jay) (retired)
close comparison 5.3 vs 5.4 (15.92 KB, image/png)
2017-10-13 20:47 UTC, Maxim Iorsh
close comparison 5.3 vs 5.4 (16.04 KB, image/png)
2017-10-13 20:54 UTC, Maxim Iorsh

Description Eyal Rozenberg 2017-10-06 19:47:25 UTC
Created attachment 136814 [details]
ODT document whose PDF export gets messed up

I'm using LO writer 5.4.2.2 release on Linux Mint 18.2.

I have an ODT document which has (probably) been edited by MS Word at some point. When I export it to PDF, some of the Hebrew letters overlap each other and some don't. On the second page, after having pressed the numbered list toolbar button, the PDF export renders the text almost properly, but not quite: the spacing is off. Specifically, notice the lack of distance between the rightmost Heh (ה) and Ain (ע) on the top line on page 2.

Finally, when I remove the end of the paragraph and keep most of the first line - on page 3 - the rendering looks just fine.

I suspect this is not a recent regression, since I recall having experienced this with previous versions - but now I got annoyed enough to spend the time creating a small manifesting example and reporting it.
Comment 1 Eyal Rozenberg 2017-10-06 19:47:52 UTC
Created attachment 136815 [details]
PDF export result
Comment 2 Eyal Rozenberg 2017-10-06 19:48:54 UTC
Note you'll need the David CLM font. You can get it here:

https://sourceforge.net/projects/culmus/files/culmus/0.131/culmus-0.131.tar.gz/download
Comment 3 Xisco Faulí 2017-10-07 08:10:00 UTC
You can't confirm your own bugs. Moving it back to UNCONFIRMED until someone else confirms it.
Comment 4 Eyal Rozenberg 2017-10-07 17:07:27 UTC
On this thread:
https://whatsup.org.il/index.php?name=PNphpBB2&file=viewtopic&p=421925#421925

Several people are confirming it while others do not see it happening with their versions. I'm hoping some of them would come over here to confirm...
Comment 5 Lior Kaplan 2017-10-12 10:16:32 UTC
Confirmed with LibreOffice 5.4.1 on Debian 64bit.

Seems to happen only with Culmus fonts. CCing Maxim Iorsh for his opinion on this (as their creator).
Comment 6 Yousuf Philips (jay) (retired) 2017-10-12 21:49:45 UTC
Created attachment 136938 [details]
3-page odt test doc

As attachment 136814 [details] only has 2 pages and the exported PDF in attachment 136815 [details] has 3, I created this 3-page test document.
Comment 7 Yousuf Philips (jay) (retired) 2017-10-12 21:56:45 UTC
The overlapping characters are clearly a regression introduced in the 5.4 cycle. The closeness of the Heh and Ain characters varied in both 5.3 and 5.4 depending on whether the text is in a numbered list or not, but in my tests it was never as close as in attachment 136815 [details]. Tested on Linux Mint 18.0.

Version: 5.3.7.0.0+
Build ID: a562be54f3127f4e22a3a38e62db2b38d48499f3
CPU Threads: 2; OS Version: Linux 4.4; UI Render: default; VCL: gtk2; Layout Engine: new; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:libreoffice-5-3, Time: 2017-09-19_03:52:04
Locale: en-US (en_US.UTF-8); Calc: group

Version: 5.4.3.0.0+
Build ID: fb64cf127dc6398f5d18d186a93966837db0bb1e
CPU threads: 2; OS: Linux 4.4; UI render: default; VCL: gtk2; 
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:libreoffice-5-4, Time: 2017-09-27_12:54:32
Locale: en-US (en_US.UTF-8); Calc: group
Comment 8 Yousuf Philips (jay) (retired) 2017-10-12 21:57:25 UTC
Created attachment 136939 [details]
5.3 PDF vs 5.4 PDF
Comment 9 Eyal Rozenberg 2017-10-13 16:54:14 UTC
Also note that some letters are spaced slightly too far apart. I wonder if there isn't some kind of "index-is-off" issue here in accessing the amount of horizontal space necessary per character.
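
To illustrate the kind of "index-is-off" error I have in mind, here is a minimal Python sketch. It is not LibreOffice code: the `advances` table, the glyph names, and both functions are hypothetical, and the widths are made up. It only shows that reading each glyph's advance from the neighboring slot makes some gaps too wide and others too narrow, which is roughly the symptom seen here.

```python
def pen_positions(glyphs, advances, offset=0):
    """Return the x position of each glyph on the line.

    With offset=0 each glyph advances the pen by its own width.
    With offset=1 the width is (wrongly) read from the next glyph's slot,
    simulating an off-by-one lookup. Purely illustrative.
    """
    xs = []
    x = 0
    last = len(glyphs) - 1
    for i in range(len(glyphs)):
        xs.append(x)
        j = min(max(i + offset, 0), last)  # clamp to stay inside the line
        x += advances[glyphs[j]]
    return xs


def gaps(xs):
    """Distance between each pair of consecutive glyph origins."""
    return [b - a for a, b in zip(xs, xs[1:])]


# Hypothetical advance widths in font units (not taken from David CLM).
advances = {"he": 600, "ayin": 550, "yod": 250, "lamed": 500}
line = ["yod", "he", "ayin", "yod", "lamed"]

correct = gaps(pen_positions(line, advances))
shifted = gaps(pen_positions(line, advances, offset=1))
gap_error = [s - c for s, c in zip(shifted, correct)]
print(gap_error)
```

A positive entry in `gap_error` means two letters end up too far apart, a negative one means they are too close or overlapping.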
Comment 10 Maxim Iorsh 2017-10-13 20:47:56 UTC
Created attachment 136964 [details]
close comparison 5.3 vs 5.4

Looking at the PDF comparison, it looks like certain groups of letters are shifted right, leaving their surroundings intact; see the attached "5.3 vs 5.4 close comparison" image (made from jay's PNG).

The shifted groups (first line only) are
 * ayn-resh
 * lamed-yod
 * nun-vav-gimel-ayn
 * lamed-alef-vav-pe
 * he-lamed-vav
 * samech-pe-yod
 * het-shin-bet-vav-nun-vav
 * bet-alef
 * alef-het

Honestly, I can't see any logic in this collection.
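
For anyone who wants to repeat this comparison on other lines without eyeballing images, a minimal Python sketch of the grouping step follows. It assumes you already have per-glyph x positions from the 5.3 and 5.4 exports (how to extract them from the PDFs is not shown here); the function name, the tolerance, and the sample numbers are all hypothetical.

```python
def shifted_runs(names, xs_old, xs_new, tol=0.5):
    """Group consecutive glyphs whose x position differs between two renders.

    names, xs_old and xs_new are parallel lists in the same glyph order.
    A glyph counts as shifted when its position moved by more than tol.
    """
    runs, current = [], []
    for name, a, b in zip(names, xs_old, xs_new):
        if abs(b - a) > tol:
            current.append(name)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


# Made-up positions, only to show the shape of the input and output.
names = ["he", "ayn", "resh", "bet", "lamed", "yod", "kaf"]
xs_53 = [0, 10, 20, 30, 40, 50, 60]
xs_54 = [0, 12, 22, 30, 42, 52, 60]
groups = shifted_runs(names, xs_53, xs_54)
print(groups)
```

With real positions, the resulting groups could be compared against font tables or text portion boundaries to look for a pattern.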
Comment 11 Maxim Iorsh 2017-10-13 20:54:32 UTC
Created attachment 136965 [details]
close comparison 5.3 vs 5.4

Shifted groups underlined with blue
Comment 12 zdevir 2017-10-14 12:57:51 UTC
Confirmed on both Linux (latest 5.4 RC) and Windows (5.4.1.2). The problem occurs with all fonts, including David and Narkisim. I guess it has something to do with the internal representation of the text.
Comment 13 Eyal Rozenberg 2017-10-14 14:04:33 UTC
(In reply to zdevir from comment #12)
> Problem occurs with all fonts, including David and Narkisim.

All Culmus fonts, you mean? Or have you seen this with other Hebrew fonts?
Comment 14 Omer Zak 2017-11-20 20:03:13 UTC
The problem was not reproduced in:

Version: 6.0.0.0.alpha1+
Build ID: 9050854c35c389466923f0224a36572d36cd471a
CPU threads: 8; OS: Linux 4.9; UI render: default; VCL: gtk3; 
Locale: en-US (en_US.utf8); Calc: group

OS: Debian 64bit Stretch (Debian 9.2, with some backported packages)

The reported font for the document's text was David CLM.
Comment 15 Xisco Faulí 2017-11-20 20:06:28 UTC
(In reply to Omer Zak from comment #14)
> The problem was not reproduced in:
> 
> Version: 6.0.0.0.alpha1+
> Build ID: 9050854c35c389466923f0224a36572d36cd471a
> CPU threads: 8; OS: Linux 4.9; UI render: default; VCL: gtk3; 
> Locale: en-US (en_US.utf8); Calc: group
> 
> OS: Debian 64bit Stretch (Debian 9.2, with some backported packages)
> 
> The reported font for the document's text was David CLM.

Duplicate of bug 113428?
@Yousuf, what do you think ?
Comment 16 Yousuf Philips (jay) (retired) 2017-11-21 09:53:19 UTC
(In reply to Xisco Faulí from comment #15)
> Duplicate of bug 113428?
> @Yousuf, what do you think ?

Khalid fixed that bug on the 8th, and if this bug is still showing up with Omer's build from the 13th then it wouldn't be a duplicate.

This bug needs to be bibisected to know where the issue first arose.
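
For readers unfamiliar with the term: bibisecting is a binary search over prebuilt binaries, normally driven with git bisect in a bibisect repository. The idea can be sketched in Python as below. The build labels and the `is_bad` check are hypothetical; in practice the check is manual (open the test document, export to PDF, look at the spacing).

```python
def first_bad(builds, is_bad):
    """Find the first bad build by binary search.

    builds is ordered oldest to newest; builds[0] is assumed known good
    and builds[-1] known bad. is_bad(build) reports whether the bug shows.
    """
    lo, hi = 0, len(builds) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # bug already present, look earlier
        else:
            lo = mid  # still fine, look later
    return builds[hi]


# Hypothetical build labels; the regression is pretended to start at 011.
builds = [f"build-{n:03d}" for n in range(16)]
regression_at = 11
culprit = first_bad(builds, lambda b: int(b.split("-")[1]) >= regression_at)
print(culprit)
```

With 16 builds this takes 4 checks instead of up to 15, which is why bisecting is practical even over a whole release cycle.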
Comment 17 Eyal Rozenberg 2017-11-21 14:50:09 UTC
(In reply to Yousuf Philips (jay) from comment #16)
> (In reply to Xisco Faulí from comment #15)
> > Duplicate of bug 113428?

Doesn't look like a proper dupe. In bug 113428 characters fully overlap, or worse; with this bug, it's more of a spacing issue.

Also, 113428 does not manifest with 5.4.

Of course... it's not impossible that fixing that one somehow affected this one but I really can't say. And they're both about horizontal placement of glyphs on a line, so they're obviously related.
Comment 18 Xisco Faulí 2017-11-24 23:40:49 UTC
My question after reading comment 14 is: is this issue fixed on master? If so, we can close it as RESOLVED WORKSFORME...
Comment 19 Eyal Rozenberg 2018-01-07 13:12:58 UTC
Let's just say it was fixed somehow.