Created attachment 138893 [details]
2016CommunicationRallyjobapplication.pdf

Open the attached PDF (downloaded from [1]) in Draw.
=> Text in the note and in several table cells doesn't fit and overlaps other cells.

Checkboxes are also rendered incorrectly, but that is independent of this bug.

Observed using LO 6.1 master build (a0e136d2cbb3784ddfcbddcfed5d784c8e4c9a64) and 5.3.0.3 / Ubuntu 17.04.
The PDF is rendered fine in 5.2.0.4.
=> regression
The PDF also looks fine in 6.0.0.1 / Windows 7.
=> Linux only

The bug starts with the following commit (with the SAL_USE_COMMON_LAYOUT environment variable set):
https://cgit.freedesktop.org/libreoffice/core/commit/?id=828b8cf4d26c4d72c1f2146fd7a5bbb3b0465718
author Akash Jain <akash96j@gmail.com> 2016-07-06 10:35:24 +0530
committer Khaled Hosny <khaledhosny@eglug.org> 2016-10-18 20:41:29 +0200
"GSoC: Integrate new CommonSalLayout in unx/ code"

[1] http://www.oces.tulsacounty.org/4h/4hForms/2016CommunicationRallyjobapplication.pdf
Created attachment 138894 [details] Screenshot
Your screenshot shows font substitutions being made for several of the PalatinoLinoType fonts subsetted into the PDF. I see similar results on Windows builds.

So, IMHO this is correct behavior (if the referenced font is not installed on the system), but the fallback handling of font metrics is not ideal.

Unlike other PDF "viewers", we extract the PDF content, so on filter import of a PDF the font substitution has to be made, as we will likely need to change the text to use glyphs that are not included in the subset available in the PDF. The trouble comes with the fallback mechanism, which is passed off to the OS to deal with.

I don't know whether the PDF font embedding includes all the metrics. Anyone? But I'm not sure we can improve this if the font metrics are not available.

So, not really an issue with HarfBuzz?
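To illustrate why a substituted font can make text spill out of a cell that was sized for the original font, here is a minimal sketch. The advance widths, the font size, and the cell width are all made-up numbers chosen for illustration, not measurements from Palatino Linotype or from any fallback font, and `text_width` is a hypothetical helper rather than anything in the import filter.

```python
def text_width(text, advances, font_size, units_per_em=1000, default_advance=500):
    """Return the width of `text` in points by summing per-glyph advance widths."""
    total_units = sum(advances.get(ch, default_advance) for ch in text)
    return total_units * font_size / units_per_em


# Hypothetical advance widths in font units (assumed values, not real metrics).
ORIGINAL_FONT = {"R": 600, "a": 450, "l": 250, "y": 500}
FALLBACK_FONT = {"R": 700, "a": 600, "l": 300, "y": 600}

CELL_WIDTH_PT = 28.0  # assumed width of a table cell laid out for the original font

width_original = text_width("Rally", ORIGINAL_FONT, font_size=12)
width_fallback = text_width("Rally", FALLBACK_FONT, font_size=12)

print("original font fits:", width_original <= CELL_WIDTH_PT)
print("fallback font fits:", width_fallback <= CELL_WIDTH_PT)
```

The cell geometry comes from the PDF and stays fixed, while the text is re-laid out with whatever advances the substitute font provides, so a wider fallback overflows even though nothing else changed.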
I opened the attachment and the only problem I see is that the checkboxes are rendered incorrectly.

Version: 5.3.8.0.0+
Build ID: 7f1297d9b4f449eb9ada8008fb21b7046d1a8f19
CPU Threads: 8; OS Version: Linux 4.14; UI Render: default; VCL: kde4; Layout Engine: new;
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:libreoffice-5-3, Time: 2017-11-10_15:56:34
Locale: nl-BE (en_US.UTF-8); Calc: group

Version: 6.1.0.0.alpha0+
Build ID: 4ead201c578ce4cc17f65d2a97a591e112307a1a
CPU threads: 8; OS: Linux 4.14; UI render: default; VCL: kde4;
TinderBox: Linux-rpm_deb-x86_64@70-TDF, Branch:master, Time: 2017-12-31_00:43:41
Locale: nl-BE (en_US.UTF-8); Calc: group threaded
(In reply to V Stuart Foote from comment #2)
> Your screen shot shows font substitutions are being made for several of the
> PalatinoLinoType fonts subsetted into the PDF. Similar for me on Windows
> builds.

Right, I should've paid closer attention to the details; the fonts are obviously different. Let me attach a comparison screenshot between 5.2.0.4 and the 6.1 master build.

> Unlike other PDF "viewers", for our purposes of extracting PDF content, on
> filter import of a PDF the font substitution has to be made--as we likely
> will need to change text to use glyphs that are not included with the
> available subset in the PDF.

This sounds logical, but it doesn't explain two things:
- Why did the font substitution change between 5.2 and 5.3, and what does it have to do with the common layout change?
- Why is it fine on Windows? I don't have the font there either, and the result looks pretty much the same as the pre-5.3 version on Linux.
Created attachment 138932 [details] Comparison screenshot (5.2.0.4 vs 6.1 build, Linux)
What are the fonts used in the document in each version? Probably the pre-HarfBuzz version is using a Type 1 font that has closer metrics to the font used in the PDF (which wouldn't be an issue on Windows as it doesn't usually come with Type 1 fonts).
(In reply to Khaled Hosny from comment #6)
> What are the fonts used in the document in each version?

It looks like the font and its metrics are read in the import filter [1], but the fallback is then handled elsewhere.

Ironically, the only way to tell now which font gets used is to export from Draw to PDF and compare. We'd need something like bug 61134 or bug 78186 to help here.

=-ref-=
[1] https://opengrok.libreoffice.org/xref/core/sdext/source/pdfimport/wrapper/wrapper.cxx?#588
As far as I can see the good versions (pre-5.3 Linux versions and pre/post-5.3 Windows versions) actually use Palatino Linotype, and post-5.3 Linux versions use Bitstream Vera Serif.
(In reply to Khaled Hosny from comment #6)
> What are the fonts used in the document in each version? Probably the
> pre-HarfBuzz version is using a Type 1 font that has closer metrics to the
> font used in the PDF (which wouldn't be an issue on Windows as it doesn't
> usually come with Type 1 fonts).

That would be my guess as well; Ubuntu includes URW Palladio L as a Palatino substitute, but that is a Type 1 font, which would explain its disappearance from newer LibreOffice versions.
(In reply to Adolfo Jayme from comment #9)
> That would be my guess as well; Ubuntu includes URW Palladio L as a Palatino
> substitute, but that is a Type 1 font, which would explain its disappearance
> from newer LibreOffice versions.

Ah, I didn't know there were clones of this font. Not only that, there are newer FOSS versions as well: FPL Neu and TeX Gyre Pagella. E.g., on Ubuntu, TeX Gyre Pagella can be installed separately through the 'tex-gyre' package, which fixes font substitution in the attached PDF.

A general question about font substitution: could the handling of metrics be improved when the font is missing, to avoid bad-looking output like the one in attachment 138932 [details]? I don't know what kind of information can be collected from the PDF.
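For anyone who wants to check what their own Linux system would substitute for these families, a rough sketch using fontconfig's `fc-match` command-line tool follows. `fontconfig_match` is a hypothetical helper name, and this only reports fontconfig's answer; LibreOffice may apply its own replacement rules on top, so treat the output as a hint rather than as the font Draw actually ends up using.

```python
import shutil
import subprocess


def fontconfig_match(requested_family):
    """Return the family fontconfig picks for `requested_family`.

    Returns None when fc-match is not installed or the call fails.
    """
    if shutil.which("fc-match") is None:
        return None
    result = subprocess.run(
        ["fc-match", "--format=%{family}", requested_family],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    # Family names taken from this thread.
    for name in ("Palatino Linotype", "URW Palladio L", "TeX Gyre Pagella"):
        print(name, "->", fontconfig_match(name))
```

If the first line prints something other than a Palatino clone, installing one (for example via the 'tex-gyre' package mentioned above) and re-running should show the match change.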
(In reply to Aron Budea from comment #10)
> A general question about font substitution, could the handling of metrics be
> improved when the font is missing.

No. If the font is missing, we have no idea what its metrics were; this kind of information is not embedded in Office documents.

Closing this as not a bug. If one does not have the exact font, all bets are off and the end result is system-dependent.