PDF export shows accented characters and umlauts as blanks when Type1 fonts are used. The problem seems to be that the PDF doesn't include the definition of an encoding vector. Adding "/Encoding /WinAnsiEncoding" to the font object could be a quick fix, at least for Western European characters. The error is known in OpenOffice and documented here: http://www.openoffice.org/issues/show_bug.cgi?id=63015
[This is an automated message.] This bug was filed before the changes to Bugzilla on 2011-10-16. Thus it started right out as NEW without ever being explicitly confirmed. The bug is changed to state NEEDINFO for this reason. To move this bug from NEEDINFO back to NEW please check if the bug still persists with the 3.5.0 beta1 or beta2 prereleases. Details on how to test the 3.5.0 beta1 can be found at: http://wiki.documentfoundation.org/QA/BugHunting_Session_3.5.0.-1 more detail on this bulk operation: http://nabble.documentfoundation.org/RFC-Operation-Spamzilla-tp3607474p3607474.html
I tested again with release 3.5.0 beta and can confirm that the bug has not been resolved. A document with umlauts in an Adobe Type1 font loses the umlauts when exported to PDF. Cross-checking by printing with Adobe's PDFwriter device works fine. The significant difference between the generated PDF files is that the sequence "/Encoding /WinAnsiEncoding" in the font object is missing from the exported PDF file.
It looks like the bug still exists in 4.1.2.3. Manually adding "/Encoding /WinAnsiEncoding" to the PDF file is indeed a workaround, but certainly not a beginner-friendly one.
I found the solution to this.

In vcl\source\gdi\pdfwriter_impl.cxx, line 3494:
    if( !pFont->IsSymbolFont() && pEncoding == 0)
must be changed to:
    if( !pFont->IsSymbolFont() )

Reason: without the pEncoding check, "/Encoding/WinAnsiEncoding\n" is added to the font object in the PDF file, which is correct. pEncoding specifies that a ToUnicode stream has to be generated (and it is), and nothing speaks against emitting both, because the ToUnicode stream is only a translation table and doesn't affect the encoding itself. For symbolic fonts WinAnsiEncoding would be wrong, because they ship with their own encoding.

I don't want to upload this myself, because I don't intend to do more work on LibreOffice and the change is too tiny to justify making a patch and going through the git/gerrit upload process. So please, someone else do this; I don't claim any rights to that code submission.
Looks like this has been remedied with LO 4.4.0.1 (which still is RC), while in 4.3.5 the bug is still present.
The bug is still present in 4.4.0.3.
I confirm the bug is still present in 4.4.2, and the patch posted by edv has not been applied in the 4.4.3 source either.
(In reply to edv from comment #4)
> I found the solution to this.
> In vcl\source\gdi\pdfwriter_impl.cxx the Line 3494:
> if( !pFont->IsSymbolFont() && pEncoding == 0)
> must be changed to:
> if( !pFont->IsSymbolFont() )
>
> Reason: Without the pEncoding check - "/Encoding/WinAnsiEncoding\n" is added
> to the pdf file font object which is correct. pEncoding specifies that a
> ToUnicode stream has to be generated (and it is) and nothing speaks against
> it because it is only a translation table and doesn't affect the encoding
> itself. For symbolic fonts WinAnsiEncoding would be wrong because they have
> their own encoding shipped with.
>
> I don't want to upload this myself because I don't intend to do more on
> libreoffice and it is too tiny to go through the git/gerrit upload process
> and making a patch for this. So please someone else do this, I don't want
> any rights on that code submission.

Just for information, the patch had been pushed with this:
http://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=eea16cb3e65a4308caddb7618d31a76ca259dbb1
but was reverted with this:
http://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=297b22bd49ea11a90063ab8503fb83090f351668
(see the reasons in the commit if interested)
*** Bug 87932 has been marked as a duplicate of this bug. ***
*** Bug 94061 has been marked as a duplicate of this bug. ***
Also experiencing this in LO 5.0.2.2 under Arch Linux. The bug suddenly appeared a few months ago, before that everything was working fine. However, it's also dependent on the PDF Viewer used. The one under Windows and Firefox's built-in one display the umlauts (though they look a bit off), but Evince just shows blank spaces.
(In reply to drunken monkey from comment #11)
> Also experiencing this in LO 5.0.2.2 under Arch Linux. The bug suddenly
> appeared a few months ago, before that everything was working fine. However,
> it's also dependent on the PDF Viewer used. The one under Windows and
> Firefox's built-in one display the umlauts (though they look a bit off), but
> Evince just shows blank spaces.

Somehow the new "gsfonts" triggered this (again): https://bugs.documentfoundation.org/show_bug.cgi?id=95221
However, in the case I experienced, it has nothing to do with "WinAnsiEncoding" at all, so I will not mark my bug report as a duplicate. See the attached PDFs in my bug report (open them with a text editor or similar) for details.
This might be a possible fix, based on the comment in the revert commit.

diff --git a/vcl/source/gdi/pdfwriter_impl.cxx b/vcl/source/gdi/pdfwriter_impl.cxx
index 0d886e0..8755448 100644
--- a/vcl/source/gdi/pdfwriter_impl.cxx
+++ b/vcl/source/gdi/pdfwriter_impl.cxx
@@ -3529,7 +3529,7 @@ std::map< sal_Int32, sal_Int32 > PDFWriterImpl::emitEmbeddedFont( const Physical
                 "<</Type/Font/Subtype/Type1/BaseFont/" );
             appendName( aInfo.m_aPSName, aLine );
             aLine.append( "\n" );
-            if( !pFont->IsSymbolFont() && pEncoding == nullptr )
+            if( !pFont->IsSymbolFont() && ( pEncoding == nullptr || pFont->GetCharSet() == RTL_TEXTENCODING_MS_1252 ))
                 aLine.append( "/Encoding/WinAnsiEncoding\n" );
             if( nToUnicodeStream )
             {

The mentioned revert commit: https://cgit.freedesktop.org/libreoffice/core/commit/?id=297b22bd49ea11a90063ab8503fb83090f351668

I am new here and I just stumbled upon this bug report because I need to create a PDF and it came out garbled the whole day :(

Is this fix working and could you get this into a next build of libreoffice?
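For readers comparing the proposals, the three conditions discussed in this thread can be modelled side by side. This is only a sketch: `FontTraits` and the three function names are hypothetical stand-ins invented for illustration, not part of the LibreOffice sources. Each flag mirrors one expression from the diff above, as noted in the comments.

```cpp
// Hypothetical stand-in for the values tested in emitEmbeddedFont.
struct FontTraits {
    bool isSymbolFont;      // models pFont->IsSymbolFont()
    bool hasEncodingTable;  // models pEncoding != nullptr
    bool isMs1252;          // models pFont->GetCharSet() == RTL_TEXTENCODING_MS_1252
};

// Condition before any patch: no /Encoding entry once pEncoding is set.
bool emitsWinAnsiOriginal(const FontTraits& f) {
    return !f.isSymbolFont && !f.hasEncodingTable;
}

// Condition proposed in comment #4: every non-symbol font gets the entry.
bool emitsWinAnsiUnconditional(const FontTraits& f) {
    return !f.isSymbolFont;
}

// Condition from the diff above: additionally allow fonts whose
// character set is MS-1252 even when pEncoding is set.
bool emitsWinAnsiMs1252(const FontTraits& f) {
    return !f.isSymbolFont && (!f.hasEncodingTable || f.isMs1252);
}
```

For a non-symbol font with pEncoding set, the original condition never writes the entry, the comment #4 variant always does, and the variant in the diff does so only when the font reports the MS-1252 character set. Symbol fonts are excluded by all three.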
It would be nice if one of you could post a how-to or a script to add /Encoding /WinAnsiEncoding into a PDF. It would be a work-around and definitely less pain than having no solution at all available.
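A rough sketch of such a workaround, in C++, might look like the following. It rests on several assumptions that should be checked against the actual file: the font dictionaries are stored uncompressed, they use the compact spelling "/Subtype/Type1" shown in the diff above, and none of the matched fonts is a symbol font (for which WinAnsiEncoding would be wrong, as comment #4 points out). The names `addWinAnsiEncoding` and `patchPdfFile` are made up for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Inserts "/Encoding/WinAnsiEncoding" into every Type1 font dictionary that
// has no /Encoding entry yet. Returns the number of dictionaries changed.
std::size_t addWinAnsiEncoding(std::string& pdf) {
    const std::string marker = "/Subtype/Type1";
    const std::string entry = "/Encoding/WinAnsiEncoding";
    std::size_t changed = 0;
    std::size_t pos = 0;
    while ((pos = pdf.find(marker, pos)) != std::string::npos) {
        const std::size_t after = pos + marker.size();
        pos = after;  // the next search starts behind this match
        // Ignore longer names such as /Type1C that only start with the marker.
        if (after < pdf.size() &&
            std::isalnum(static_cast<unsigned char>(pdf[after]))) {
            continue;
        }
        const std::size_t dictStart = pdf.rfind("<<", after);
        const std::size_t objEnd = pdf.find("endobj", after);
        if (dictStart == std::string::npos || objEnd == std::string::npos) {
            continue;
        }
        // Leave objects alone that already declare an encoding.
        const std::size_t enc = pdf.find("/Encoding", dictStart);
        if (enc != std::string::npos && enc < objEnd) {
            continue;
        }
        // Key order inside a PDF dictionary does not matter, so the entry
        // can go directly behind the subtype.
        pdf.insert(after, entry);
        pos = after + entry.size();
        ++changed;
    }
    return changed;
}

// Reads inPath, patches the content in memory and writes it to outPath.
bool patchPdfFile(const std::string& inPath, const std::string& outPath) {
    std::ifstream in(inPath, std::ios::binary);
    if (!in) {
        return false;
    }
    std::string pdf((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());
    addWinAnsiEncoding(pdf);
    std::ofstream out(outPath, std::ios::binary);
    out.write(pdf.data(), static_cast<std::streamsize>(pdf.size()));
    return static_cast<bool>(out);
}
```

Inserting bytes shifts every following object, so the offsets in the file's cross-reference table no longer match. Viewers typically rebuild that table when they notice the mismatch, but this is not guaranteed, so the result should be checked in the viewer that showed the blanks, or passed through a PDF repair tool first.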
Gilbert: FYI, I proposed the patch here: https://gerrit.libreoffice.org/#/c/29792/1
Julien Nabet committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=52040395e3046ac42b8c3dd385c7b1cb26b929f3 tdf#34212: Accented Characters and Umlauts are missing with Type1 fonts It will be available in 5.3.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Created attachment 129183 [details]
exported pdf

I tested PDF export with Polish characters in text forms. I just installed the newest LibreOfficeDev 5.3 on Ubuntu 14.04. The problem still occurs.
(In reply to przekop from comment #18)
> Created attachment 129183 [details]
> exported pdf
>
> I tested PDF export with Polish characters in text forms. I just installed
> newest LibreOfficeDev 5.3 on Ubuntu 14.04.
> Problem still occurs.

Could you attach the original document so we can try to reproduce this?
Created attachment 129184 [details]
odt sample

I tried to fill in the forms after export. Polish characters are missing in fillable text forms. I'm not sure this is the right bug thread, but it is the closest to the subject I could find.
Created attachment 129193 [details] export with master sources Here the result on pc Debian x86-64 with master sources updated today. It seems ok. Are you sure you retrieved a version including the patch http://cgit.freedesktop.org/libreoffice/core/commit/?id=52040395e3046ac42b8c3dd385c7b1cb26b929f3 from 14/11/2016? To be sure, could you provide BuildId (Help Menu/About)?
Try copying any text with Polish characters (1ą 2ż 3ź 4ć 5ó 6ł) and pasting it into a form field. Most of them disappear after leaving the field.

5.3.0.0.beta1
Build ID: 690f553ecb3efd19143acbf01f3af4e289e94536
Julien Nabet committed a patch related to this issue. It has been pushed to "libreoffice-5-2": http://cgit.freedesktop.org/libreoffice/core/commit/?id=b35798df2c1f6a05d8a3a28843c64c6da548f741&h=libreoffice-5-2 tdf#34212: Accented Characters and Umlauts are missing with Type1 fonts It will be available in 5.2.5. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Created attachment 129378 [details] new export with filled fields On pc Debian x86-64 with master sources updated today, I could reproduce the problem with fields.
cleanup whiteboard since the bug is still there.
IMO this is more of a WONTFIX, given that the Type1 format is obsolete and is no longer accepted in 5.3.x.
We dropped support for Type 1 fonts already.