Bug 47668 - No unicode characters in pdf form fields in Evince and some readers in Linux
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Printing and PDF export
Version: Inherited From OOo (earliest affected)
Hardware: Other Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Form-Controls
Reported: 2012-03-21 11:02 UTC by Vladimir
Modified: 2022-02-20 16:09 UTC

See Also:
Crash report or crash signature:
Regression By:


Attachments
pdf form made in LibreOffice (13.98 KB, application/pdf)
2012-03-21 11:02 UTC, Vladimir
source file for pdf form (10.60 KB, application/vnd.oasis.opendocument.text)
2012-03-21 11:02 UTC, Vladimir
pdf form export 4.0.3 (10.67 KB, application/pdf)
2013-06-25 12:37 UTC, Vladimir
source for pdf form in 4.0.3 (10.69 KB, application/vnd.oasis.opendocument.text)
2013-06-25 12:38 UTC, Vladimir
Pdf form generated from attachment 58825 with 4.1.5.3 (17.65 KB, application/pdf)
2014-10-15 12:44 UTC, JC Cardot

Description Vladimir 2012-03-21 11:02:24 UTC
Created attachment 58824
pdf form made in LibreOffice

PDF forms created with LibreOffice show only Latin characters in form fields.
Example files are in the attachments.

PDF forms from some other sources work correctly, e.g. this one: http://www.fms.gov.ru/documents/passport/pdf/anketa_new_14u.pdf

Viewed in evince, epdfview.
Comment 1 Vladimir 2012-03-21 11:02:58 UTC
Created attachment 58825
source file for pdf form
Comment 2 s-joyemusequna 2012-03-22 12:06:26 UTC
It works for me. Tested with LibO 3.4.5 on Windows Vista and with LibO 3.5.2 RC1 on Windows XP. PDF Viewer Acrobat Reader X.

If I save the "source file for pdf form" I have Russian characters in the form. I can copy Russian text from the clipboard into the form (I have no Russian keyboard).
Comment 3 Vladimir 2012-03-22 12:17:59 UTC
(In reply to comment #2)

Do you see Cyrillic characters rendered (when focus is not in the field)?
Comment 4 s-joyemusequna 2012-03-22 12:22:52 UTC
Yes.
Comment 5 Vladimir 2012-03-22 12:49:27 UTC
(In reply to comment #4)
> Yes.

Could you please check whether the Windows version of Evince shows it?


Linux versions of evince and epdfview do render cyrillic characters in the form you can get from the link supplied in description. Therefore they are capable of rendering non-latin characters, but fail on LO-created forms.
Comment 6 Vladimir 2012-03-22 13:03:09 UTC
Evince lists two fonts in LO-generated form:
Liberation Sans (embedded subset)
Liberation Sans (not embedded) < what does that mean?

If I delete the form field (leaving only the text object on the left) and export the test document as a PDF form, Evince lists only:
Liberation Sans (embedded subset)
Comment 7 s-joyemusequna 2012-03-23 04:02:33 UTC
evince-2.32.0.145 (Windows XP) : LibO 3.5.2 RC1 - doesn't work properly:

It displays no Russian characters, only the Latin "abcdefgh" (exactly like that, without the quotation marks). But when I click on the field, the Latin characters disappear and the expected Russian characters (the ones displayed on the left side) are displayed. Tested also with LibO 3.4.5 on Windows XP: the same error.

So it seems to be an evince bug.

Note: The font "Liberation Sans" is present on my PC (Russian characters included), so no substitution is necessary.

Embedded subset: the glyphs actually used are embedded in the PDF file, so the text can be displayed even if the font is not installed on your PC. It seems that the font for the Russian text on the left side is embedded, while the font for the edit box is not embedded, because you must be able to type into it (i.e. the font has to be on the PC).
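As background on the "embedded subset" label: in PDF, the name of a subsetted font carries a tag of six uppercase letters followed by a plus sign, for example `ABCDEF+LiberationSans`. The sketch below only checks that naming convention. The helper name `split_subset_tag` and the sample font names are made up for illustration, and the code does not inspect whether a font program is actually embedded in a file.

```python
import re

# A subset tag is exactly six uppercase ASCII letters followed by "+".
_SUBSET_TAG = re.compile(r"^([A-Z]{6})\+(.+)$")

def split_subset_tag(base_font):
    """Split a PDF font name into (subset_tag, plain_name).

    Returns (None, base_font) when the name carries no subset tag.
    """
    match = _SUBSET_TAG.match(base_font)
    if match:
        return match.group(1), match.group(2)
    return None, base_font

# Hypothetical names, similar to what a font listing might report.
print(split_subset_tag("ABCDEF+LiberationSans"))  # ('ABCDEF', 'LiberationSans')
print(split_subset_tag("LiberationSans"))         # (None, 'LiberationSans')
```

A name without the tag says nothing certain about embedding by itself; it only shows that the font was not written out as a subset under that name.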
Comment 8 Vladimir 2012-03-23 04:36:06 UTC
(In reply to comment #7)

It is an evince bug too (failure to use system font in forms), but...

If you download pdf file from fms.gov.ru (see description), you will find that evince can display non-latin characters in the fields.
That form uses Times New Roman Cyr font which I do not have on my system. But evince shows it in fields, because it is embedded in the form.

This leads to two questions:
Why does LO apply different font sets for text and fields when both use the same Liberation Sans?
Why does LO embed the font for content but not for fields?
Comment 9 sasha.libreoffice 2012-06-14 04:20:06 UTC
Another pdf font problem: Bug 50879
Comment 10 Thomas Hackert 2013-06-24 16:27:32 UTC
Hello Vladimir, *,
would you be so kind as to explain to us how you created the PDF file? If I use your source file, go to "File - Export to PDF..." with "Create PDF Form" enabled on the right side below "General" in the "PDF Options" window, and just click on "Export", I am able to see the Cyrillic characters in Okular version 0.14.3 from KDE 4.8.4 under Debian Testing AMD64, after I have clicked on "Show forms" there. Did you use a different way to find your bug?

I have tested it with LO Version: 4.1.0.1 Build ID: 1b3956717a60d6ac35b133d7b0a0f5eb55e9155 under Debian Testing AMD64, so I would ask you to test with a version of LO newer than 3.5.1 to see whether something has changed/improved. It would be nice if you could change the status of the bug if it is solved ... ;)

And did you use any other PDF viewer to test the exported PDF, or only Evince? If it is only an Evince problem (as indicated in comment 8), I think it is not our bug, but an Evince one ... ;)
Sorry for the inconvenience
Thomas.
Comment 11 خالد حسني 2013-06-25 11:50:10 UTC
I can reproduce this on master with Evince but not with Adobe Reader. Evince writes warnings to the terminal like "warning: layoutText: cannot convert U+0430". The same happens with the fms.gov.ru PDF, and Adobe Reader can’t even open that file. I’m inclined to think it is an Evince issue.
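For reference, the code point in that warning, U+0430, is CYRILLIC SMALL LETTER A. As a rough illustration only, the sketch below assumes (nothing in this report confirms it) that the field text is laid out through a Latin-only 8-bit encoding such as Windows-1252, and lists which characters of a value would fail such a conversion. The function name `unconvertible` and the choice of `cp1252` are hypothetical.

```python
import unicodedata

def unconvertible(text, encoding="cp1252"):
    """Return (code point, Unicode name) pairs for characters `encoding` cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Same "U+XXXX" notation as in the viewer warning.
            missing.append((f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return missing

# "\u0430" is the Cyrillic letter from the warning; the Latin letters pass.
print(unconvertible("abc \u0430"))  # [('U+0430', 'CYRILLIC SMALL LETTER A')]
print(unconvertible("abcdefgh"))    # []
```

Under that assumption, a purely Latin value such as "abcdefgh" converts cleanly while any Cyrillic letter does not, which would match the symptom of only Latin characters being shown.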
Comment 12 Vladimir 2013-06-25 12:36:51 UTC
I used "Export as PDF" function in File menu.
I've retested it with version 4.0.3, viewed with Evince and Adobe Reader.
There are no Cyrillic symbols in the form in Evince.
Adobe Reader gives a funny result: it shows only the Cyrillic characters; everything else appears as password-style dots. It also reports that the Liberation Sans font is missing from the OS, although font embedding was enabled during export.
Comment 13 Vladimir 2013-06-25 12:37:52 UTC
Created attachment 81409
pdf form export 4.0.3
Comment 14 Vladimir 2013-06-25 12:38:28 UTC
Created attachment 81410
source for pdf form in 4.0.3
Comment 15 sasha.libreoffice 2013-06-26 06:40:42 UTC
Thanks for the additional information.
Reproduced using the 4th attachment, test.odt.
In the "PDF Options" dialog, "PDF/A-1a" was disabled and "Create PDF form" was enabled.
When opening the produced PDF, Okular asks "Show forms?". If we click that button, the Cyrillic characters are shown; otherwise they are not.
Evince has no such option, and the Cyrillic characters are not shown.

It looks like the problem is in creating form-mode PDFs. Ordinary PDFs work correctly.
Comment 16 Vladimir 2013-06-26 07:02:48 UTC
Well, yes. The test file contains a simple text field and a form text field. Both contain the same string. The simple text field is shown correctly; the form field is not.
Comment 17 JC Cardot 2014-10-15 12:44:41 UTC
Created attachment 107869
Pdf form generated from attachment 58825 with 4.1.5.3

I just generated a pdf form using the source in attachment 58825 and the resulting pdf form seems correct. No more "password dots", no more complaints about the missing font. The Cyrillic characters appear in the field.
Comment 18 Vladimir 2014-10-15 12:57:45 UTC
I've checked your pdf in evince, qpdfview and also xournal. Cyrillic characters are not shown in any of these applications.
Comment 19 QA Administrators 2015-12-20 16:10:35 UTC Comment hidden (obsolete)
Comment 20 Vladimir 2015-12-20 19:52:01 UTC
Just created a form in LO 5.0.4~rc2 (current version in Debian Testing)
Evince and Qpdfview failed to display Cyrillic characters in form fields.
Comment 21 QA Administrators 2017-01-03 19:57:26 UTC Comment hidden (obsolete)
Comment 22 Murz 2017-02-02 08:06:35 UTC
I retested this bug; it is still present in LibreOffice 5.2.5.1. When I export a file to PDF with forms, the form values display only Latin letters; Cyrillic and other non-Latin letters are missing and are displayed as empty space.
Comment 23 QA Administrators 2018-02-03 03:32:31 UTC Comment hidden (obsolete)
Comment 24 Thomas Lendo 2018-10-10 20:51:59 UTC
Still reproducible.

Version: 6.2.0.0.alpha0+
Build ID: 425af6845ebe066c950b0b63f50563e067485f3e
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3;
Locale: de-DE (de_DE.UTF-8); Calc: threaded
Comment 25 QA Administrators 2019-10-11 02:37:50 UTC Comment hidden (obsolete)
Comment 26 Timur 2020-02-18 16:45:41 UTC
No repro with 6.2.8 and 7.0+ on Windows with Adobe and PDF-XChange.
I amend the title to say it's just Evince and some readers in Linux.
Would be nice to retest with master LO 7.0+.
Comment 27 QA Administrators 2022-02-18 03:42:19 UTC Comment hidden (obsolete)
Comment 28 me 2022-02-20 16:08:25 UTC
I confirm this bug. I downloaded attachment 58825, used “Export as PDF…” in LibreOffice, and opened the resulting PDF file in Evince. Cyrillic characters are not visible in the form field. If I click on the form field to edit the text, they are visible.