Bug 96080 - Incorrect spacing between words when opening this .pdf file
Summary: Incorrect spacing between words when opening this .pdf file
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Draw
Version (earliest affected): 5.0.3.2 release
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:5.3.0
Keywords:
Duplicates: 60159
Depends on:
Blocks: PDF-Import-Draw
 
Reported: 2015-11-26 08:53 UTC by Frederic Parrenin
Modified: 2016-09-16 04:44 UTC

Attachments
.pdf file to reproduce the problem (91.72 KB, application/pdf)
2015-11-26 08:53 UTC, Frederic Parrenin
The 3rd page of the pdf file as it appears in Draw (228.93 KB, image/png)
2015-11-26 08:55 UTC, Frederic Parrenin
Example of control character embedding (4.49 KB, application/pdf)
2016-06-18 15:52 UTC, vvort

Description Frederic Parrenin 2015-11-26 08:53:14 UTC
Created attachment 120806 [details]
.pdf file to reproduce the problem

Steps to reproduce:
- open the attached .pdf file in your favorite PDF reader
- open the same file in Draw
=> compare the two renderings; in particular, the spacing between words is incorrect in Draw.
Comment 1 Frederic Parrenin 2015-11-26 08:55:56 UTC
Created attachment 120807 [details]
The 3rd page of the pdf file as it appears in Draw
Comment 2 vvort 2015-11-26 14:20:36 UTC
Each space between a first and last name in this PDF is encoded as a sequence of 4 Unicode characters:
U+0009: character tabulation
U+000D: carriage return
U+0020: space
U+00A0: no-break space
I don't know why all four appear together in one place, but this sequence is the source of the problem.
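
For illustration only, here is a minimal Python sketch of one way to confirm such a sequence once text has been extracted from a file. It is not taken from this bug or from LibreOffice. The helper name describe_chars, the CONTROL_NAMES table, and the sample string are assumptions made up for the example.

```python
import unicodedata

# Assumption: labels for the two control characters, which have no Unicode
# name property (unicodedata.name() has nothing to return for them).
CONTROL_NAMES = {0x09: "CHARACTER TABULATION", 0x0D: "CARRIAGE RETURN"}


def describe_chars(text):
    """Return a (code point label, name) pair for each character in text."""
    result = []
    for ch in text:
        cp = ord(ch)
        name = CONTROL_NAMES.get(cp) or unicodedata.name(ch, "UNKNOWN")
        result.append((f"U+{cp:04X}", name))
    return result


# Hypothetical sample: the four reported characters between two words.
sample = "First\t\r \u00a0Last"
for label, name in describe_chars(sample[5:9]):
    print(label, name)
```

Listing code points this way makes invisible characters such as the tab and carriage return explicit, which a visual comparison of the rendered pages cannot do.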
Comment 4 Heiko Tietze 2016-05-10 09:06:16 UTC
*** Bug 60159 has been marked as a duplicate of this bug. ***
Comment 5 vvort 2016-06-17 15:21:51 UTC
Tried to fix it here:
https://gerrit.libreoffice.org/#/c/26426/
Comment 6 vvort 2016-06-18 15:52:31 UTC
Created attachment 125720 [details]
Example of control character embedding
Comment 7 Commit Notification 2016-06-20 22:30:21 UTC
Vort committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b65f46127a9a1042edd4198f4a44820d7ea357a6

tdf#96080 PDF Import: fix incorrect whitespace characters sequence

It will be available in 5.3.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
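
The commit title above describes the fix as handling an incorrect sequence of whitespace characters. The patch itself is not reproduced in this report, so the following Python sketch is not the committed change. It only illustrates one possible normalization, under the assumption that a run made of the four characters listed in comment 2 can be treated as a single ordinary space. The function name collapse_whitespace_runs is hypothetical.

```python
import re

# Assumption: tab, carriage return, space and no-break space are treated as
# interchangeable here. Real text may need the no-break space preserved.
_WS_RUN = re.compile("[\t\r \u00a0]+")


def collapse_whitespace_runs(text):
    """Replace each run of tab/CR/space/no-break space with one space."""
    return _WS_RUN.sub(" ", text)


# Hypothetical sample mirroring the sequence reported in comment 2.
print(repr(collapse_whitespace_runs("First\t\r \u00a0Last")))
```

Collapsing the whole run at once, rather than mapping each character to its own space, avoids turning one intended gap into four.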
Comment 8 Xisco Faulí 2016-09-15 22:41:28 UTC
Hello,
Is this bug fixed?
If so, could you please close it as RESOLVED FIXED?