Bug 45848 - FILEOPEN particular PDF: Characters duplicated
Summary: FILEOPEN particular PDF: Characters duplicated
Status: CLOSED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Draw
Version (earliest affected): 3.5.0 Beta2
Hardware: All All
Importance: medium major
Assignee: Korrawit Pruegsanusak
URL:
Whiteboard: BSA bibisected35 target:3.6.0 target:...
Keywords: regression
Depends on:
Blocks:
 
Reported: 2012-02-09 09:28 UTC by Markus Ilmola
Modified: 2012-05-11 21:29 UTC

See Also:
Crash report or crash signature:


Attachments
The simplest pdf that produces the problem (1.02 KB, application/pdf)
2012-02-09 09:28 UTC, Markus Ilmola

Description Markus Ilmola 2012-02-09 09:28:58 UTC
Created attachment 56825
The simplest pdf that produces the problem

Problem description: Text in a PDF is imported twice in the latest git version of LibreOffice 3.5. LibreOffice 3.4.4 does not have this problem, so this is a regression.

Steps to reproduce:
1. Open the attached PDF (test.pdf) with LibreOffice Draw (libreoffice-3-5).

Current behavior: The imported PDF shows the string "00".

Expected behavior: The imported PDF should show the string "0".

Platform (if different from the browser): 
              
Browser: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0
Comment 1 tester8 2012-02-14 02:18:04 UTC
Reproduced with

LibreOffice 3.5.0rc3
7e68ba2-a744ebf-1f241b7-c506db1-7d53735
Ubuntu 10.04.3 x86
Linux 2.6.32-38-generic Russian UI
Comment 2 Markus Ilmola 2012-02-21 12:56:36 UTC
This seems to happen with all strings that have only one character. Strings with two or more characters import normally.
Comment 3 Korrawit Pruegsanusak 2012-03-04 06:16:45 UTC
[REPRODUCIBLE] on 3.5.0 Beta 2 on Windows XP => changing the Version field, because it should be the 'earliest' version in which the problem was found.

If someone reproduces it with an earlier version, please feel free to change it.
See http://wiki.documentfoundation.org/BugReport_Details#Version

Also set platform to All.

(In reply to comment #2)
> This seems to happen with all strings that have only one character. Strings
> with two or more characters import normally.

Maybe, yes. I tested with "1" and "3", and each was also imported twice. But I haven't tested with a non-numeric character yet, nor with strings of more than one character.
Comment 4 Korrawit Pruegsanusak 2012-03-04 06:26:29 UTC
Bibisecting shows that:

With source-hash-59cb0469897b1d2c57386510ad321a72e5477ad4 and *newer*: REPRODUCIBLE.

With source-hash-a0a1c3f4fb730ed3614593c3d8ddb50c23204c29 and *older*, I can't open the PDF at all. It shows the "ASCII Filter Options" dialog, and if I confirm it, the file opens in Writer instead of Draw.
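For readers unfamiliar with the approach, the search narrows an ordered list of builds down to the first one that shows the bug. The following is only a minimal sketch of that idea in Python, not the actual bibisect tooling; `first_bad` and `is_bad` are hypothetical names, and it assumes that builds are ordered oldest to newest, that the newest build is bad, and that every build after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumptions (hypothetical helper, not part of bibisect):
    builds are ordered oldest to newest, the last build is bad,
    and once a build is bad every newer build is bad as well.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid          # first bad build is at mid or earlier
        else:
            lo = mid + 1      # first bad build is after mid
    return builds[lo]
```

In this bug, builds that could not import the PDF at all would count as "not bad" for the purpose of the search, which is why the result points at the commit that enabled the import rather than at the faulty code itself.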
Comment 5 Korrawit Pruegsanusak 2012-03-04 06:38:05 UTC
Well, found the commit: http://cgit.freedesktop.org/libreoffice/core/commit/?id=29db940ce504a5dff393927e4ea2680156f2b119

This commit enables the PDF import extension by default, so before it PDF import was not possible because the extension wasn't built. I checked this in bibisect's autogen.log.

IIUC, now the bug is in the extension ...
Comment 6 Rainer Bielefeld Retired 2012-04-01 22:53:40 UTC
All of Markus' and Korrawit's observations are reproducible. I suffer heavily from this bug.

This does not happen with all PDFs for me. Normal text documents are generally imported without problems, but PDF exports from CAD programs are so badly mangled by this problem after PDF import that they are completely unusable.

Older Master versions, such as the server installation of Master "LibO-dev 3.5.0 – WIN7 Home Premium (64bit) English UI [(Build ID:  5d1a991-4cb1bac-ca7e6f5-9125509-ce71330)]" (2011-11-09), which suffers from "Bug 44710 - FILEOPEN PDF: Rotated texts at wrong position and scrambled" (LibO 3.3.0?), also showed these duplicated characters; see Bug 44710#c3.

The question is which strings exactly are affected. A common trait of the texts in those CAD drawings is that character widths and similar properties are unusual. But I can't reproduce the problem with a sample of my own (typing an "O" into a text box in sample.odg, exporting to sample.pdf, and opening sample.pdf with Draw works fine).

@Markus Ilmola:
What is special about the character in your sample that causes the problem?
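One way to probe this would be to generate test PDFs by hand rather than exporting them from Draw. The sketch below is only an illustration under assumptions: it takes the hypothesis from comment 2 (a text string with exactly one character) at face value, it does not claim to match the structure of the attached test.pdf, and `build_single_char_pdf` is a hypothetical helper name. It writes a one-page PDF that shows the given text once with the standard, non-embedded Helvetica font.

```python
def build_single_char_pdf(text: str = "0") -> bytes:
    """Build a one-page PDF whose content stream shows `text` exactly once."""
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    stream = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj" for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # mandatory free entry for object 0
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


if __name__ == "__main__":
    # Write one file with a single character and one with two, for comparison.
    for sample in ("0", "00"):
        with open(f"sample_{len(sample)}.pdf", "wb") as handle:
            handle.write(build_single_char_pdf(sample))
```

Opening both generated files in Draw and comparing the imported text would show whether string length alone is enough to trigger the duplication, or whether something else in the CAD exports is involved.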
Comment 7 Not Assigned 2012-04-21 12:54:47 UTC
Korrawit Pruegsanusak committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=bcb4defef7c9147a94ef19a51a18715449d3572d

Fix fdo#45848
Comment 8 Not Assigned 2012-04-21 13:07:15 UTC
Korrawit Pruegsanusak committed a patch related to this issue.
It has been pushed to "libreoffice-3-5":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=fc0c85e8628bf90afd4a47c20b3d1bc2a9c01b36&g=libreoffice-3-5

Fix fdo#45848


It will be available in LibreOffice 3.5.4.
Comment 9 Not Assigned 2012-04-23 03:09:48 UTC
Korrawit Pruegsanusak committed a patch related to this issue.
It has been pushed to "libreoffice-3-5-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=5a39623867709b271db738ba259817eb5d6f1674&g=libreoffice-3-5-3

Fix fdo#45848


It will be available already in LibreOffice 3.5.3.
Comment 10 Korrawit Pruegsanusak 2012-05-11 21:29:18 UTC
Verified the fix in the official LibO 3.5.3 build on Windows XP. Thanks for the bug report.
Comment 11 Korrawit Pruegsanusak 2012-05-11 21:29:48 UTC
Closing