Created attachment 56825
The simplest pdf that produces the problem
Problem description: Text in a PDF is imported twice in the latest git version of LibreOffice 3.5. LibreOffice 3.4.4 does not have this problem, so this is a regression.
Steps to reproduce:
1. Open the attached PDF (test.pdf) with LibreOffice Draw (libreoffice-3-5).
Current behavior: The imported document shows the string "00".
Expected behavior: The imported document should show the string "0".
Platform (if different from the browser):
Browser: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0
Reproduced with LibreOffice 3.5.0rc3 7e68ba2-a744ebf-1f241b7-c506db1-7d53735 Ubuntu 10.04.3 x86 Linux 2.6.32-38-generic Russian UI
This seems to happen with all strings that have only one character. Strings with two or more characters import normally.
[REPRODUCIBLE] on 3.5.0 Beta 2 on Windows XP => changing the version field, because it should be the 'earliest' version in which the problem was found. If someone reproduces it with a lower version, please feel free to change it. See http://wiki.documentfoundation.org/BugReport_Details#Version
Also setting platform to All.
(In reply to comment #2)
> This seems to happen with all strings that have only one character. Strings
> with two or more characters import normally.
Maybe yes. I tested with "1" and "3", and each of them is also imported twice. But I have not tested with a non-digit character yet, nor with strings of more than one character.
Bibisecting shows that:
With source-hash-59cb0469897b1d2c57386510ad321a72e5477ad4 and *newer*, the bug is REPRODUCIBLE.
With source-hash-a0a1c3f4fb730ed3614593c3d8ddb50c23204c29 and *older*, I can't open the PDF at all. It shows the "ASCII Filter Options" dialog, and after I click through it, the file opens in Writer instead of Draw.
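For readers unfamiliar with bibisecting, the search described above is essentially a binary search over an ordered list of builds. Below is a minimal sketch of that idea only; it is not the bibisect tooling itself. The function name `first_bad`, the build labels, and the `is_bad` predicate are all hypothetical. It assumes the builds are ordered oldest to newest, that the newest build shows the problem, and that once a build shows the problem every newer build does too. Note that in this report the older builds are "not bad" only in the sense that they cannot import the PDF at all.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumptions (not taken from the report): builds are ordered oldest
    to newest, the newest build is bad, and badness never disappears
    again in newer builds.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # first bad build is at mid or earlier
        else:
            lo = mid + 1  # first bad build is after mid
    return builds[lo]


# Hypothetical example: eight builds, duplication visible from "b5" on.
example_builds = [f"b{i}" for i in range(8)]
result = first_bad(example_builds, lambda b: int(b[1:]) >= 5)
```

In practice the predicate would be a manual check: open test.pdf in the given build and see whether "00" appears instead of "0".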
Well, I found the commit: http://cgit.freedesktop.org/libreoffice/core/commit/?id=29db940ce504a5dff393927e4ea2680156f2b119
This commit enables the PDF import extension by default, so before it, PDF import was impossible because the extension was not built. (I checked this in bibisect's autogen.log.)
IIUC, the bug is therefore in the extension itself ...
All of Markus' and Korrawit's observations are reproducible. This bug affects me badly. It does not happen with all PDFs for me: normal text documents are generally imported without problems, but PDF exports from CAD programs are totally crippled after PDF import and completely unusable.
Older Master versions suffering from "Bug 44710 - FILEOPEN PDF: Rotated texts at wrong position and scrambled" (LibO 3.3.0?), such as the Server installation of Master "LibO-dev 3.5.0 – WIN7 Home Premium (64bit) English UI [(Build ID: 5d1a991-4cb1bac-ca7e6f5-9125509-ce71330)]" (2011-11-09), also showed those duplicated characters, see Bug 44710#c3.
The question is which strings exactly are affected. A common trait of texts in those CAD drawings is that character widths and similar properties are not normal. But I can't reproduce the problem with a sample of my own (typing an "O" into a text box in sample.odg, exporting to sample.pdf, and opening sample.pdf with Draw works fine).
@Markus Ilmola: What is special about the character in your sample that causes the problem?
Korrawit Pruegsanusak committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=bcb4defef7c9147a94ef19a51a18715449d3572d Fix fdo#45848
Korrawit Pruegsanusak committed a patch related to this issue. It has been pushed to "libreoffice-3-5": http://cgit.freedesktop.org/libreoffice/core/commit/?id=fc0c85e8628bf90afd4a47c20b3d1bc2a9c01b36&g=libreoffice-3-5 Fix fdo#45848 It will be available in LibreOffice 3.5.4.
Korrawit Pruegsanusak committed a patch related to this issue. It has been pushed to "libreoffice-3-5-3": http://cgit.freedesktop.org/libreoffice/core/commit/?id=5a39623867709b271db738ba259817eb5d6f1674&g=libreoffice-3-5-3 Fix fdo#45848 It will be available already in LibreOffice 3.5.3.
Verified the fix in LibO 3.5.3 official Windows XP. Thanks for the bug report.
Closing