Bug 32249 - When importing a PDF with text in it, it would be better to have an easy and fluent option to edit the imported text
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version (earliest affected): 3.3.0 RC1
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 38084 84712 91896 93039 105274
Depends on:
Blocks: PDF-Import-Draw
 
Reported: 2010-12-08 22:51 UTC by grigoreflorin1985
Modified: 2017-01-13 14:05 UTC
CC List: 13 users

See Also:
Crash report or crash signature:


Description grigoreflorin1985 2010-12-08 22:51:29 UTC
When I import a PDF with text in it, I can only edit each paragraph separately, one at a time. What I need (and what I think all users would expect) is to edit all the paragraphs the way I normally edit an office document. Could there be an option to merge the paragraphs at import so they can be edited seamlessly and easily? That is, unify all paragraphs on a page so they are editable as one simple, continuous text. It is time-consuming to click and edit every paragraph one at a time. I tried "union" from the right-click menu on them, and I got a plain graphic that cannot be edited with the text tool (the big T icon).

I hope this small option will make it into the final release.
Comment 1 Rainer Bielefeld Retired 2010-12-09 08:58:10 UTC
I have also sometimes wished for such a feature, but I'm afraid it would cost too much manpower. Compared to other needs, it is definitely not more than Importance "Medium", and I doubt that it will ever be integrated.
Comment 2 Samuele Kaplun 2011-05-06 07:30:12 UTC
Hi,

I am a developer of digital library software and, aiming to support digital preservation, I was thinking of exploiting the wonderful PDF import filter of LibreOffice to archive an .odt document next to the original .pdf (as the .odt document should provide more value for future retrieval and use of the document).

Indeed, I also find this a very nice feature to have, and it should be possible to implement it via some heuristic, such as merging consecutive lines that are not too far from each other (e.g. no further apart than the height of a character).
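
As a rough illustration of the line-merging heuristic mentioned above, here is a minimal Python sketch. It is not LibreOffice code and does not reflect how the PDF import filter is structured; the `Line` type, its fields, and the `merge_lines` function are hypothetical names, and it assumes each extracted line carries a top coordinate (growing downward) and a character height.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    """One extracted text line (hypothetical structure, for illustration only)."""
    text: str
    top: float     # y coordinate of the top of the line, growing downward
    height: float  # character height of the line


def merge_lines(lines: List[Line]) -> List[str]:
    """Join consecutive lines into paragraphs.

    A line continues the current paragraph when the vertical gap to the
    bottom of the previous line is no larger than that line's character
    height; a larger gap starts a new paragraph.
    """
    paragraphs: List[str] = []
    current: List[str] = []
    prev = None
    for line in sorted(lines, key=lambda l: l.top):
        if prev is not None:
            gap = line.top - (prev.top + prev.height)
            if gap > prev.height:
                # Gap too large: close the current paragraph.
                paragraphs.append(" ".join(current))
                current = []
        current.append(line.text)
        prev = line
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

A real importer would likely also need to compare horizontal alignment, font, and column position before merging; this sketch only covers the vertical-distance rule.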

If no one has time to work on it, I'd be glad to give it a try in my spare time, if someone could be so kind as to point me to the most appropriate source code files that would need to be touched.

Cheers!
Comment 3 Rainer Bielefeld Retired 2011-05-06 08:17:14 UTC
@Samuele Kaplun:
That would be great. 

I'm afraid that won't be easy. I do not know how that works on other operating systems, but on Windows I have to install the "Oracle PDF Import Extension" from 
<http://extensions.services.openoffice.org/en/search/node/pdf import>, which itself, as far as I know, uses XPDF <http://foolabs.com/xpdf/about.html> as its text extractor.

That's all I can contribute.

BTW: Version is for the first version where the problem has been observed!
Comment 4 Samuele Kaplun 2011-05-06 08:34:08 UTC
(In reply to comment #3)
> @Samuele Kaplun:
> I'm afraid that won't be easy. I do not know how that works for other OS, but
> for WIN I have to install the "Oracle PDF Import Extension" from 
> <http://extensions.services.openoffice.org/en/search/node/pdf import>, what
> itself afaik uses XPDF <http://foolabs.com/xpdf/about.html> as text extractor.

From <http://www.libreoffice.org/features/extensions/> I understand that finally this extension is part of the core LibreOffice source tree. Is this so?

> That's all I can contribute.
> 
> BTW: Version is for the first version where the problem has been observed!

Sorry for this!! That makes perfect sense!
Comment 5 Rainer Bielefeld Retired 2011-05-06 10:50:21 UTC
> From <http://www.libreoffice.org/features/extensions/> I understand that
> finally this extension is part of the core LibreOffice source tree. Is this so?

I thought so too, but for my 3.4 installation I definitely had to download the extension. Please see 
<https://bugs.freedesktop.org/show_bug.cgi?id=35604#c6>
Comment 6 Björn Michaelsen 2011-12-23 11:33:26 UTC Comment hidden (obsolete)
Comment 7 Rainer Bielefeld Retired 2011-12-23 23:33:18 UTC
It was NEW for good reasons. But the question is whether there is a realistic chance of getting this enhancement.
Comment 8 vilpan 2013-05-01 18:41:14 UTC
*** Bug 38084 has been marked as a duplicate of this bug. ***
Comment 9 sophie 2014-10-09 11:21:36 UTC
*** Bug 84712 has been marked as a duplicate of this bug. ***
Comment 10 QA Administrators 2014-10-23 17:31:40 UTC Comment hidden (obsolete)
Comment 11 Gerry 2015-04-23 16:57:59 UTC
I can confirm this bug in LO 4.4.2.2 on Windows 7.
Comment 12 Jean-Baptiste Faure 2015-07-01 18:06:35 UTC
*** Bug 91896 has been marked as a duplicate of this bug. ***
Comment 13 Hendrik Maryns 2015-11-22 08:38:23 UTC
Is there no bug voting? This is a major turn-off!
Comment 14 m.a.riosv 2017-01-13 09:30:38 UTC
*** Bug 105274 has been marked as a duplicate of this bug. ***
Comment 15 m.a.riosv 2017-01-13 09:31:54 UTC
*** Bug 93039 has been marked as a duplicate of this bug. ***
Comment 16 V Stuart Foote 2017-01-13 14:05:25 UTC
LibreOffice provides functional filter import of PDF into Draw (the default Open action), and into Impress, Writer, or Draw by import filter selection.

With each filter, the rendering to the respective document canvas follows the structure of the document as recorded within the PDF, and text elements are rendered into styled Text boxes or Frames.

The PDF filter(s) do not "reflow" text into Paragraph objects. That would require a very complex treatment of the PDF structure to reliably extract syntax and layout, at the expense of fidelity in rendering the PDF document.

Replacing or supplementing the PDF filters to provide "reflow" back into paragraphs is seen as out of scope for the project, as we are not a PDF editor.

The core PDF filters and functions are sufficient for our needs of high rendering fidelity.

This is fertile ground for an extension.