Bug 86453 - PDF import: textbox layout should match reading order
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Hardware: All All
Importance: medium enhancement
Assignee: Not Assigned
Depends on:
Blocks: PDF-Import-Draw
Reported: 2014-11-19 04:54 UTC by Urmas
Modified: 2018-05-23 19:55 UTC (History)

See Also:
Crash report or crash signature:

PDF (107.69 KB, application/x-pdf)
2014-11-19 04:54 UTC, Urmas
Import result (28.15 KB, application/vnd.oasis.opendocument.graphics)
2014-11-19 04:55 UTC, Urmas

Description Urmas 2014-11-19 04:54:28 UTC
Created attachment 109712

Importing the attached PDF file creates the textboxes in the wrong order.

Expected: the textboxes are created in reading order.
Comment 1 Urmas 2014-11-19 04:55:21 UTC
Created attachment 109713
Import result
Comment 2 Joel Madero 2014-11-20 15:48:13 UTC
Really strange - but...confirmed.

Ubuntu 14.10 x64
LibreOffice release

To Reproduce:
1. Download "Import result"
2. Press Tab repeatedly

Observed: The text boxes are not selected in the order you'd expect; the selection jumps around the page.

Thanks Urmas!
Comment 3 vvort 2015-01-17 15:19:17 UTC
Open this file in Adobe Reader, move the text cursor to the start of the page, and start pressing the down arrow key: you will see the same strange jumps.
I doubt we need to fix a problem that even Adobe has not fixed.
Comment 4 Urmas 2015-01-17 19:09:14 UTC
Still, various PDF-to-text tools handle it well.
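
For illustration only, here is a minimal sketch of one common heuristic for putting text boxes into reading order: group boxes into lines by vertical position, then sort each line left to right. This is not what LibreOffice or any particular PDF-to-text tool is known to do. The `TextBox` type, the `reading_order` function, and the `tolerance` parameter are hypothetical names made up for this example, and the sketch assumes a single-column, left-to-right page.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    # Hypothetical box type for this sketch, not a LibreOffice structure.
    text: str
    x: float       # left edge
    y: float       # top edge, increasing downward
    height: float


def reading_order(boxes, tolerance=0.5):
    """Return boxes sorted top to bottom, then left to right within a line.

    Two boxes count as being on the same line when their top edges differ
    by at most `tolerance` times the height of the first box on that line.
    """
    lines = []
    for box in sorted(boxes, key=lambda b: b.y):
        if lines and abs(box.y - lines[-1][0].y) <= tolerance * lines[-1][0].height:
            lines[-1].append(box)   # close enough vertically: same line
        else:
            lines.append([box])     # start a new line
    # Flatten, ordering each line by its left edge.
    return [b for line in lines for b in sorted(line, key=lambda b: b.x)]


if __name__ == "__main__":
    page = [
        TextBox("world", 50, 10.2, 12),
        TextBox("Second", 0, 30, 12),
        TextBox("Hello", 0, 10, 12),
    ]
    print([b.text for b in reading_order(page)])
```

A heuristic like this breaks down on multi-column or rotated layouts, which would need column detection before the line grouping step.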