Bug 31606 - Exporting a selection of pages to PDF has pagerange wrong because of blank pages
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: high normal
Assignee: Not Assigned
URL: http://www.bielefeldundbuss.de/LibO/3...
Whiteboard:
Keywords: filter:pdf
Duplicates: 47678
Depends on:
Blocks: PDF-Export
Reported: 2010-11-13 11:13 UTC by Rene Engelhard
Modified: 2018-11-29 09:35 UTC (History)
CC List: 12 users

See Also:
Crash report or crash signature:


Attachments
Export result, pages 29 to 38 (467.42 KB, application/pdf)
2011-01-13 18:44 UTC, Chalo Alvarez J
Export result: page 29 (101.90 KB, application/pdf)
2011-01-13 18:45 UTC, Chalo Alvarez J
Export result: selection in p29 (96.66 KB, application/pdf)
2011-01-13 18:45 UTC, Chalo Alvarez J
See Comment 13! (162.41 KB, application/pdf)
2011-02-10 00:25 UTC, Rainer Bielefeld Retired
The original test document (632.70 KB, application/vnd.oasis.opendocument.text)
2015-09-24 16:57 UTC, Alexandr

Description Rene Engelhard 2010-11-13 11:13:08 UTC
From http://bugs.debian.org/603402:

--- snip ---
Steps to reproduce

1. Take a very long document (100+ pages)
2. Export a specific subset of pages in PDF
3. Check whether the exported pages are the correct ones

LibreOffice exports the wrong page range (say you asked for pages 355-360, you may end up getting 358-363).

I classify the severity as grave because, if you do not check the PDF before sending it, you could end up sending confidential data to people who should not see it. That is indeed what happened to me when I found this bug.

Here is the URI of a file which is affected by this issue: http://www.pitonyak.org/AndrewMacro.odt

If you ask to export pages 355-360, you end up getting 358-363
--- snip ---
Comment 1 Rene Engelhard 2010-11-13 11:44:13 UTC
submitter says that it works in OpenOffice.org 3.2.1
Comment 2 Thorsten Behrens (CIB) 2010-11-18 16:32:12 UTC
Urk. Lubos, any chance you could look into this? AndrewMacro is btw notorious for breaking OOo/LibO. ;)
Comment 3 Luboš Luňák 2010-11-20 07:11:41 UTC
3.2.1 is just as broken here.
Comment 4 Rainer Bielefeld Retired 2010-11-20 09:18:32 UTC
[Reproducible] with "LibreOffice 3.3.0Beta3 - WIN XP DE [OOO330m9 (build 3.2.99.2)]" and my own master documents with approx. 90 pages.

It's not simply reproducible with a 50-page text document with a word art in it, but it is with documents containing complex layout.
I tried the attached approx. 850-page "Lorem Ipsum" document with some fontwork and was able to reproduce the problem (or something similar?).

When I open the document, I see on page 830 a heading in the middle of the page announcing that this would be page 830. When I try to export this page I get the attached sample2.pdf, which shows another page.
During / after export I observed some "Neuformatierung ..." ("Reformatting ...") in the status line, and after that my page with the heading had page number 833. This difference might have caused the export of the wrong page.
Comment 5 Rainer Bielefeld Retired 2010-11-20 09:59:47 UTC
Please see the test kit under the above-mentioned URL!
The heading should be on page 830, but exporting page 830 will show another page and will "shift" the heading to another page number (833)
Comment 6 Rainer Bielefeld Retired 2010-11-21 21:50:25 UTC
I believe my test kit shows some other problem than the reported one.
Currently PDF export from my master documents is completely unusable because of that wrong page export problem. My documents have approx. 50 pages and the effect differs from the one in the comments from Rainer Bielefeld 2010-11-20: page numbers are not modified after export.
I will try to create a sample document.
Comment 7 Chalo Alvarez J 2011-01-03 17:48:37 UTC
I can confirm that this issue remains in LibO RC2 for Windows.
Comment 8 Chalo Alvarez J 2011-01-13 18:44:32 UTC
Created attachment 41990 [details]
Export result, pages 29 to 38
Comment 9 Chalo Alvarez J 2011-01-13 18:45:04 UTC
Created attachment 41991 [details]
Export result: page 29
Comment 10 Chalo Alvarez J 2011-01-13 18:45:43 UTC
Created attachment 41992 [details]
Export result: selection in p29
Comment 11 Chalo Alvarez J 2011-01-13 18:45:58 UTC
I tried this on the latest RC3 for Windows and the results are as follows:

1) Exported pages 29 to 38
   => Got pages 30 to 39, with bookmarks from pages 29 to 38
2) Exported just page 29
   => Got page 30, with bookmarks of page 29
3) Exported just a selection of page 29
   => Got the right selection exported, but the numbered list was reset to 1. Bookmarks were also reset but otherwise were OK.

So there seems to be an offset somewhere in the page rendering that does not occur when generating the bookmarks.
Comment 12 Chalo Alvarez J 2011-01-17 17:27:00 UTC
I think that fixing this bug is of high importance.
Comment 13 Rainer Bielefeld Retired 2011-02-10 00:22:46 UTC
I am able to reproduce Chalo Alvarez J's result with attachment for 
Bug 34093 - Partial PDFEXPORT of particular Master Documents breaks hyperlinks

Steps to reproduce:
0. Unzip test kit
1. Open "MasterSampleAE.odm"
2. Go to range "_fehlersuche_Saia_WEB" (= physical pages 5 ... 14)
3. Menu 'File > Print'
   Print dialog opens
4. Insert page number 5 in the page selection pane below the preview
  expected: start page of range should be shown
  actual: second page of range is shown; to get the start page, you have to
          select page 4.
          Print or export tests confirm that you have to select pages 4 ... 13
          to get output of pages 5 ... 14

I will attach a screenshot with comments.

It's the same problem for printing and export to PDF

And yes, that problem has to be solved quickly; it's a regression relative to OOo 3.1.1, but I see the same problem with OOo 3.4-dev. Because OOo versions later than 3.1.1 were not suitable for my daily use, I can't tell with which version the problem started.

And yes, especially for EXPORT (or fax printing) where you do not always see the result before the recipient will have seen it, this problem is critical.
Comment 14 Rainer Bielefeld Retired 2011-02-10 00:25:57 UTC
Created attachment 43188 [details]
See Comment 13!
Comment 15 Rainer Bielefeld Retired 2011-04-14 05:34:08 UTC
Modified expected Version due to date of report.
Modified Status due to facts
Comment 16 Peter Jentsch 2011-06-06 10:07:24 UTC
Hi Lubos, I'd like to try to work on this bug; do you already have any hints? I'm afraid the cause is not directly in pdfwriter.cxx but in the writer selection code, is that correct?

Cheers, 
Peter
Comment 17 Luboš Luňák 2011-07-04 08:26:13 UTC
I'm afraid the cause is the code as a whole, because the code related to this seems to be an awful mess. It's so bad that I don't even remember much anymore, but the problem was simply that there are several places where the page range is handled and they seem to be not quite related to each other. The basic problem causing this bug is that removing empty pages means adjusting the pages to print, and each of these places handles it differently (or not at all).

For starters you could go to http://opengrok.libreoffice.org and search for "PageRange" and "aPageRange"; that shows some of those places. Not all of them, and not all of those shown are relevant here. From what I remember:
- sw/source/ui/uno/unotxdoc.cxx - notice that it gets the page range from the PDF dialog; IIRC this place is not directly relevant to this bug, but you can see that the SwEnhancedPDFExportHelper usage has bIsSkipEmptyPages passed and SwEnhancedPDFExportHelper::CalcOutputPageNum() adjusts pages according to this
- filter/source/pdf/pdfexport.cxx - this converts the page range to a selection of pages to print and IIRC in git log I could find that an older version of the code did some adjustments there
- sw/source/core/doc/doc.cxx - there is some page range handling done too, but IIRC when I tried a trivial fix there, code that was called afterwards still expected the unadjusted range and broke

You'll probably also need to spend some time tracking what calls what and how it handled the page range, either in a debugger or with debug output, and the sequence of calls was IIRC nowhere near trivial.
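
The failure mode described above for sw/source/core/doc/doc.cxx, where a range is adjusted in one place and later code still expects the unadjusted range, can be illustrated with a minimal Python sketch. This is not LibreOffice code: `position_without_blanks`, `early_stage` and `late_stage` are hypothetical names, and the set of blank pages is made up for the example.

```python
def position_without_blanks(page, blank_pages):
    """Convert a document page number to its position once blank pages
    before it have been removed."""
    return page - sum(1 for b in blank_pages if b < page)


def early_stage(page, blank_pages):
    # Hypothetical early stage: already adjusts for skipped blank pages.
    return position_without_blanks(page, blank_pages)


def late_stage(page, blank_pages):
    # Hypothetical later stage: assumes it still receives a document page
    # number and applies the same adjustment a second time.
    return position_without_blanks(page, blank_pages)


blank = {3, 7, 12}      # assumed blank pages, for illustration only
requested = 30

once = early_stage(requested, blank)   # adjusted once
twice = late_stage(once, blank)        # adjusted again by mistake

print(once, twice)  # 27 24
```

Doing the conversion in exactly one place, and making it explicit whether a number has already been adjusted, would be one way to avoid this kind of drift.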

I'm sorry I cannot provide more information. This will possibly require finding out how the whole page range thing is handled and then maybe the simplest fix will be rewriting it to be more sensible. If you need more help, I suggest asking on the libreoffice-devel mailing list, as one of the developers who have spent more time with this codebase probably knows better how the print handling works.
Comment 18 Nicolas Degand 2011-08-08 14:31:31 UTC
I have found a workaround to get the correct behaviour. In the "PDF Options" dialog box, you need to tick the "Export automatically inserted blank pages" option. This should be turned on by default.

The behaviour if you do not select this option is however still erratic. It should remove the blank pages, but not shift the page range in unpredictable ways.
Comment 19 Michael Meeks 2011-09-05 09:26:13 UTC
thoughts on changing the default here appreciated. Ultimately, the whole 'Page' thing is a nightmare - people would presumably assume that page numbers could easily mean the numbers printed on each page - but that too is not necessarily so ;-)
Comment 20 Christoph 2011-09-06 23:04:57 UTC
Michael asked on libreoffice-ux-advise and I provided a more general answer on page numbers and their use for print and export (I hope I got the problem right). In a nutshell: changing the "empty pages" export option will cause other problems.

http://lists.freedesktop.org/archives/libreoffice-ux-advise/2011-September/000266.html
Comment 21 Christoph 2011-09-06 23:37:35 UTC
(In reply to comment #18)
> I have found a workaround to get the correct behaviour. In the "PDF Options"
> dialog box, you need to tick the "Export automatically inserted blank pages".
> This should be turned on by default.
> 
> The behaviour if you do not select this option is however still erratic. It
> should remove the blank pages, but not shift the page range in unpredictable
> ways.

Interesting. The PDF export dialog and print dialog differ in terms of page number handling. The print dialog is based on document page numbers (and ignores empty pages), whereas the PDF export dialog first removes the empty pages and recalculates the page range to be selected from.

At least this should be harmonized to work like printing.
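
The difference between the two dialogs, as described in this comment, can be sketched in Python. This is only an illustration under assumed data (20 pages, with pages 4 and 8 being automatically inserted blank pages); the function names are hypothetical and do not correspond to LibreOffice code.

```python
def select_then_skip(pages, blank, first, last):
    """Print-dialog style: the range refers to document page numbers;
    blank pages inside the range are simply left out."""
    return [p for p in pages if first <= p <= last and p not in blank]


def skip_then_select(pages, blank, first, last):
    """PDF-export style as described: blank pages are removed first and
    the range is then applied to the renumbered sequence."""
    remaining = [p for p in pages if p not in blank]
    return remaining[first - 1:last]


pages = list(range(1, 21))   # assumed document with 20 physical pages
blank = {4, 8}               # assumed automatically inserted blank pages

print(select_then_skip(pages, blank, 10, 12))  # [10, 11, 12]
print(skip_then_select(pages, blank, 10, 12))  # [12, 13, 14]
```

With two blank pages before the requested range, the second interpretation returns pages shifted by two, which is the kind of offset reported in this bug.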
Comment 22 tommy27 2011-11-18 23:59:54 UTC
this is LibO 3.4.x's oldest and most annoying bug.
I wonder if there has been any progress fixing it.
Comment 23 Ivan Timofeev (retired) 2012-03-12 09:06:08 UTC
So... Summing this up.

1. PDF export is not buggy - it deliberately has a different default setting from printing: "Export automatically inserted blank pages" is unchecked. If you uncheck the respective "Print automatically inserted blank pages" in the print dialog - you will get the same confusing result.

2. The page counting is confusing - it does not count "automatically inserted blank pages" (for example, AndrewMacro: page 24).

What to do:

* Change the page counting logic: count blank pages in any case.
  Requirements:
   * Handle these blank pages in the print dialog.
   * Provide a preview for the PDF export dialog.
      -> Reuse the print dialog. But:
          - Fit all the PDF option set (a lot of optional tab pages?).
          - No printer selection.
          - No "Number of copies"
          - No "page layout" tab.
         [- And no knowledge of the code around.]

This definitely does not fit into a bugfix release. :( 
  - A task for 3.6? 
  - Skip my "requirements" for now?

--
* PDF export dialog: filter/source/pdf/impdialog.cxx
* Print dialog: vcl/source/window/printdlg.cxx
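
The proposal above to count blank pages in any case can be illustrated with a small Python sketch of the two numbering schemes. It assumes a made-up five-page layout whose third page is an automatically inserted blank page; `number_pages` is a hypothetical helper, not a LibreOffice function.

```python
def number_pages(is_blank, count_blank):
    """Return one display number per physical page, or None for a blank
    page that is left out of the count."""
    numbers = []
    counter = 0
    for blank in is_blank:
        if blank and not count_blank:
            numbers.append(None)   # blank page gets no number
            continue
        counter += 1
        numbers.append(counter)
    return numbers


layout = [False, False, True, False, False]   # assumed: third page is blank

print(number_pages(layout, count_blank=True))   # [1, 2, 3, 4, 5]
print(number_pages(layout, count_blank=False))  # [1, 2, None, 3, 4]
```

When blank pages are always counted, every page after a blank page keeps the same number regardless of the export option, so a range typed by the user refers to the same pages in both cases.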
Comment 24 Florian Reisinger 2012-03-25 07:17:02 UTC
Is there someone who can test it?

Thanks
Comment 25 Rainer Bielefeld Retired 2012-03-27 04:11:59 UTC
(In reply to comment #23)
> So... Summing this up.

We will limit this bug to the summarized problem. For anything else not related to the "blank pages" thingy I will create different reports and obsolete attachments here (for example my "See Comment 13!").
Comment 26 Rainer Bielefeld Retired 2012-04-08 04:52:14 UTC
*** Bug 47678 has been marked as a duplicate of this bug. ***
Comment 27 Rainer Bielefeld Retired 2012-04-10 06:13:41 UTC
NOT reproducible with "LibreOffice 3.5.2.2 German UI/Locale [Build-ID: 281b639-6baa1d3-ef66a77-d866f25-f36d45f]" on German WIN7 Home Premium (64bit) and the Debian sample from <http://www.pitonyak.org/AndrewMacro.odt>

I opened the document and printed the first Index page (for me page number 495) as sheet 513 (shown in the status bar) to FreePDF, once with blank pages and once without blank pages. Both times the correct contents were shown in result.pdf

We had so many theories and summary modifications here. Maybe it would be best to close this one and open new bugs with clearly limited scope for related problems still existing with 3.5.2.

3.4 lifecycle is terminated

@All:
What do you think about closing this Bug as suggested?
Comment 28 tommy27 2012-04-10 11:52:39 UTC
if it's not reproducible with 3.5.2 it should be closed
Comment 29 sasha.libreoffice 2012-08-21 10:54:00 UTC
Thanks for the additional testing.
Due to the last comment, changing status to WorksForMe.

If the problem appears again, please change the status to Reopened
Comment 30 Rainer Bielefeld Retired 2012-08-21 14:48:15 UTC
But it still does not really work for me: I still have problems with PDF export and have to select pages other than those shown in the print preview. But that seems to have other reasons; I will have to submit a different bug for that.
Comment 31 Alexandr 2015-09-24 16:57:11 UTC
Created attachment 118999 [details]
The original test document

I attached the test document from the description in case it becomes unavailable from the external link.
Comment 32 Alexandr 2015-09-24 17:05:24 UTC
I can reproduce this issue with LibreOffice 5.0.1 from Debian. I am sure that it is a bug because current behaviour is unexpected.

Anyway, the importance of this bug is not highly critical.
Comment 33 Buovjaga 2015-09-28 08:47:44 UTC
NEW per comment 32.
Comment 34 Xisco Faulí 2016-09-10 16:10:45 UTC
Hi Lubos,
I'm setting this ticket back to NEW as it has been inactive for more than 3
months.
Feel free to assign it back to you if you're still working on this.
Regards
Comment 35 David 2016-09-10 20:31:21 UTC
I would guess that bug 95658 is related to this bug.  The whole problem seems to be that LO wants to keep re-formatting every time something is done.  If a page update is done by doing a Tools | Update | Page formatting, then why does it need to reformat, and not only just re-format, but change the previous format and therefore the page numbers every time a document is printed or exported to PDF?  You should be able to do a page format and be able to depend on the results of that for what it will look like when you print or export to PDF, but you can't.
Comment 36 QA Administrators 2018-09-09 02:39:23 UTC Comment hidden (obsolete)
Comment 37 Nicolas Degand 2018-09-11 20:57:18 UTC
Still present in LibreOffice 6.1.0.3 (W10/x64)
Comment 38 YaWen 2018-11-29 09:35:04 UTC
Still exists in version LibreOffice 6.3.0.0.alpha0+(x64)