Bug 106237 - Exporting document with large table to PDF from command line causes table to be truncated
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: Inherited From OOo (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Michael Stahl
QA Contact:
URL:
Whiteboard: target:5.4.0 target:5.3.3
Keywords: bibisected, bisected, filter:pdf
Depends on:
Blocks:
 
Reported: 2017-02-28 16:00 UTC by Matthew Kogan
Modified: 2017-04-05 14:59 UTC

See Also:
Crash report or crash signature:


Attachments
Schedule of rent arrears.odt (16.54 KB, application/vnd.oasis.opendocument.text)
2017-02-28 16:00 UTC, Matthew Kogan
my output (correct) (40.08 KB, application/pdf)
2017-03-01 10:18 UTC, Xisco Faulí
Exported with 5.3.0.3.pdf (39.01 KB, application/pdf)
2017-03-01 13:50 UTC, Matthew Kogan

Description Matthew Kogan 2017-02-28 16:00:05 UTC
Description:
A document with a large table (5 columns and 64 rows) is truncated when I export it to PDF from the command line with --convert-to.

Steps to Reproduce:
From the command line run
soffice.exe --convert-to pdf "Schedule of rent arrears.odt"
on the attached document.
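
For scripted reproduction, the step above can be wrapped in a small helper. This is a minimal sketch, not part of the original report: the names build_convert_command and convert_to_pdf are hypothetical, "soffice" is assumed to be on PATH, and --headless and --outdir are added for unattended runs (the report itself used only --convert-to pdf).

```python
import subprocess
from pathlib import Path


def build_convert_command(document, outdir=".", soffice="soffice"):
    """Return the argument list for a headless PDF conversion.

    `soffice` is assumed to be on PATH; on Windows, pass the full
    path to soffice.exe instead.
    """
    return [
        soffice,
        "--headless",            # assumption: no UI wanted for scripted runs
        "--convert-to", "pdf",   # the option used in the report
        "--outdir", str(outdir),
        str(document),
    ]


def convert_to_pdf(document, outdir=".", soffice="soffice"):
    """Run the conversion and return the path where the PDF is expected."""
    subprocess.run(build_convert_command(document, outdir, soffice), check=True)
    return Path(outdir) / (Path(document).stem + ".pdf")
```

Calling convert_to_pdf("Schedule of rent arrears.odt") would then produce the PDF whose last table row can be compared with the original document.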

Actual Results:  
The table is truncated so that the last row is missing.

Expected Results:
The table should be exported in its entirety.


Reproducible: Always

User Profile Reset: No

Additional Info:


User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0
Comment 1 Matthew Kogan 2017-02-28 16:00:49 UTC
Created attachment 131540
Schedule of rent arrears.odt
Comment 2 Xisco Faulí 2017-03-01 10:18:34 UTC Comment hidden (obsolete)
Comment 3 Xisco Faulí 2017-03-01 10:18:59 UTC
Created attachment 131550
my output (correct)
Comment 4 Matthew Kogan 2017-03-01 10:40:58 UTC
Yes, it still fails with 5.3.0.3.
Comment 5 Xisco Faulí 2017-03-01 13:48:31 UTC
Could you please attach your output PDF document?
Comment 6 Matthew Kogan 2017-03-01 13:50:31 UTC
Created attachment 131556
Exported with 5.3.0.3.pdf

Attached.
Comment 7 Xisco Faulí 2017-03-01 13:55:42 UTC
I can confirm that it only happens when using the command line.

Version: 5.4.0.0.alpha0+
Build ID: d3676ceeec55a41337ce5e6bc596f4f100d0638e
CPU threads: 4; OS: Linux 4.8; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); Calc: group
Comment 8 Xisco Faulí 2017-03-01 14:10:13 UTC
Regression introduced by:

author     Michael Stahl <mstahl@redhat.com>  2016-06-09 13:52:16 (GMT)
committer  Michael Stahl <mstahl@redhat.com>  2016-06-09 13:59:19 (GMT)
commit     c488214817516c13603deb1c180fef02f4c700bf
tree       f139da3173a9bf65a67d7d575af5d1ddc6a9d07a
parent     6a5cb3dae1760283c2c9156de666964ea4794f0f
tdf#96089 sw: fix scope of bBreakAfter in InsertCnt_()
The problem is that bBreakAfter is passed by reference to SwLayHelper
and stored as a reference member there, so it has to live at least as
long as pPageMaker.  (Unfortunately C++ can't statically check that.)

This then somehow caused the number of pages created after initial load
to be 812 instead of the correct 396 determined from the layout-cache in
the bugdoc, and that then caused Drawing objects to move backward during
the following re-pagination, and then SwDrawContact::Changed_() calls
SetFlyFrmAttr() and that sets the document to modified, which triggers the
AutoSave that was reported in the bug.

Adding Cc: to Michael Stahl
Comment 9 Michael Stahl 2017-03-31 15:54:02 UTC
this is not a regression but a pre-existing condition that already
affects the LO 3.3.0 release.

the PDF filter calls CalcLayout via 

#0  0x00007f83d14224be in SwViewShell::CalcLayout() (this=this@entry=0x2d30ec0) at sw/source/core/view/viewsh.cxx:945
#1  0x00007f83d0fd90e0 in SwEditShell::CalcLayout() (this=0x2d30ec0) at sw/source/core/edit/edws.cxx:96
#2  0x00007f83d172d9a9 in SwXTextDocument::getRendererCount(com::sun::star::uno::Any const&, com::sun::star::uno::Sequence<com::sun::star::beans::PropertyValue> const&) (this=0x2a2da40, rSelection=..., rxOptions=...) at sw/source/uibase/uno/unotxdoc.cxx:2572
#3  0x00007f83ccf37c7d in PDFExport::Export(rtl::OUString const&, com::sun::star::uno::Sequence<com::sun::star::beans::PropertyValue> const&) (this=this@entry=0x7fffaf253a40, rFile=..., rFilterData=uno::Sequence of length 32 = {...}) at filter/source/pdf/pdfexport.cxx:874


... but somehow the layout isn't finished when CalcLayout returns.
Comment 10 Michael Stahl 2017-03-31 21:17:39 UTC
the document contains a "layout-cache" that is completely bogus:

debug:26706:1: nType P nIndex 29 2147483647
debug:26706:1: nType P nIndex 66 2147483647
debug:26706:1: nType P nIndex 105 2147483647
debug:26706:1: nType P nIndex 142 2147483647
debug:26706:1: nType P nIndex 178 2147483647
debug:26706:1: nType P nIndex 205 2147483647
debug:26706:1: nType P nIndex 229 2147483647
debug:26706:1: nType T nIndex 314 65535

a lot of these paragraphs are either inside tables
(in which case the cache must contain T instead)
or structural nodes of the table like end-nodes.

mystery how that happened.
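
To make the dump above easier to inspect, here is a small sketch that parses the debug lines into records and counts them by type. The regular expression and the name parse_cache_dump are assumptions based only on the line format quoted in this comment; the meaning of the third number is not stated here, so it is kept as an opaque "extra" field.

```python
import re
from collections import Counter

# Matches the tail of lines such as:
#   debug:26706:1: nType P nIndex 29 2147483647
ENTRY = re.compile(
    r"nType\s+(?P<type>\S+)\s+nIndex\s+(?P<index>\d+)\s+(?P<extra>\d+)"
)


def parse_cache_dump(text):
    """Return a list of (type, index, extra) tuples found in the dump."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.search(line)
        if match:
            entries.append(
                (match["type"], int(match["index"]), int(match["extra"]))
            )
    return entries


BOGUS_DUMP = """\
debug:26706:1: nType P nIndex 29 2147483647
debug:26706:1: nType P nIndex 66 2147483647
debug:26706:1: nType P nIndex 105 2147483647
debug:26706:1: nType P nIndex 142 2147483647
debug:26706:1: nType P nIndex 178 2147483647
debug:26706:1: nType P nIndex 205 2147483647
debug:26706:1: nType P nIndex 229 2147483647
debug:26706:1: nType T nIndex 314 65535
"""

entries = parse_cache_dump(BOGUS_DUMP)
counts = Counter(kind for kind, _, _ in entries)
print(counts)
```

For the lines quoted above this yields seven P entries and one T entry, which is the pattern the comment calls suspicious for a document that consists almost entirely of tables.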

meta.xml has meta:generator:
OpenOffice/4.1.1$Win32 OpenOffice.org_project/411m6$Build-9775

but when i use AOO 4.1.1 on Linux to load&store the
document, i get a sensible layout-cache like so:

debug:26765:1: nType T nIndex 382 65535
debug:26765:1: nType T nIndex 790 65535

there are 64 tables in the document, each with 1 row

iteration over the pages in SwLayAction::InternalAction()

1 2 1 3 4 * 1 ...

the * is where the disaster happens: before that point
page 4 has 46 tables on it; then we hit the loop-control
in line 556 *twice* and so move only 41 tables backward
instead of all of them.

after that point we have pages with tables:

1) 21
2) 24
3) 13
4) 5

but page 4 is marked as Valid (due to the loop control),
so it is not formatted again during this LayAction.
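
The figures quoted above can be cross-checked with a few lines of arithmetic. This only restates numbers from the comment; the variable names are illustrative.

```python
tables_total = 64          # "there are 64 tables in the document"
on_page_4_before = 46      # tables on page 4 before the loop control hit
moved_backward = 41        # tables actually moved backward
tables_per_page_after = {1: 21, 2: 24, 3: 13, 4: 5}

# Tables left behind on page 4 after the partial move.
left_on_page_4 = on_page_4_before - moved_backward

# Tables accounted for by the per-page counts quoted in the comment.
accounted_for = sum(tables_per_page_after.values())

print("left on page 4:", left_on_page_4)
print("accounted for:", accounted_for, "of", tables_total)
```

The 5 tables left on page 4 match the quoted per-page count. The per-page counts as quoted add up to 63, one fewer than the 64 tables in the document; the comment does not say where the remaining table is, so this is an observation rather than an explanation.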

when you use the UI then something will surely initiate
another layout action to finish the formatting (the print
dialog triggers about 3), but not so with --convert-to.
Comment 11 Michael Stahl 2017-04-03 12:50:58 UTC
added a sanity check for the layout-cache, works great for this particular bugdoc but of course can't detect every problem
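
As a rough illustration of what a basic sanity check on the layout-cache might look like, the sketch below validates each cache entry against a toy node list. It is not the code from the commit: the Node class, the sanity_check function and the rule set (index in range, P entries point at a paragraph outside any table, T entries point at a table node) are assumptions derived from the description in comment 10.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a document node (hypothetical model)."""
    kind: str              # "para", "table" or "table_end"
    in_table: bool = False


def sanity_check(entries, nodes):
    """Return True only if every (type, index) cache entry looks plausible."""
    for kind, index in entries:
        if kind not in ("P", "T"):
            return False                      # unknown record type
        if not 0 <= index < len(nodes):
            return False                      # index outside the document
        node = nodes[index]
        if kind == "P" and (node.kind != "para" or node.in_table):
            return False                      # P must be a top-level paragraph
        if kind == "T" and node.kind != "table":
            return False                      # T must point at a table node
    return True


nodes = [
    Node("para"),                       # 0: paragraph before the table
    Node("table"),                      # 1: table start
    Node("para", in_table=True),        # 2: paragraph inside a cell
    Node("table_end", in_table=True),   # 3: structural end node
    Node("para"),                       # 4: paragraph after the table
]

print(sanity_check([("P", 0), ("T", 1), ("P", 4)], nodes))  # plausible cache
print(sanity_check([("P", 2)], nodes))   # P entry inside a table
print(sanity_check([("P", 3)], nodes))   # P entry on a structural node
```

A cache that fails such a check would presumably be ignored so that the layout is computed from scratch; as the comment says, a check of this kind cannot detect every inconsistent cache.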

fixed on master
Comment 12 Commit Notification 2017-04-03 12:51:49 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8a5374f2fdbd1e15c107133f55930cbc431edbd5

tdf#106237 sw: do some basic sanity checking on layout-cache

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 13 Commit Notification 2017-04-05 14:59:46 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-5-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=137ad218db262fb3531215adbc88b7093b4999c7&h=libreoffice-5-3

tdf#106237 sw: do some basic sanity checking on layout-cache

It will be available in 5.3.3.
