Created attachment 114887: Bugged out file originating from MSO. Somehow LibO displays "manual column breaks visible" in the headers. I don't know if it's related to the layout/flow being broken. Win 8.1 32-bit, MSO 2013. LibO Version: 4.5.0.0.alpha0+ Build ID: 211c12b9c64facd1c12f637a5229bd6a6feb032a TinderBox: Win-x86@39, Branch:master, Time: 2015-04-18_00:35:20 Locale: fi_FI
Created attachment 114888: PDF exported from MSO 2013. Compare to this.
Interesting: 3.3 and 3.5 don't look as bad. 3.3.0 has the right page count (7) and 3.5.0 has one extra page. With 3.6.7, the page count has exploded to 17. I'm adding a bibisect request and hope the layout lengthening issue can be spotted at least. Ubuntu 14.10 64-bit LibreOffice 3.3.0 OOO330m19 (Build:6) tag libreoffice-3.3.0.4 LibreOffice 3.5.0rc3 Build ID: 7e68ba2-a744ebf-1f241b7-c506db1-7d53735 Version 3.6.7.2 (Build ID: e183d5b)
I can confirm this: LO adds "manual column breaks" at the top of pages 2, 6, 8, 10 and 12. Linux / Arch x64 Version: 4.4.2.2 Build-ID: 4.4.2.2 Arch Linux build-1 Locale: de_DE
Bug 63662 deals with the text "manual column break" being displayed. I just verified that the bibisect result in comment 11 of that bug report is also valid for the document attached here. When opening the document in the respective LibreOffice version, the document still has only 7 pages, so the fact that the amount of pages increases when opening the document in LibreOffice is another issue.
(In reply to Michael Weghorn from comment #4)
> Bug 63662 deals with the text "manual column break" being displayed. I just verified that the bibisect result in comment 11 of that bug report is also valid for the document attached here.
So I am removing that from the summary (since bug 63662 handles it).
> When opening the document in the respective LibreOffice version, the document still has only 7 pages, so the fact that the amount of pages increases when opening the document in LibreOffice is another issue.
Migrating Whiteboard tags to Keywords: (bibisectRequest) [NinjaEdit]
Bibisect of when the page count jumped to >17 pages... it fluctuates a bit in between (7-8 pages):

5b4693bb72eca5e38e3f56d036bca425c9a21b37 is the first bad commit
commit 5b4693bb72eca5e38e3f56d036bca425c9a21b37
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date: Sun Dec 9 11:49:31 2012 +0000

source-hash-e3633f60b349022994e291aa3d1a0c90c3403b2e

commit e3633f60b349022994e291aa3d1a0c90c3403b2e
Author: Stephan Bergmann <sbergman@redhat.com>
AuthorDate: Wed May 16 09:32:51 2012 +0200
Commit: Stephan Bergmann <sbergman@redhat.com>
CommitDate: Wed May 16 09:36:38 2012 +0200

fdo#46074 fdo#49948 Ignore corrupted items in Recent Documents

...following up on 4ccb4bda483eb548eb6efb5e2f1952f094522320 "fdo#46074 Ignore corrupted items in Recent Documents" with another problematic scenario found with fdo#49948.

Change-Id: I3e7c803813f09c1f031defc2c18cfab6732b1621

:100644 100644 5aa1dfc68ecb9ac57316a995424b2d3683cb4774 aa42f04f09d97d387333244ba505d2fd3c3086c2 M autogen.log
:100644 100644 72da0ea5e9ec1223cb456558a2e0254561faa98c 1829a020e51322ed60e655809575a93edd3b9032 M ccache.log
:100644 100644 5ef3324ce1c257155c9e095fdeb7d912b2681ae1 795d8ec3e2d59c5f0a85099dac7224954a57c4f2 M commitmsg
:100644 100644 8b14489bddefe04fcfaecb0be901837505c64b67 5e870f27775bef1e12288b413b09a4052c414870 M dev-install.log
:100644 100644 68ac6a90c73f1f7c8776a70772a40ae1ce41e13d 78b57ac998248d89343563f89455faeeea3f57a1 M make.log
:040000 040000 8b906c6863615fd1253b393b35b18a883201b310 e793bfa8b661936460e69be1537f15a7e99d3289 M opt

# bad: [423a84c4f7068853974887d98442bc2a2d0cc91b] source-hash-c15927f20d4727c3b8de68497b6949e72f9e6e9e
# good: [65fd30f5cb4cdd37995a33420ed8273c0a29bf00] source-hash-d6cde02dbce8c28c6af836e2dc1120f8a6ef9932
git bisect start 'latest' 'oldest'
# bad: [e02439a3d6297a1f5334fa558ddec5ef4212c574] source-hash-6b8393474974d2af7a2cb3c47b3d5c081b550bdb
git bisect bad e02439a3d6297a1f5334fa558ddec5ef4212c574
# bad: [8f4aeaad2f65d656328a451154142bb82efa4327] source-hash-1885266f274575327cdeee9852945a3e91f32f15
git bisect bad 8f4aeaad2f65d656328a451154142bb82efa4327
# good: [369369915d3582924b3d01c9b01167268ed38f3b] source-hash-45295f3cdceb4c289553791071b5d7f4962d2ec4
git bisect good 369369915d3582924b3d01c9b01167268ed38f3b
# bad: [6fce03a944bf50e90cd31e2d559fe8705ccc993e] source-hash-47e4a33a6405eb1b5186027f55bd9cb99b0c1fe7
git bisect bad 6fce03a944bf50e90cd31e2d559fe8705ccc993e
# good: [8a39227e344637eb7154a10ac825d211e64d584c] source-hash-f5080ebb7022c9f5d7d7fdca4fe9d19f9bb8cabf
git bisect good 8a39227e344637eb7154a10ac825d211e64d584c
# bad: [e4c742a9e244bd7ebeabc50c90182df28ac3daaf] source-hash-c52ba433491afbca70aa1977a624c795bdd5b9ef
git bisect bad e4c742a9e244bd7ebeabc50c90182df28ac3daaf
# good: [96a055e15ee7171a28888973a3c3a7307dd9867f] source-hash-9ca02a663c3eee2698eb360dd5dc7afb1951e743
git bisect good 96a055e15ee7171a28888973a3c3a7307dd9867f
# bad: [e87a0055deae2c9e25ae1d1a365cec8418b785ce] source-hash-67ff63988f3b8eef2cc2b5bdf917918b93c3f070
git bisect bad e87a0055deae2c9e25ae1d1a365cec8418b785ce
# bad: [5b4693bb72eca5e38e3f56d036bca425c9a21b37] source-hash-e3633f60b349022994e291aa3d1a0c90c3403b2e
git bisect bad 5b4693bb72eca5e38e3f56d036bca425c9a21b37
# good: [d101b9946a6a04e65e3923038503436c790b7e12] source-hash-18e6e7d929c2be209407ed2e56b8ec4d5e6c4900
git bisect good d101b9946a6a04e65e3923038503436c790b7e12
# first bad commit: [5b4693bb72eca5e38e3f56d036bca425c9a21b37] source-hash-e3633f60b349022994e291aa3d1a0c90c3403b2e
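For anyone re-checking a bibisect result like the one above, here is a minimal Python sketch that reads the text of a `git bisect log` and pulls out the good/bad verdicts and the first bad commit. The function names `summarize_bisect_log` and `source_commit` are made up for illustration, and the sketch assumes the log uses the `# good:`, `# bad:` and `# first bad commit:` comment lines shown in this report, with `source-hash-<sha>` subjects naming the core source commit a bibisect build was made from.

```python
import re

# Comment lines written by `git bisect log`, e.g.
# "# bad: [<sha>] <subject>" or "# first bad commit: [<sha>] <subject>".
MARK = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$")


def summarize_bisect_log(text):
    """Return (good, bad, first_bad) parsed from the text of a bisect log."""
    good, bad, first_bad = [], [], None
    for line in text.splitlines():
        match = MARK.match(line.strip())
        if not match:
            continue  # skip the "git bisect ..." command lines
        verdict, sha, subject = match.groups()
        if verdict == "good":
            good.append((sha, subject))
        elif verdict == "bad":
            bad.append((sha, subject))
        else:
            first_bad = (sha, subject)
    return good, bad, first_bad


def source_commit(subject):
    """Extract the source commit from a 'source-hash-<sha>' subject, else None."""
    prefix = "source-hash-"
    return subject[len(prefix):] if subject.startswith(prefix) else None


# The last steps of the log from this report.
sample = """\
# bad: [5b4693bb72eca5e38e3f56d036bca425c9a21b37] source-hash-e3633f60b349022994e291aa3d1a0c90c3403b2e
git bisect bad 5b4693bb72eca5e38e3f56d036bca425c9a21b37
# good: [d101b9946a6a04e65e3923038503436c790b7e12] source-hash-18e6e7d929c2be209407ed2e56b8ec4d5e6c4900
git bisect good d101b9946a6a04e65e3923038503436c790b7e12
# first bad commit: [5b4693bb72eca5e38e3f56d036bca425c9a21b37] source-hash-e3633f60b349022994e291aa3d1a0c90c3403b2e
"""

good, bad, first_bad = summarize_bisect_log(sample)
print("first bad bibisect commit:", first_bad[0])
print("built from source commit:", source_commit(first_bad[1]))
```

Note that the source commit named by a bibisect build is only the tip of the range that build covers, so the actual culprit may be a nearby earlier commit.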
I believe the first bad commit is:
author Miklos Vajna <vmiklos@suse.cz> 2012-05-15 06:56:38 (GMT)
commit 50cb1667020494906afaacb68d4163d1eda527cf
fdo#49940 dmapper: handle m_bTitlePage when m_nBreakType is zero

The problem in buggy2.docx is that headers are redefined many times. In Word, headers appear to be a section property, while in LO they are a page style property. Thus, in LO, a new page style is created for each section that has a different header (and thus continuous breaks are converted into page breaks). This also causes the document to round-trip terribly: since all continuous breaks have been removed, the column breaks start affecting the wrong places.
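The mechanism described above can be illustrated with a toy model. This is only a sketch of the reasoning, not LibreOffice's import code: `Section`, `import_breaks` and the `header_forces_page_style` switch are hypothetical names, and the rule "a section that redefines the header gets a new page style, hence a page break" is a simplification of what the comment describes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Section:
    break_type: str          # "continuous" or "nextPage", as stored in the DOCX
    header: Optional[str]    # header text defined by this section, if any


def import_breaks(sections: List[Section], header_forces_page_style: bool) -> List[str]:
    """Return the break type each section ends up with after import (toy model)."""
    result = []
    previous_header = None
    for index, section in enumerate(sections):
        effective = section.break_type
        redefines = section.header is not None and section.header != previous_header
        # Modelled old behaviour: a redefined header needs a new page style,
        # and a page style can only change at a page break.
        if header_forces_page_style and index > 0 and redefines:
            effective = "nextPage"
        result.append(effective)
        if section.header is not None:
            previous_header = section.header
    return result


# A document that redefines its header in continuous sections.
doc = [
    Section("nextPage", "Header A"),
    Section("continuous", "Header B"),
    Section("continuous", None),
    Section("continuous", "Header C"),
]

old = import_breaks(doc, header_forces_page_style=True)
new = import_breaks(doc, header_forces_page_style=False)
print("header forces page style:", old)
print("continuous breaks kept:  ", new)
print("extra page breaks:", old.count("nextPage") - new.count("nextPage"))
```

In this model every header redefinition in a continuous section adds one page break, which is consistent with the page count growing when a document redefines its headers many times.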
Justin Luth committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=50bf96d31ab2eb546f6c71cc93c1fa5dd4bf3044 tdf#90697 docx - don't change continuous break into page break It will be available in 5.3.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
*** Bug 100513 has been marked as a duplicate of this bug. ***
Checked in Version: 5.3.0.0.alpha0+ Build ID: cc503abb860c33a54a188640a5962dbdf7052284 CPU Threads: 2; OS Version: Linux 4.4; UI Render: default; TinderBox: Linux-rpm_deb-x86@71-TDF, Branch:master, Time: 2016-07-04_00:55:33 Locale: nl-NL (nl_NL.UTF-8) Thanks Justin :) !
After this fix, another problem became visible: Bug 100758 - [FILEOPEN] Word Continuous section break results in 2nd and more footnotes being pushed to next page on import of DOCX file
*** Bug 64407 has been marked as a duplicate of this bug. ***
Justin Luth committed a patch related to this issue. It has been pushed to "libreoffice-5-2": http://cgit.freedesktop.org/libreoffice/core/commit/?id=6f9cbfad8744646b5b1f79d5fbf1c1f9eb03519d&h=libreoffice-5-2 tdf#90697 docx - don't change continuous break into page break It will be available in 5.2.2.
Samuel Mehrbrodt committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=298571e2e10f5a925abc3cb75940dbef5701b583 tdf#90697 unit test for rtf import It will be available in 5.3.0.
Samuel Mehrbrodt committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=7c3cb23136229c748984b49847af6f729ce3e6ba Revert "tdf#90697 unit test for rtf import" It will be available in 5.3.0.
Created attachment 127672: tdf92724_continuousBreaksComplex2.pdf: printout for additional unit test. Likely someone will want to revert these changes as causing a regression in other documents - something unavoidable. I am adding tests to ensure that their solution addresses various complex cases. tdf92724_continuousBreaksComplex2: https://gerrit.libreoffice.org/#/c/29322/
(In reply to Justin L from comment #17)
> likely someone will want to revert these changes as causing a regression in other documents - something unavoidable.
Can you please explain what "these changes" refers to?
"these changes" means not treating every header/footer definition as an applied page style (aka page break). This results in data loss: unused headers/footers are currently lost. (Actually, in this complexBreaks2.docx, even the USED page style on the last page is lost in round-tripping, so there is still lots of work to be done on docx section breaks.)
Samuel Mehrbrodt committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=ad48da87038bd0ae67c2edb4199813e1a2205a69 Reintroduce "tdf#90697 unit test for rtf import" It will be available in 5.3.0.
Justin Luth committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=13c0122ad18dd1db187de8afc2ef406421d6d0e5 tdf#90697 unit test for an unused FirstPage header It will be available in 5.3.0.
Created attachment 127733: firstInheritTest.docx: continuous breaks are evil. A carefully crafted document can look great on older versions of LO and terrible in the current one. Continuous breaks need to be replaced with PageBreaks in order to be even remotely compatible.
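To see how a test document like this one is put together, a rough Python sketch can list each section's break type and the header/footer references it defines, straight from `word/document.xml`. The function names are illustrative only. The sketch assumes the usual WordprocessingML layout (`w:sectPr` elements containing an optional `w:type` and `w:headerReference`/`w:footerReference` children) and treats a missing `w:type` as a next-page break, which is the format's default.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
SECT_PR = f"{{{W}}}sectPr"
TYPE_EL = f"{{{W}}}type"
VAL_ATTR = f"{{{W}}}val"
TYPE_ATTR = f"{{{W}}}type"


def describe_sections(document_xml):
    """List (break type, header/footer references) for each w:sectPr in order."""
    root = ET.fromstring(document_xml)
    sections = []
    for sect in root.iter(SECT_PR):
        type_el = sect.find(TYPE_EL)
        # A section without w:type is a next-page section by default.
        break_type = type_el.get(VAL_ATTR) if type_el is not None else "nextPage"
        refs = []
        for kind in ("headerReference", "footerReference"):
            for ref in sect.findall(f"{{{W}}}{kind}"):
                refs.append(f"{kind}:{ref.get(TYPE_ATTR)}")
        sections.append((break_type, refs))
    return sections


def describe_docx(path):
    """Same as describe_sections, but reads the main part out of a .docx file."""
    with zipfile.ZipFile(path) as docx:
        return describe_sections(docx.read("word/document.xml"))


# Hand-written sample: a continuous section with its own default header,
# followed by the final section, which defines a first-page header.
SAMPLE = (
    b'<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    b"<w:body>"
    b'<w:p><w:pPr><w:sectPr><w:headerReference w:type="default"/>'
    b'<w:type w:val="continuous"/></w:sectPr></w:pPr></w:p>'
    b'<w:sectPr><w:headerReference w:type="first"/></w:sectPr>'
    b"</w:body></w:document>"
)

for number, (break_type, refs) in enumerate(describe_sections(SAMPLE), start=1):
    print(number, break_type, refs)
```

Sections that are continuous and also carry header or footer references are the ones affected by the behaviour discussed in this report.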
Created attachment 127734: firstInheritTest.pdf: what it looks like in Word2003