Bug 75573 - VIEWING: Writer inserts an unwanted page break before a text frame in DOCX
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.1.3.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: BSA interoperability target:5.3.0 tar...
Keywords: bibisected, filter:docx
Depends on:
Blocks:
 
Reported: 2014-02-27 13:33 UTC by Vincent Bossier
Modified: 2017-09-19 19:08 UTC

See Also:
Crash report or crash signature:


Attachments
The archive contains a docx file produced with MS Word 2007 and the corresponding PDF file. (27.25 KB, application/x-gzip)
2014-02-27 13:33 UTC, Vincent Bossier
non-archived MS Word 2007 docx test file (28.80 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2016-07-27 19:39 UTC, Luke

Description Vincent Bossier 2014-02-27 13:33:41 UTC
Created attachment 94816
The archive contains a docx file produced with MS Word 2007 and the corresponding PDF file.

Writer inserts an unwanted page break before a text frame that is supposed to sit at the bottom of page one.

How to reproduce:

The attached unwanted_page_break.docx document (MS Office 2007) contains a page with a paragraph at the top and a text frame at the bottom. MS Word displays and prints both on the same page, as shown in the attached unwanted_page_break.pdf file, which was generated from MS Word 2007.

Writer 4.2.1.1 places the text frame at the top of a new page and shows a page break between pages 1 and 2. Writer 4.1.3.2 shows the same problem.

Current behavior:

The text frame is at the top of page 2.

Expected behavior:

The text frame is at the bottom of page 1.
              
Operating System: Ubuntu
Version: 4.2.1.1 release
Comment 1 Alexandr 2014-03-01 18:51:45 UTC
Thank you for reporting the bug.

Reproduced with LibreOffice 4.1.4.2, 4.2.0.4 and 4.3.0.0.alpha0+ Build ID: db0222881be20744c071be451d77a7dc4a0dbb56 TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:master, Time: 2014-02-21_10:31:05

I set the version to 4.1.3.2 because this field should contain the oldest version that shows the bug.

In OpenOffice.org 3.3.0 the text frame is at the bottom of the first page, but misplaced in front of the footer.
In LibreOffice 3.5.4 the text frame is at the top of the first page.
Comment 2 Alexandr 2014-03-02 14:00:37 UTC
Also reproducible with LibreOffice 4.2.1.1 on Windows 7. I set platform to all.
Comment 3 QA Administrators 2015-09-04 02:49:23 UTC Comment hidden (obsolete)
Comment 4 Vincent Bossier 2015-09-09 21:10:46 UTC
The behavior has changed.

I tested the bug against LibreOfficeDev 5.1.0.0.alpha1+ Build ID: cf9fbdb379e2935677a73ced513d7faf855c299c under Ubuntu 14.04 64-bit.

There is no more page break but the top paragraph containing "SOME TEXT" is now inside the frame, which is at the top as well. The frame should remain separated and at the bottom of the page (see PDF in attached archive).
Comment 5 Vincent Bossier 2015-09-12 10:26:10 UTC
I tested with the standard LibreOffice package that ships with Ubuntu 14.04 64-bit, i.e. LibreOffice 4.2.8.2, and the behavior is the same as in my previous comment (LO 5.1.0.0.alpha1+).

One additional remark: comment 1 states that the old behavior was present in version 4.3.0.0.alpha0+, but version 4.2.8.2 shows the new behavior.
Comment 6 Robinson Tryon (qubit) 2015-12-14 06:01:07 UTC Comment hidden (obsolete)
Comment 7 Justin L 2016-07-05 06:48:23 UTC
Note1: there is a page break in the document, so the page break itself is wanted, but the text frame is placed on the wrong page, and at the wrong position: the top of the page (vertical 0.0) instead of the bottom (7.x).

The latest problem was that the text frame was removed. This was caused by a patch between 4.3 and 4.4, commit 5510f563502168defa4ccfc54214d781a7c92868:
    Author:     Luboš Luňák <l.lunak@collabora.com>
    CommitDate: Sat May 24 00:41:49 2014 +0200
    discard more header/footer stuff when discarding headers/footers (bnc#875718)

Note2: the frame is NOT in the header
Comment 8 Justin L 2016-07-05 08:22:12 UTC
The frame moved from page 1 to page 2 somewhere between 3.6 and 4.0, in the range from
good: 2012-06-25 13:19:03 (GMT) merge 3 copy and paste efforts back together as bestFitOpenSymbolToMSFont
to bad: commit 6263315825e01e766668b9ce5d2eb52e71e051a7
    Author:     Tor Lillqvist <tlillqvist@suse.com>
    CommitDate: Wed Jun 27 12:56:31 2012 +0300
    Whitespace cleanup

I couldn't find a particular commit in this range that looked to be the likely suspect.
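
The good/bad range above is the kind of result a bisection produces. As a rough illustration only (this is not the bibisect tooling, and the names `first_bad_commit` and `is_bad` are hypothetical), a binary search over an ordered commit list could be sketched like this, assuming the first commit is known good and the last is known bad:

```python
def first_bad_commit(commits, is_bad):
    """Return the first bad commit by binary search.

    Assumes `commits` is ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad too.
    `is_bad` is a caller-supplied check, e.g. "does the frame land on page 2?".
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    low, high = 0, len(commits) - 1  # low is known good, high is known bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid  # regression is at mid or earlier
        else:
            low = mid   # regression is after mid
    return commits[high]
```

The search only narrows down as far as the available builds allow, so a result can still be a range containing several source commits, as happened here.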
Comment 9 Justin L 2016-07-06 09:21:10 UTC
(In reply to Justin L from comment #8)
> I couldn't find a particular commit in this range that looked to be the
> likely suspect.

It came from the addition of bRemove in DomainMapper.cxx.
Comment 10 Justin L 2016-07-06 09:21:57 UTC
(In reply to Justin L from comment #7)
Proposed fix:  https://gerrit.libreoffice.org/26972
Comment 12 Justin L 2016-07-23 06:52:54 UTC
Proposed fixes:
Fix to place the frame at the bottom of the page:  https://gerrit.libreoffice.org/27453 tdf#75573 - docx handle frame properties at styles

Fix to anchor to the page and not to the Margin:   https://gerrit.libreoffice.org/27454 tdf#75573 allow style to define vAnchor

Fix for comment 8 - keep frame on first page:  https://gerrit.libreoffice.org/27455 tdf#75573 - docx don't remove frame anchor paragraph

This document now imports nicely, but it still doesn't round-trip very well - information is lost multiple times when re-saving as docx.
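
The patch titles above suggest that the frame properties in this document come from a paragraph style rather than from the paragraph itself. A minimal sketch for checking where a DOCX defines its `w:framePr` elements, using only the Python standard library; the function name `find_frame_properties` is hypothetical, and the part names and attributes (`vAnchor`, `y`) are the usual WordprocessingML ones, not something taken from the attached file:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_frame_properties(docx_bytes):
    """List (part, owner element, attributes) for every w:framePr found.

    Looks in word/styles.xml and word/document.xml. The owner is the
    element holding the w:pPr, e.g. "style" or "p".
    """
    results = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        names = package.namelist()
        for part in ("word/styles.xml", "word/document.xml"):
            if part not in names:
                continue
            root = ET.fromstring(package.read(part))
            for owner in root.iter():
                ppr = owner.find(f"{{{W_NS}}}pPr")
                if ppr is None:
                    continue
                frame = ppr.find(f"{{{W_NS}}}framePr")
                if frame is None:
                    continue
                # Strip the namespace prefix from attribute and tag names.
                attrs = {k.rsplit("}", 1)[-1]: v for k, v in frame.attrib.items()}
                results.append((part, owner.tag.rsplit("}", 1)[-1], attrs))
    return results
```

Calling it on the raw bytes of a file, e.g. `find_frame_properties(open("unwanted_page_break.docx", "rb").read())`, would show whether `vAnchor` and the vertical position are set on a `style` or on a `p` element.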
Comment 13 Commit Notification 2016-07-25 06:16:32 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=9920a0bf9d783978cd6f7b97f7528d8aa2571143

tdf#75573 - docx handle frame properties at styles

It will be available in 5.3.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 14 Commit Notification 2016-07-26 11:40:56 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=eb345a155bc9cb92fffd3e5ea0269207b3bac0f1

tdf#75573 allow style to define vAnchor

It will be available in 5.3.0.

Comment 15 Commit Notification 2016-07-27 12:56:04 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=91ad1017b609be6fceccd392006dd9ab60724352

tdf#75573 - docx don't remove frame anchor paragraph

It will be available in 5.3.0.

Comment 16 Luke 2016-07-27 19:39:28 UTC
Created attachment 126440
non-archived MS Word 2007 docx test file

Single test files should not be stored in archive formats. Additional screenshots or pdfs should be attached separately.
Comment 17 Luke 2016-07-27 20:29:21 UTC
With: 5f65ca15a2297f298536d07cfa8564a1f7c67abb

I'm getting a 'Write-protected content cannot be changed.' read-only error when I try to edit the document. Word and earlier versions can edit it.
Comment 18 Justin L 2016-07-28 03:30:39 UTC
(In reply to Luke from comment #17)
> I'm getting a 'Write-protected content cannot be changed.' read-only error
> when I try to edit the document. Word and earlier versions can edit it.

That fix is already queued up.  It was a recent 5.2 regression / partially implemented feature.
Comment 19 Justin L 2016-08-01 04:23:30 UTC
Round-tripping (comment 12) is broken by a regression from bug 80748 which I have reopened.  Workaround revert proposal: https://gerrit.libreoffice.org/27643 which isn't the greatest since it re-introduces potentially corrupt documents in MSWord in exchange for not losing data.

Marking this bug as fixed.
Comment 20 Justin L 2016-12-17 14:44:23 UTC
The fix in comment 15 caused a regression: bug 104714
Comment 21 Justin L 2016-12-20 19:04:26 UTC
My fix in comment #11
> tdf#75573 docx - complete frames before starting alternate streams
> An unused odd header was set to be discarded.  The handling of
> unregistered frames occurred at the same time, and thus ended up
> being discarded as well.
> Since a frame shouldn't encompass both the alternate stream
> and the current stream, finalize any unfinished frames first.

caused a regression in attachment 120911 (Schindler Excellence.docx from bug 97417) where one of the tables merges with the following column around page 7.
Comment 22 Commit Notification 2016-12-22 05:17:49 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=34324bff1252dc5a51c9408f9502654453f319b6

tdf#75573 - relocate code: alternate stream already started

It will be available in 5.4.0.

Comment 23 Commit Notification 2016-12-23 04:21:55 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "libreoffice-5-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4cbe2e712bab42e95fb55d78da6b1daf326f7f1b&h=libreoffice-5-3

tdf#75573 - relocate code: alternate stream already started

It will be available in 5.3.0.1.

Comment 24 vihsa 2017-03-21 06:54:44 UTC
verified.
version: 5.4.0.0.alpha0+ / build id: febc116 / ls-4001 / android 5.1

the text frame is at bottom of page 1.