Bug 92521 - FILESAVE: loss of page breaks after saving DOCX
Summary: FILESAVE: loss of page breaks after saving DOCX
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.2.0.4 release (earliest affected)
Hardware: All All
Importance: highest major
Assignee: Miklos Vajna
URL:
Whiteboard: target:5.1.0 target:5.0.3 target:4.4.6
Keywords: bibisected, bisected, regression
Depends on:
Blocks:
 
Reported: 2015-07-03 10:40 UTC by Leigh Bowden
Modified: 2016-10-25 19:21 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
Trashed document. (164.42 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2015-07-03 10:40 UTC, Leigh Bowden
This is the original downloaded document. (333.94 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2015-07-03 10:42 UTC, Leigh Bowden
Wow. It's done it again with a newly downloaded docx. (162.95 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2015-07-03 11:06 UTC, Leigh Bowden
Original PDF document from MS Office- 6 pages: 3 portrait, 3 landscape (285.82 KB, application/x-force-download)
2015-07-06 16:02 UTC, Timur

Description Leigh Bowden 2015-07-03 10:40:31 UTC
Created attachment 117017
Trashed document.

I'm a big advocate of LibreOffice, but it's very hard to extol its virtues when it completely trashes a document.

I downloaded a job application which was a docx file. It is 6 pages long.

I started to edit it and completed about half of it, saving regularly, and everything looked fine, but I left it as docx. There were no unexpected shutdowns and everything was quit out of cleanly.

This morning I opened the document to complete the application and it is now one page long and basically has my name and address remaining and that's about it. 

Even MS Word cannot read it now. Not the best advert for LibreOffice.

So thanks very much for that. As the deadline is today, I doubt I'm going to be able to do it now.
Comment 1 Leigh Bowden 2015-07-03 10:42:02 UTC
Created attachment 117018
This is the original downloaded document.
Comment 2 Leigh Bowden 2015-07-03 11:06:36 UTC
Created attachment 117019
Wow. It's done it again with a newly downloaded docx.
Comment 3 tommy27 2015-07-03 12:03:02 UTC
I have no time right now to fully test your test file, so I cannot tell what triggers the DOCX filesave corruption.

in the meantime, if you need to send that application, try this:
save the original .docx as .odt,
then edit the resulting .odt;
when you are done, save it and export it to .pdf so you can send it to the recipient even if he/she doesn't have LibreOffice.
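
The two conversion steps of this workaround (DOCX to ODT before editing, ODT to PDF afterwards) can also be run from a script. What follows is only a rough sketch: it assumes the `soffice` binary is on the PATH and uses LibreOffice's headless `--convert-to` mode; the function names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def convert_command(source: Path, target_format: str, outdir: Path) -> list[str]:
    """Build a headless LibreOffice conversion command line."""
    return [
        "soffice",            # assumption: the LibreOffice launcher is on PATH
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(outdir),
        str(source),
    ]


def convert(source: Path, target_format: str, outdir: Path) -> None:
    """Run the conversion; raises if soffice is missing or exits with an error."""
    if shutil.which("soffice") is None:
        raise FileNotFoundError("soffice not found on PATH")
    subprocess.run(convert_command(source, target_format, outdir), check=True)
```

For example, `convert(Path("application.docx"), "odt", Path("."))` before editing and `convert(Path("application.odt"), "pdf", Path("."))` afterwards, where `application.docx` is a placeholder file name.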

also consider that you are using LibO 4.4.2.2, which is not up to date.
LibO 4.4.4.3 is out now with more than 100 bug fixes, so it's always better to run the latest version of a LibO branch.

to debug your issue further, if it persists after upgrading, try editing the .DOCX one page at a time.

when you are done with page 1, save it, close it and reload it to see if you have data loss or formatting corruption.

if page 1 is OK, do the same for page 2 and so on, in order to identify which page or which exact editing sequence causes the problem
Comment 4 tommy27 2015-07-03 12:10:52 UTC
as far as I can see, there are page loss and page orientation issues in your edited .DOCX files

so I'm giving the bug a better summary for that
Comment 5 Buovjaga 2015-07-04 11:07:31 UTC
With 4.4.4 and 5.1: saving as docx makes page breaks go away and landscape orientations reset to portrait. Page count changes from 6 to 5.

Win 7 Pro 64-bit, Version: 4.4.4.3
Build ID: 2c39ebcf046445232b798108aa8a7e7d89552ea8
Locale: fi_FI

Version: 5.1.0.0.alpha1+ (x64)
Build ID: 8b788891796ff0571f779cdbe8ce809c35c42754
TinderBox: Win-x86_64@62-TDF, Branch:MASTER, Time: 2015-07-02_23:09:27
Locale: fi-FI (fi_FI)
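
A quick way to compare the original file with the round-tripped one without opening either is to count the break-related markers in `word/document.xml`. This is a minimal sketch, not a proper OOXML parser: it assumes the usual `w:` namespace prefix and just pattern-matches the raw XML; the function names are illustrative.

```python
import re
import zipfile


def summarize(xml: str) -> dict[str, int]:
    """Count page-break-related markers in a WordprocessingML string."""
    return {
        # <w:br w:type="page"/> inside a run: an explicit page break
        "explicit_page_breaks": len(re.findall(r'<w:br\b[^>]*w:type="page"', xml)),
        # <w:pageBreakBefore/> in paragraph properties
        "page_break_before": len(re.findall(r"<w:pageBreakBefore\b", xml)),
        # every <w:sectPr> closes one section
        "sections": len(re.findall(r"<w:sectPr\b", xml)),
        # sections whose page size is marked landscape
        "landscape_sections": len(
            re.findall(r'<w:pgSz\b[^>]*w:orient="landscape"', xml)
        ),
    }


def break_summary(docx_path: str) -> dict[str, int]:
    """Read word/document.xml from a .docx archive and summarize it."""
    with zipfile.ZipFile(docx_path) as archive:
        xml = archive.read("word/document.xml").decode("utf-8")
    return summarize(xml)
```

Running `break_summary()` on the file before and after the save and comparing the two dictionaries should show whether section breaks or landscape page sizes went missing.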
Comment 6 Timur 2015-07-06 16:02:18 UTC
Created attachment 117086
Original PDF document from MS Office- 6 pages: 3 portrait, 3 landscape

The page break between pages 1 and 2 is not saved in DOCX; this can be reproduced at least from 4.2.0.4. It wasn't there in 4.2.0 beta, so it is a regression.
There's also a page break between pages 4 and 5.
 
BTW, there was another page break fileopen problem with this file but it's OK from 4.3.
Comment 7 Joel Madero 2015-07-18 05:39:10 UTC
It's incredibly frustrating when there are two bugs in a single bug report...I just spent almost an hour trying to bibisect this issue because I saw the landscape/portrait issue and not the secondary issue, which is the page break, which is the actual regression...I refuse to do it again.

This bug should be separated into two issues and marked properly - it seems to me like there is ONE regression, and the other issue (page orientation) is not a regression....

/me super pissed that I just wasted an hour because this bug was incorrectly marked....
Comment 8 Buovjaga 2015-07-19 19:32:55 UTC
Ok let's keep this for the regression (page break loss).
Bug 92827 is for the orientation issue.
Comment 9 Matthew Francis 2015-08-12 07:57:49 UTC
The missing page breaks issue goes back quite a way. In bibisect 43all:

# first bad commit: [0b37d1fca3cd8cd23454122064b8148d343e8903] source-hash-6263315825e01e766668b9ce5d2eb52e71e051a7
Comment 10 Matthew Francis 2015-08-12 09:13:59 UTC
The specific commit at which a round trip starts to lose the page breaks (and also the landscape formatting) appears to be the one below.

Adding Cc: to vmiklos@collabora.co.uk; Could you possibly take a look at this one? Thanks

commit b696600821d8aafb63b6a88016d299ef89478f56
Author: Miklos Vajna <vmiklos@suse.cz>
Date:   Mon Jun 25 15:58:11 2012 +0200

    n#766481 dmapper: don't import fake paragraph containing sectpr only, take two
    
    Change-Id: I4623dfd05498b5ba8de73b7e301eaf486f667738
Comment 11 Miklos Vajna 2015-09-18 21:02:35 UTC
I'm afraid this is not a real regression; the root cause is that there is a paragraph with the "page break before" property right after a table, and export of this to docx is not implemented.

The above commit looks correct to me; it just made a missing feature more visible. Nevertheless, while I'm at it, I'll take care of this. ;-)
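
To check whether a given DOCX contains the pattern described here, one can look for body-level paragraphs that directly follow a table and carry a break in their properties. The sketch below is based on an assumption about the file-level shape: that the break shows up as either `w:pageBreakBefore` or a `w:sectPr` inside the paragraph's `w:pPr`. It does not reflect how the Writer export code works, and the helper names are made up.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(tag: str) -> str:
    """Qualify a local tag name with the WordprocessingML namespace."""
    return f"{{{W}}}{tag}"


def breaks_after_tables(document_xml: str) -> list[int]:
    """Return body indexes of paragraphs that follow a table and carry a break."""
    body = ET.fromstring(document_xml).find(q("body"))
    if body is None:
        return []
    children = list(body)
    hits = []
    for index in range(1, len(children)):
        previous, current = children[index - 1], children[index]
        if previous.tag != q("tbl") or current.tag != q("p"):
            continue
        props = current.find(q("pPr"))
        if props is None:
            continue
        # Assumption: the break is a page-break-before flag or a section break.
        has_break = (
            props.find(q("pageBreakBefore")) is not None
            or props.find(q("sectPr")) is not None
        )
        if has_break:
            hits.append(index)
    return hits
```

An empty result on the saved file, where the original had hits, would be consistent with the break after the table being dropped on export.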
Comment 12 Commit Notification 2015-09-21 05:34:10 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c916152d8562cab868d4c522748ac30029fad179

tdf#92521 DOCX export: handle section break right after a table

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 13 Commit Notification 2015-09-23 07:18:50 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-5-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=3660e3c87d7aebb3bcec75268f8be0742680c4f6&h=libreoffice-5-0

tdf#92521 DOCX export: handle section break right after a table

It will be available in 5.0.3.

Comment 14 Commit Notification 2015-09-24 05:42:19 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=0d7d6f242ef87c976095d22a7f5ebf751ba77ad8

Related: tdf#92521 RTF export: handle section break right after a table

It will be available in 5.1.0.

Comment 15 Commit Notification 2015-10-01 09:23:49 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=f624559bbfa2cfb8cc2a081174be2ac6b0ed9dcb&h=libreoffice-4-4

tdf#92521 DOCX export: handle section break right after a table

It will be available in 4.4.6.
