Bug 126994 - When I make any modification in a .docx file from Microsoft Office 2013, the page break is lost
Summary: When I make any modification in a .docx file from Microsoft Office 2013, the ...
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.2 all versions
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:6.4.0 target:6.3.4
Keywords:
Depends on:
Blocks: DOCX-TableofContents
 
Reported: 2019-08-17 17:43 UTC by hmslima1992
Modified: 2019-11-07 20:23 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Files used in the test. Pay attention to the file names, they are descriptions (2.95 MB, application/zip)
2019-08-19 14:14 UTC, hmslima1992
Files used in the test. Pay attention to the file names, they are descriptions (35.04 KB, application/zip)
2019-08-23 16:23 UTC, hmslima1992

Description hmslima1992 2019-08-17 17:43:54 UTC
The page break is lost every time I edit the .docx file in LibreOffice. Re-adding the page break in LibreOffice itself restores it, but even after I add a new page break (in LibreOffice or in Microsoft Office), it is lost again the next time I edit the document with LibreOffice.

I have only tested this with documents from Microsoft Office 2013, but I assume the result will be the same with .docx documents created by other versions of that office suite.
Comment 1 Dieter 2019-08-19 07:55:58 UTC
Thank you for reporting the bug. Please attach a sample document, as this makes it easier for us to verify the bug. 
I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the requested document is provided.
(Please note that the attachment will be public, remove any sensitive information before attaching it)
Comment 2 hmslima1992 2019-08-19 14:14:44 UTC
Created attachment 153508
Files used in the test. Pay attention to the file names, they are descriptions

Check the file "DOCX created by Microsoft Office 2013 and edited by LibreOffice 6.3 - I only updated the summary here)". In this one, the page break disappears.

The document "DOCX created by LibreOffice 6.3 (fresh document)" is the one just created by LibreOffice with no modification

The document "DOCX created by Microsoft Office 2013 and edited by LibreOffice 6.3 - here I corrected the summary and page break by LibreOffice)" works well in Microsoft Office.
Comment 3 Dieter 2019-08-19 16:18:48 UTC
Thank you for the test documents. In every document I can see one page break (between second and third page).

I opened the fresh document and updated the index, but the page break is still there.

So please give more detailed steps for reproducing the disappearance of the page break.


I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the steps are provided.
Comment 4 hmslima1992 2019-08-19 21:45:22 UTC
The page break doesn't appear when the file is opened in Microsoft Office (I have tested this in versions 2010 and 2013), so you will need some version of Microsoft Office (preferably one of those used in this test) to check it. Forgive me for not having said that before.

Note that this problem occurs only with the file "DOCX created by Microsoft Office 2013 and edited by LibreOffice 6.3 - I only updated the summary here)". The other two files are fine: I can see the page break in them when I open them with Microsoft Office.

Unfortunately I only have my very old laptop with me right now (I am in another town), but when I get back home (on Friday) I will describe my steps more precisely and provide cleaner files for you to analyze. That is why I didn't change the status of this bug report, but maybe the new information (that you have to open the file with Microsoft Office, not with LibreOffice) is enough for you to see the problem.
Comment 5 hmslima1992 2019-08-23 16:23:35 UTC
Created attachment 153599
Files used in the test. Pay attention to the file names, they are descriptions

This time I provide a step-by-step explanation and new files.

Apparently the page break is only lost in documents created by Microsoft Office and then edited by LibreOffice. The .docx documents created by LibreOffice do not seem to have this issue, but I can test that later.

You need some version of Microsoft Office installed on your machine (or in a virtual machine) to check the files, preferably version 2013, but you can also do the test with version 2010 and maybe with version 2016.

HOW I MADE THE TEST:

1 - I created a .docx document with Microsoft Office Word 2013. The document has a very simple "cover" (just the document's title and the author's name), a summary (that is, a table of contents; its title is in Portuguese, sorry for that), and the headings throughout the document that are needed to generate the automatic summary.

2 - I saved this document with the name "DOCX created by Microsoft Office 2013 - Summary and PageBreak (fresh document)"; let's call it "Fresh Document". This file is just a control, so that you can see the document in its original state.

3 - I made a copy of this document (Ctrl+C, Ctrl+V) and renamed the copy to "DOCX created by Microsoft Office 2013 - Summary and PageBreak (updated by LibreOffice 6.3)"; let's call it "DOCX for LibreOffice 6.3".

4 - I opened the file "DOCX for LibreOffice 6.3" with LibreOffice 6.3, updated the summary (I made no other modifications), saved the document, and closed it. Until I closed the document, everything looked fine.

5 - I reopened the file "DOCX for LibreOffice 6.3" with Microsoft Office 2013 and the page break was no longer there.

5.1 - I should mention that the text formatting of the summary's title was also lost, but I have already filed a separate bug report for that problem.

5.2 - I also reopened the file "DOCX for LibreOffice 6.3" with LibreOffice 6.3 and the page break was no longer there, the same result as with Microsoft Office 2013.

6 - I made a copy of the file "DOCX for LibreOffice 6.3" and renamed it to "DOCX created by Microsoft Office 2013 - Summary and PageBreak (updated and corrected by LibreOffice 6.3)"; let's call it "corrected DOCX".

7 - I reopened the file "corrected DOCX" with LibreOffice 6.3 and made the necessary corrections (recreating the page break, restoring the page numbering after the page break, and reformatting the summary's title).

8 - I reopened the file "corrected DOCX" with Microsoft Office 2013 and could confirm that LibreOffice was able to correct everything. Everything is fine here.

8.1 - But the point is that I shouldn't need to do this. If a co-worker (who surely uses Microsoft Office) sends me a .docx document, LibreOffice will mess up the page break (and the summary's title) of their document. OK, I can reopen the document and correct the page break (and the summary's title) with LibreOffice itself, but I shouldn't need this extra step every time I touch the document with LibreOffice.

9 - And if I make any modification to the file "corrected DOCX", I will lose the page break again.
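
For anyone who wants to compare the test files without Microsoft Office installed, the check in the steps above can be approximated by inspecting the XML inside the .docx directly. The following is only a rough sketch, not part of the original report: it assumes the break is stored in word/document.xml in one of the usual WordprocessingML forms (a `w:br` with `w:type="page"`, a `w:pageBreakBefore` paragraph property, or a paragraph-level `w:sectPr`, which ends a section). The function name `count_breaks` is made up for illustration, and the sketch does not look at styles.xml or at the section break type.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_breaks(docx):
    """Count break markers in word/document.xml.

    `docx` is a path or a binary file object. `count_breaks` is a
    hypothetical helper for this illustration, not a LibreOffice API.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    counts = {"page_br": 0, "page_break_before": 0, "section_break": 0}

    # Explicit page breaks: <w:br w:type="page"/>
    for br in root.iter(f"{{{W}}}br"):
        if br.get(f"{{{W}}}type") == "page":
            counts["page_br"] += 1

    # Paragraph property "page break before" (w:val="0"/"false" turns it off)
    for pbb in root.iter(f"{{{W}}}pageBreakBefore"):
        if pbb.get(f"{{{W}}}val") not in ("0", "false"):
            counts["page_break_before"] += 1

    # A sectPr inside a paragraph's pPr ends a section at that paragraph.
    # The final sectPr directly under w:body is not counted as a break.
    for ppr in root.iter(f"{{{W}}}pPr"):
        if ppr.find(f"{{{W}}}sectPr") is not None:
            counts["section_break"] += 1

    return counts
```

Running `count_breaks` on "Fresh Document" and on "DOCX for LibreOffice 6.3" and comparing the two dictionaries should show which kind of marker, if any, went missing after the save.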




MY “THEORY”

Since I am not a programmer, I may be wrong, but I think this problem is caused by the different ways that ODT and DOCX files separate two or more sets of pages with different styles. As far as I know, DOCX files have no page styles, so when you want the set of pages after the page break to have a different style from the previous one (for example, in the files I sent you, the second set of pages has page numbering while the first one does not), you have to uncheck the option "Link to Previous" in Microsoft Office. In ODT files, by contrast, we define one page style for the first set of pages and create another page style for the next set.
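
The section idea in this theory can be inspected in the files themselves. The sketch below is an illustration under assumptions, not something taken from the report: it relies on the understanding that each `w:sectPr` element in word/document.xml describes one section, and that a section which declares no `w:headerReference` or `w:footerReference` of a given type inherits it from the previous section, which is what Word presents as "Link to Previous". The helper name `section_references` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def section_references(docx):
    """List, per section in document order, the header/footer references
    that the section declares itself.

    `docx` is a path or a binary file object. An empty list for a section
    means it declares nothing of its own (assumed to inherit from the
    previous section). Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    sections = []
    for sect in root.iter(f"{{{W}}}sectPr"):
        refs = []
        for kind in ("headerReference", "footerReference"):
            for ref in sect.findall(f"{{{W}}}{kind}"):
                # The type is typically "default", "first" or "even"
                refs.append((kind, ref.get(f"{{{W}}}type")))
        sections.append(refs)
    return sections
```

If the "Fresh Document" reports two sections and the file saved by LibreOffice reports only one, that would be consistent with the break after the summary having been stored as a section boundary. This is a guess to be checked against the files, not an established fact.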

One more observation concerns the size of the files. In theory, "Fresh Document" and "DOCX for LibreOffice 6.3" should have almost the same size, since "DOCX for LibreOffice 6.3" has essentially no modifications relative to "Fresh Document". But look at their sizes:
"Fresh Document" = 16.7 KiB
"DOCX for LibreOffice 6.3" = 11.5 KiB
There was a loss of information!
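
A note on interpreting these numbers: a .docx file is a ZIP archive, so the total size also depends on compression and on which auxiliary parts each application writes, and a smaller file does not by itself prove that content was lost. A rough way to see where the difference comes from is to compare the uncompressed size of each part in the two archives. The sketch below only illustrates that idea; the helper names `part_sizes` and `compare_parts` are made up.

```python
import zipfile


def part_sizes(docx):
    """Map each part name in the .docx archive to its uncompressed size in bytes.

    `docx` is a path or a binary file object.
    """
    with zipfile.ZipFile(docx) as zf:
        return {info.filename: info.file_size for info in zf.infolist()}


def compare_parts(before, after):
    """Return {part name: (size before, size after)} for every part whose
    size differs. A size of None means the part is absent from that file.

    Hypothetical helper for illustration only.
    """
    a, b = part_sizes(before), part_sizes(after)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

Calling `compare_parts("Fresh Document.docx", "DOCX for LibreOffice 6.3.docx")` (the file names here are shortened placeholders) would list the parts that changed or disappeared, which is more informative than the overall file size.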



> Thank you for the test documents. In every document I can see one page break (between second and third page).

Strange, I don't get the same result here. Are you sure? I am using LibreOffice 6.3.0.4 on Kubuntu 18.04 64-bit. The document "DOCX created by Microsoft Office 2013 and edited by LibreOffice 6.3 - I only updated the summary here)" doesn't have a page break for me.
Comment 6 Justin L 2019-09-02 18:30:18 UTC
Confirmed in master. I also tested with LO 5.2, and the problem already existed then.

I opened the "fresh document", noted the page break after the Sumário page, round-tripped the file, and saw that the page break is lost even in LO.
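
The round trip described above can probably also be scripted. The sketch below is an assumption-laden illustration rather than the procedure actually used in this comment: it assumes `soffice` is on the PATH, that the headless `--convert-to` option with the "MS Word 2007 XML" export filter is an acceptable stand-in for opening and saving in the UI, and that the output directory differs from the source directory so the original is not overwritten. Both function names are hypothetical.

```python
import subprocess
from pathlib import Path


def roundtrip_command(src, outdir, soffice="soffice"):
    """Build the command line for a headless DOCX -> DOCX conversion.

    The `soffice` executable name and the filter name are assumptions
    about the local LibreOffice installation.
    """
    return [
        soffice,
        "--headless",
        "--convert-to", "docx:MS Word 2007 XML",
        "--outdir", str(outdir),
        str(src),
    ]


def roundtrip(src, outdir, soffice="soffice"):
    """Run the conversion and return the path where the result is expected."""
    subprocess.run(roundtrip_command(src, outdir, soffice), check=True)
    return Path(outdir) / (Path(src).stem + ".docx")
```

The resulting file can then be opened in LibreOffice or Microsoft Office to see whether the page break after the Sumário page survived. Whether a headless conversion exercises exactly the same export path as a manual save is not confirmed here.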
Comment 7 Justin L 2019-09-03 11:31:34 UTC
proposed fix at https://gerrit.libreoffice.org/78552 tdf#126994
Comment 8 Commit Notification 2019-09-09 14:56:09 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/5d04e2c94c8b3f6c5e75ff4c394ca086de5a6e5a%5E%21

tdf#126994 ww8 export: Don't skip TOX end node

It will be available in 6.4.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 9 Xisco Faulí 2019-11-07 16:02:59 UTC
Verified in

Version: 6.4.0.0.alpha1+
Build ID: 498c2d3944b666c2f016b65903001920db2cb2a4
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); UI-Language: en-US
Calc: threaded

@Justin Luth, thanks for fixing this issue!
Comment 10 Commit Notification 2019-11-07 20:23:53 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "libreoffice-6-3":

https://git.libreoffice.org/core/commit/fbd1914f6dfed752f4aee01302f49cbd0b0cd239

tdf#126994 ww8 export: Don't skip TOX end node

It will be available in 6.3.4.