Bug 112352 - FILESAVE DOCX: Document with one column section appears with two columns in LO
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 5.4.0.0.alpha0+ (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Justin L
URL:
Whiteboard: target:6.0.0 target:5.4.4
Keywords: bibisected, bisected, filter:docx, regression
Depends on:
Blocks: DOCX
Reported: 2017-09-12 14:43 UTC by Gabor Kelemen (allotropia)
Modified: 2020-11-08 09:49 UTC (History)

See Also:
Crash report or crash signature:


Attachments
The problematic document (24.96 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2017-09-12 14:43 UTC, Gabor Kelemen (allotropia)
Same document resaved as docx from LO 5.4 (21.75 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2017-09-12 14:43 UTC, Gabor Kelemen (allotropia)
Screenshot of the original document and the resaved document in LO 5.4 (187.07 KB, image/png)
2017-09-12 14:44 UTC, Gabor Kelemen (allotropia)
The resaved document in Word 2013 (80.54 KB, image/png)
2017-09-12 14:45 UTC, Gabor Kelemen (allotropia)

Description Gabor Kelemen (allotropia) 2017-09-12 14:43:06 UTC
Created attachment 136201
The problematic document

This DOCX document was originally created in Microsoft Office. The document begins with a short two column text, then continues with a one column text on the same page. If the document is saved as DOCX in LibreOffice, the whole text turns into two columns. The issue doesn't appear if the document is saved as ODT.

Steps to reproduce:
1. Open the original document (columns-bug-original.docx) in LibreOffice.
2. Save the document as DOCX in LibreOffice Writer.
3. Open the saved DOCX in both LO and Word 2013.

Actual results:
The whole document has two columns of text when opened in Writer. It has one column in Word.

Expected results:
The original layout of the document should be preserved.
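
One way to compare the original and the resaved file without relying on screenshots is to list the section properties stored in word/document.xml. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it only reads the w:type and w:cols children of each w:sectPr, using the OOXML defaults (nextPage, one column) when they are absent. It does not distinguish w:sectPr elements that sit inside tracked-change records.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, in ElementTree's "{uri}name" form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def sections_from_xml(document_xml):
    """Return (break_type, column_count) for each w:sectPr, in document order."""
    root = ET.fromstring(document_xml)
    result = []
    for sect in root.iter(W + "sectPr"):
        type_el = sect.find(W + "type")
        cols_el = sect.find(W + "cols")
        # A missing w:type means nextPage; a missing w:num means one column.
        break_type = "nextPage" if type_el is None else type_el.get(W + "val", "nextPage")
        columns = 1 if cols_el is None else int(cols_el.get(W + "num", "1"))
        result.append((break_type, columns))
    return result


def sections_from_docx(path):
    """Hypothetical convenience wrapper: read the sections of a .docx file."""
    with zipfile.ZipFile(path) as archive:
        return sections_from_xml(archive.read("word/document.xml"))
```

Running sections_from_docx on columns-bug-original.docx and on the resaved file should show whether the column counts or break types changed on export. What the attachments actually contain has not been checked here.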
Comment 1 Gabor Kelemen (allotropia) 2017-09-12 14:43:37 UTC
Created attachment 136202
Same document resaved as docx from LO 5.4
Comment 2 Gabor Kelemen (allotropia) 2017-09-12 14:44:12 UTC
Created attachment 136203
Screenshot of the original document and the resaved document in LO 5.4
Comment 3 Gabor Kelemen (allotropia) 2017-09-12 14:45:02 UTC
Created attachment 136204
The resaved document in Word 2013

This looks rather good, compared to LO.
Comment 4 Dieter 2017-09-12 16:41:33 UTC
I also get the results shown in the screenshot (comment 2).
Comment 5 Justin L 2017-11-06 18:02:20 UTC
History of how LO opens this file:
- LO 3.5 - 4.1: single column only
- LO 4.2 - 5.3: looks perfect

Broken in 5.4 by the fix for bug 103931, in commit 4605bd46984125a99b0e993b71efa6edb411699f by author Justin Luth on 2017-03-13 11:23:53 (GMT)
   tdf#103931 writerfilter breaktype: same for implicit and explicit
Comment 6 Justin L 2017-11-06 18:06:07 UTC
I have a problem with the description. It doesn't require a resave to see the problem. LO 5.4 doesn't open the document correctly in the first place.
Comment 7 Gabor Kelemen (allotropia) 2017-11-07 10:35:09 UTC
Hi Justin

Thanks for taking a look at this. A little update on the confusion about what we see in different versions:
- In 5.3.3, it opens correctly, but the exported file appears broken when reopened.
- In 5.4.2, it opens incorrectly, and the exported file appears broken as well.

So it got a bit worse in 5.4, but it was not perfect in 5.3 either.
Comment 8 Justin L 2017-11-08 19:23:32 UTC
proposed fix: https://gerrit.libreoffice.org/44505

This has always been a nasty section of code. Doing a fairly safe fix here, and considering trying more radical stuff for 6.1's long testing period. I don't get the unit test failures anymore that my comments indicate happened earlier - that really puzzles me. Unit tests pass with only bIsFirstSection as an added qualifier!!

In playing around, I also found that a nextPage break without a CR also is treated as continuous in Word 2003 anyway. So attempting to create edge-case documents verified in various versions of Word would be a good exercise for me to do in preparation for 6.1.
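
For readers who want the gist of the rule, here is a rough sketch based only on the bIsFirstSection qualifier mentioned above and on the commit title in comment 9 ("ALWAYS treat 1st nextpage w/cols as cont"). It is not the actual writerfilter code, which is C++ and may check more conditions. The function name is hypothetical, and reading "w/cols" as "more than one column" is an assumption.

```python
def effective_break_type(is_first_section, break_type, columns):
    """Hypothetical helper: the break type an importer might apply to a section."""
    # Assumption: "w/cols" in the commit title means more than one column.
    if is_first_section and break_type == "nextPage" and columns > 1:
        return "continuous"
    # Every other section keeps the break type stored in the file.
    return break_type


if __name__ == "__main__":
    print(effective_break_type(True, "nextPage", 2))
    print(effective_break_type(False, "nextPage", 2))
```

Under this reading, only a first section that has columns and a nextPage break is changed; later sections and single-column first sections are left alone.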
Comment 9 Commit Notification 2017-11-09 04:41:20 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=afc96d263959d10e457b54a574f0829d20e99df4

tdf#112352 ooxmlimport: ALWAYS treat 1st nextpage w/cols as cont

It will be available in 6.0.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 10 Commit Notification 2017-11-10 14:53:36 UTC
Justin Luth committed a patch related to this issue.
It has been pushed to "libreoffice-5-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=dab3722d7a09ad500f369146f132c38d464ab0dd&h=libreoffice-5-4

tdf#112352 ooxmlimport: ALWAYS treat 1st nextpage w/cols as cont

It will be available in 5.4.4.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 11 Justin L 2017-11-10 15:14:12 UTC
Gabor: the patch that broke everything was what generally fixes the round-tripping problem. I see that you've submitted patches before, so it should be easy for you to verify this fix.  Thanks for reporting the bug.
Comment 12 Dieter 2019-09-23 10:36:04 UTC
VERIFIED with

Version: 6.3.2.2 (x64)
Build-ID: 98b30e735bda24bc04ab42594c85f7fd8be07b9c
CPU threads: 4; OS: Windows 10.0; UI render: GL; VCL: win;
Locale: de-DE (de_DE); UI language: de-DE
Calc: threaded