Bug 138102 - FILEOPEN DOCX Continuous section break imported with extra paragraph if the first document node is a table
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.3.0.4 release
Hardware: All All
Importance: low minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, filter:docx, regression
Depends on:
Blocks: DOCX-Section
 
Reported: 2020-11-09 18:16 UTC by NISZ LibreOffice Team
Modified: 2022-05-25 11:20 UTC

See Also:
Crash report or crash signature:
Regression By:


Attachments
Example file from Word beginning with table (35.74 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-11-09 18:16 UTC, NISZ LibreOffice Team
Screenshot of the original document side by side in Word and Writer (73.04 KB, image/png)
2020-11-09 18:16 UTC, NISZ LibreOffice Team
Example file from Word beginning with paragraph (35.87 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-11-09 18:17 UTC, NISZ LibreOffice Team
Screenshot of the document beginning with paragraph side by side in Word and Writer - good (64.74 KB, image/png)
2020-11-09 18:17 UTC, NISZ LibreOffice Team
Example file from Word without table (35.50 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-11-09 18:18 UTC, NISZ LibreOffice Team
Screenshot of the document having no table side by side in Word and Writer - good (67.76 KB, image/png)
2020-11-09 18:18 UTC, NISZ LibreOffice Team
sectionBreak.docx: should be only 2 pages (4.51 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2020-11-28 07:50 UTC, Justin L
Another example file (44.38 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2021-05-17 13:58 UTC, NISZ LibreOffice Team
attachment 172087 in Word and Writer (87.06 KB, image/png)
2021-05-17 13:59 UTC, NISZ LibreOffice Team

Description NISZ LibreOffice Team 2020-11-09 18:16:29 UTC
Created attachment 167152 [details]
Example file from Word beginning with table

The attached user-made document starts with a table, followed by a continuous section break, a two-column section, and then another continuous section break.
When the document is opened in Writer, the first continuous section break generates an extra empty paragraph, which in the original document was enough to change the document's layout significantly.
This does not happen if the document does not start with a table.

Steps to reproduce:
    1. Open the attached document in Writer.

Actual results:
Extra empty paragraph after the one containing “F” and before the two column section.

Expected results:
No extra empty paragraph.

LibreOffice details:
Version: 7.1.0.0.alpha1+ (x64)
Build ID: 00e5c63c35307faacf76a5e2ca7953c4208244ed
CPU threads: 4; OS: Windows 6.3 Build 9600; UI render: Skia/Raster; VCL: win
Locale: hu-HU (hu_HU); UI: en-US
Calc: CL

Also in:
Version: 6.0.0.3
Build ID: 64a0f66915f38c6217de274f0aa8e15618924765
CPU threads: 4; OS: Windows 6.3; UI render: default; 
Locale: en-US (hu_HU); Calc: CL

Version: 5.0.0.5
Build ID: 1b1a90865e348b492231e1c451437d7a15bb262b
Locale: hu-HU (hu_HU)

Version: 4.3.0.4
Build ID: 62ad5818884a2fc2e5780dd45466868d41009ec0

Not yet in:
Version: 4.2.0.4
Build ID: 05dceb5d363845f2cf968344d7adab8dcfb2ba71

Additional Information: 

Bibisected using bibisect-win32-4.3 to:
URL: https://cgit.freedesktop.org/libreoffice/core/commit/?id=2e8aad6d45c53d554ccaf26de998ede708cfc289 

Author: Vinaya Mandke <vinaya.mandke@synerzip.com>
Date:   Fri Apr 18 15:50:51 2014 +0530

    fdo#39056 fdo#75431 Section Properties if section starts with table
Comment 1 NISZ LibreOffice Team 2020-11-09 18:16:57 UTC
Created attachment 167153 [details]
Screenshot of the original document side by side in Word and Writer
Comment 2 NISZ LibreOffice Team 2020-11-09 18:17:16 UTC
Created attachment 167154 [details]
Example file from Word beginning with paragraph
Comment 3 NISZ LibreOffice Team 2020-11-09 18:17:37 UTC
Created attachment 167155 [details]
Screenshot of the document beginning with paragraph side by side in Word and Writer - good
Comment 4 NISZ LibreOffice Team 2020-11-09 18:18:36 UTC
Created attachment 167156 [details]
Example file from Word without table
Comment 5 NISZ LibreOffice Team 2020-11-09 18:18:50 UTC
Created attachment 167157 [details]
Screenshot of the document having no table side by side in Word and Writer - good
Comment 6 Dieter 2020-11-24 15:57:27 UTC
I confirm it with

Version: 7.0.3.1 (x64)
Build ID: d7547858d014d4cf69878db179d326fc3483e082
CPU threads: 4; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-GB
Calc: CL

in comparison with Word 2016
Comment 7 Justin L 2020-11-28 07:50:05 UTC
Created attachment 167634 [details]
sectionBreak.docx: should be only 2 pages

This is obviously a bRemove issue.
I thought it would be as simple as
- && !m_pImpl->GetIsDummyParaAddedForTableInSection()
+ && !(m_pImpl->GetIsDummyParaAddedForTableInSection() && m_pImpl->GetIsFirstParagraphInSection())

But it's not.
Comment 8 NISZ LibreOffice Team 2021-05-17 13:58:17 UTC
Created attachment 172087 [details]
Another example file

I found one more of these baddies.

Here the section break is of the page break type (converted to a simple page break), and the second page begins with a table, which causes one extra paragraph to appear below the second table.

Likely the same bug, but uploading anyways just in case.
Comment 9 NISZ LibreOffice Team 2021-05-17 13:59:01 UTC
Created attachment 172088 [details]
attachment 172087 [details] in Word and Writer

Version: 7.2.0.0.alpha1+ (x64) / LibreOffice Community
Build ID: 91330c503b7eb91d777978018b66890af87cf8f5
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: default; VCL: win
Locale: hu-HU (hu_HU); UI: en-US
Calc: CL
Comment 10 Justin L 2022-05-25 11:20:32 UTC
(In reply to Justin L from comment #7)
The attachment in comment 7 shows what happens if you make a bad fix. It does not demonstrate the OP's problem.

Fixing this properly will require an IsFirstRealParagraphInSection flag. It would be similar to IsFirstParagraphInSection, except that table cells turn IsFirstParagraphInSection off, and other code relies on that behavior.

Essentially, there needs to be one real paragraph in the section. AFAICS, bRemove should ignore GetIsDummyParaAddedForTableInSectionPage unless it is the first real paragraph in the section.
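
For illustration only, here is a minimal C++ sketch of that idea. It is a toy model, not the actual writerfilter code: the SectionState struct, its methods, and the two helper functions are hypothetical names, and the assumption that a body paragraph switches both flags off is mine. It only contrasts the clause tried in comment 7 with a clause based on the proposed flag.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the importer state discussed above.
// Field names echo the getters mentioned in this report; this is NOT the
// real writerfilter DomainMapper_Impl class.
struct SectionState
{
    bool dummyParaAddedForTable = false;      // cf. GetIsDummyParaAddedForTableInSection()
    bool firstParagraphInSection = true;      // cf. GetIsFirstParagraphInSection()
    bool firstRealParagraphInSection = true;  // proposed flag, does not exist yet

    // Assumed: a section that begins with a table gets a dummy paragraph.
    void startSection(bool startsWithTable)
    {
        dummyParaAddedForTable = startsWithTable;
        firstParagraphInSection = true;
        firstRealParagraphInSection = true;
    }

    // Table cell paragraphs switch the existing flag off, but they are not
    // "real" body paragraphs, so the proposed flag stays on.
    void finishTableCellParagraph() { firstParagraphInSection = false; }

    // Assumed: a body paragraph outside any table switches both flags off.
    void finishBodyParagraph()
    {
        firstParagraphInSection = false;
        firstRealParagraphInSection = false;
    }
};

// Clause tried in comment 7: true means "this clause vetoes bRemove".
bool attemptedVeto(const SectionState& s)
{
    return s.dummyParaAddedForTable && s.firstParagraphInSection;
}

// Clause based on the proposed IsFirstRealParagraphInSection flag.
bool proposedVeto(const SectionState& s)
{
    return s.dummyParaAddedForTable && s.firstRealParagraphInSection;
}
```

In this model, once a table cell paragraph has been processed, attemptedVeto() is already false because the cell turned the existing flag off, so the dummy paragraph is ignored too early. proposedVeto() stays true until the first body paragraph outside the table has been finished, and is false afterwards.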

Sounds easy, but with redlines, headers, footnotes, comments, shapes, tables etc it gets rather complicated... Is it really worth the hassle to address this issue? It is a rather obscure edge case with a rather minimal impact.


And yes, comment 8 has another example of this bug.