Bug 61423 - FILESAVE: wrong export of specific ODT with table to DOC - section break next page after each row
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: low minor
Assignee: Not Assigned
URL:
Whiteboard: BSA interoperability
Keywords: filter:doc, filter:rtf
Duplicates: 68710
Depends on:
Blocks: DOC-Tables
Reported: 2013-02-25 05:06 UTC by JohnDoe_71Rus
Modified: 2023-04-19 11:42 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
original odt (12.66 KB, application/vnd.oasis.opendocument.text)
2013-02-25 05:06 UTC, JohnDoe_71Rus
save as .doc (12.50 KB, application/msword)
2013-02-25 05:07 UTC, JohnDoe_71Rus
save as .rtf (14.93 KB, application/rtf)
2013-02-25 05:20 UTC, JohnDoe_71Rus
original tested in LO 6.4+ and MSO 2016 (134.71 KB, image/png)
2019-10-11 07:08 UTC, Timur
tdf61423_pageStyleInTable.patch (4.42 KB, patch)
2020-05-08 13:37 UTC, Justin L

Description JohnDoe_71Rus 2013-02-25 05:06:24 UTC
Created attachment 75471
original odt

Problem description: 

A file containing a table is exported incorrectly to RTF and DOC.

Steps to reproduce:
1. Create a file containing a table
2. Save As *.rtf or Microsoft Word 97/2000/XP/2003 (.doc)


Current behavior:
Each table row is placed on a new page of the document

Expected behavior:

              
Operating System: Windows XP
Version: 4.0.0.3 release
Comment 1 JohnDoe_71Rus 2013-02-25 05:07:53 UTC
Created attachment 75472
save as .doc
Comment 2 JohnDoe_71Rus 2013-02-25 05:20:19 UTC Comment hidden (obsolete)
Comment 3 Joel Madero 2013-02-25 05:41:28 UTC
Verified on 3.6.4.3
Bodhi Linux 2.2

Changing version:

@John Doe - the Version field is the oldest version in which we can confirm the issue

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

New (confirmed)
Major (data corruption)
High (default, this may be highest but so far only one test document shows error)

@John Doe - can you reproduce this repeatedly, or does it only occur with this document?
Comment 4 JohnDoe_71Rus 2013-02-25 06:03:34 UTC
Can't reproduce this repeatedly. It seems to affect only this file.
Comment 5 Joel Madero 2013-02-25 06:09:49 UTC Comment hidden (obsolete)
Comment 6 Alexandr 2014-11-21 19:13:20 UTC
Reproducible with LibreOffice 3.5.4, 4.3.3 and 4.4.0.0-beta1 on Debian.
Comment 7 Xisco Faulí 2015-09-11 12:33:49 UTC
This issue is still present in

Version: 5.0.1.2
Build ID: 81898c9f5c0d43f3473ba111d7b351050be20261
Locale: es-ES (es_ES)

on Windows 7 (64-bit)

when exporting as .doc but not when exporting as .docx
Comment 8 Robinson Tryon (qubit) 2015-12-14 06:00:50 UTC Comment hidden (obsolete)
Comment 9 QA Administrators 2018-10-02 02:56:03 UTC Comment hidden (obsolete)
Comment 10 Timur 2018-10-04 16:37:40 UTC
This was reported for both RTF and DOC. RTF is fine, but DOC is still reproducible with 6.2+.
Not sure this should be kept open at all because it affects only this specific document.
But we should figure out the reason.
Comment 11 Justin L 2018-10-06 20:04:12 UTC
1.) This is an export bug - I see the same one-row-per-page in Word, due to a section break - next page.
2.) somehow coming from a RES_PAGEDESC in every single cell in ww8atr.cxx

if ( SfxItemState::SET == pSet->GetItemState( RES_PAGEDESC, false, &pItem ) &&
     static_cast<const SwFormatPageDesc*>(pItem)->GetRegisteredIn() != nullptr)
{
    bBreakSet = true;
    bNewPageDesc = true;
    pPgDesc = static_cast<const SwFormatPageDesc*>(pItem);
    m_pCurrentPageDesc = pPgDesc->GetPageDesc();
}

3.) which ultimately comes from wrtww8.cxx WriteText's SectionBreaksAndFrames
Comment 12 QA Administrators 2019-10-11 02:37:40 UTC Comment hidden (obsolete)
Comment 13 Timur 2019-10-11 07:08:31 UTC
Created attachment 154920
original tested in LO 6.4+ and MSO 2016

Reproduced for DOC in LO 6.4+.
Comment 14 Justin L 2020-04-17 13:22:13 UTC
*** Bug 68710 has been marked as a duplicate of this bug. ***
Comment 15 Justin L 2020-05-08 13:09:48 UTC
Potential fix is found at https://gerrit.libreoffice.org/c/core/+/93729.

However, it seems to be at odds with what MS is capable of doing. MS allows the first column, first paragraph to specify a "page break before" that it will honour. See bug 48097.

What I just did here breaks that.  Perhaps this needs to move into ww8atr.cxx OutputSectionBreaks and only affect a Section break, not a page break.
Comment 16 Justin L 2020-05-08 13:37:42 UTC
Created attachment 160536
tdf61423_pageStyleInTable.patch
Comment 17 Justin L 2020-07-23 12:10:07 UTC
In users.odt, we see all cells in the first column in the table defined like
<text:p text:style-name="P15">
  <text:span text:style-name="T4">111111</text:span>
</text:p>

and the dynamic style P15 defined with
<style:style style:name="P15" style:family="paragraph" 
             style:master-page-name="Standard">
             .../>
</style:style>

where master-page-name == RES_PAGEDESC - aka a break-before-with-page-style.

You can't see this in the UI - probably because it is illegal.
Again - not touching this as per comment 15.
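
To check whether another document carries the same hidden attribute, the text of its content.xml can be scanned for automatic styles that set style:master-page-name. The following is only a rough sketch: stylesWithMasterPage is a hypothetical helper name, it uses plain regex matching rather than a real XML parser, and it skips empty attribute values on the assumption that an empty value requests no page-style change.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the style:name of every <style:style> start
// tag that has a non-empty style:master-page-name attribute.
// This is plain text matching over the XML source, not real XML parsing.
std::vector<std::string> stylesWithMasterPage(const std::string& xml)
{
    // A start tag may span several lines; [^>]* also matches newlines.
    static const std::regex tagRe(R"re(<style:style\s[^>]*>)re");
    static const std::regex nameRe(R"re(\sstyle:name="([^"]*)")re");
    // Assumption: an empty value means "no page-style change", so skip it.
    static const std::regex masterRe(R"re(\sstyle:master-page-name="([^"]+)")re");

    std::vector<std::string> result;
    for (std::sregex_iterator it(xml.begin(), xml.end(), tagRe), end; it != end; ++it)
    {
        const std::string tag = it->str();
        std::smatch name;
        std::smatch master;
        if (std::regex_search(tag, master, masterRe) &&
            std::regex_search(tag, name, nameRe))
        {
            result.push_back(name[1].str());
        }
    }
    return result;
}
```

Run against the style shown above, this would report P15. If a reported style is used by paragraphs inside table cells, the document is a candidate for the same export problem.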
Comment 18 Xisco Faulí 2021-04-13 10:35:26 UTC
Still reproducible in

Version: 7.2.0.0.alpha0+ / LibreOffice Community
Build ID: 8043fe3e45c8999c8eaf475ba46d50b125e38b93
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 19 QA Administrators 2023-04-14 03:25:33 UTC Comment hidden (obsolete, spam)
Comment 20 ThomasBourchier 2023-04-19 07:40:20 UTC
Tested on version 7.4.6.2 and all now looks good. Closing.
Comment 21 Timur 2023-04-19 09:30:54 UTC
Thanks for testing, really OK.
If a fixing commit is not known, it's not marked FIXED but WFM.
Comment 22 Justin L 2023-04-19 11:33:34 UTC
I'm not sure how it could have been fixed in 7.4.6. I didn't see it fixed in Linux bibisect 7.4 - only in 7.5. The fix was not backported to 7.4 either.

LO 7.5 commit c37f62b71fa59917ef85ff98480dff18aa936e41
Author: Justin Luth on Wed Jul 20 13:03:13 2022 -0400
    tdf#145998 sw ms export: use page break, not section break
    
    If possible, use a simple page break instead of a section break.
    
    Eliminate unnecessary "page style" changes. If the page will
    become that style anyway, then a simple page break will suffice.
    
    The benefit is primarily for LO import, since it is virtually
    impossible on import to know if a section is identical
    to the previous section. Thus we have previously multiplied
    page styles - often redundantly.
    
    This also starts to fix a real problem with first headers showing up
    on an unnecessary new page style. Unit test deals with this.
    
    Change-Id: Ib9e24bbd579b29aa21efb2b85750ecfcb8c7e5cb
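
The rule described in the commit message can be illustrated with a deliberately simplified model. This is not the LibreOffice code: PageStyle, BreakKind and chooseBreak are hypothetical names, and the sketch only compares style names, ignoring whatever other conditions the real exporter checks.

```cpp
#include <string>

// Hypothetical, simplified page style: its own name plus the name of the
// style that the following page uses by default.
struct PageStyle
{
    std::string name;
    std::string nextStyleName;
};

enum class BreakKind
{
    PageBreak,    // plain page break, no new section
    SectionBreak  // section break (next page) carrying a page-style change
};

// Picks the break to write for a "break before with page style" request.
// If the page after currentStyle would use requestedStyleName anyway, the
// style change is redundant and a plain page break is enough.
BreakKind chooseBreak(const PageStyle& currentStyle,
                      const std::string& requestedStyleName)
{
    if (currentStyle.nextStyleName == requestedStyleName)
        return BreakKind::PageBreak;
    return BreakKind::SectionBreak;
}
```

In this model, a page in style "Standard" whose next style is also "Standard" yields a plain page break when "Standard" is requested again, which is the redundant case the commit message describes.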
Comment 23 Justin L 2023-04-19 11:42:31 UTC
I assume ThomasBourchier missed comment 10 and was testing with RTF export which has already worked since at least 6.2. The bug report was limited to DOC export, and that was fixed in 7.5.

Also confirmed that MS Word 2010 opens it correctly.