Created attachment 143454 [details]
Example document, reduced from a user doc

The attached simplified user document contains a simple 1x1 table. The cell contains some text and a <w:cr/> tag. When the document is opened in Writer, the content before the <w:cr/> tag appears above the table, outside the cell.

Actual results:
The text before the <w:cr/> tag appears outside the table in the LibreOffice view.

Expected results:
The whole text appears in the cell. The <w:cr/> tag is removed.

LibreOffice details:
Version: 6.2.0.0.alpha0+
Build ID: bb1d5780226bb1b9156580972eea9aa849178742
CPU threads: 1; OS: Windows 6.1; UI render: default;
TinderBox: Win-x86@42, Branch:master, Time: 2018-07-03_05:56:48
Locale: hu-HU (hu_HU); Calc: group threaded
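For triaging similar documents, a rough way to check whether a DOCX has a <w:cr/> inside a table cell is sketched below. This is only an illustration, not part of LibreOffice: the function names has_cr_in_table_cell and make_test_archive are made up for this comment, and the test archive holds only word/document.xml, so it is not a complete DOCX.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def has_cr_in_table_cell(docx_file) -> bool:
    """Return True if word/document.xml has a <w:cr/> inside any <w:tc>.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    for cell in root.iter(f"{{{W_NS}}}tc"):
        if cell.find(f".//{{{W_NS}}}cr") is not None:
            return True
    return False


def make_test_archive(cell_runs: str) -> io.BytesIO:
    """Hypothetical helper: zip with only word/document.xml (not a full DOCX)."""
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:tbl><w:tr><w:tc><w:p>'
        f"{cell_runs}"
        "</w:p></w:tc></w:tr></w:tbl></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    sample = make_test_archive(
        "<w:r><w:t>before</w:t><w:cr/><w:t>after</w:t></w:r>"
    )
    print(has_cr_in_table_cell(sample))
```

The check only looks at the main document part; headers, footers and footnotes live in other parts of the package and would need the same treatment.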
Created attachment 143455 [details] Screenshot of the document in Word
Created attachment 143456 [details] The document in Writer
Created attachment 143459 [details] Another example version of the reduced user doc
Created attachment 143461 [details]
The other example in LO 6.2alpha and Word 2013

In a more complicated table the entire table structure disappears, leaving only the cell contents behind.

We have no idea how the users managed to create the original document in Word - it contained change tracking entries and comments from multiple organizations as well.
Reproduced in

Version: 6.2.0.0.alpha0+
Build ID: c290f692dd28094d41dff686f3faa1c4e14b556e
CPU threads: 4; OS: Linux 4.13; UI render: default; VCL: gtk3;
Locale: ca-ES (ca_ES.UTF-8); Calc: group threaded

Version: 5.2.0.0.alpha0+
Build ID: 3ca42d8d51174010d5e8a32b96e9b4c0b3730a53
Threads 4; Ver: 4.10; Render: default;

Version: 4.3.0.0.alpha1+
Build ID: c15927f20d4727c3b8de68497b6949e72f9e6e9e

LibreOffice 3.3.0
OOO330m19 (Build:6)
tag libreoffice-3.3.0.4
@Laszlo, I think you should be interested in this one.
Proposed fix: https://gerrit.libreoffice.org/#/c/60585/

tdf#118691 DOCX import: fix table loss caused by <w:cr>

According to the OOXML standard, <w:cr> (carriage return, Unicode character 000D) is equivalent to a break with null type and clear attributes, so we now handle it as a <w:br/> instead of endOfParagraph. This fixes the loss of table paragraphs and of tables that contain <w:cr/>.

Note: It seems that MSO cannot handle carriage return characters in table cells correctly. It shows squares (unknown characters) there, without a line break. Copying this text to a non-table paragraph in MSO gives the correct layout with line breaks. Copying the text with carriage return characters back into a table cell gives squares again.

With this LO fix, bad tables edited by MS Word can be repaired using LO, because LibreOffice import/export converts every <w:cr> to <w:br> (as before, but now without destroying the structure of the tables).
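To illustrate the equivalence the fix relies on, here is a minimal sketch that renames every <w:cr/> in a WordprocessingML fragment to <w:br/>. It is not the actual patch (the fix itself is in the LibreOffice DOCX import code), and the function name cr_to_br is made up for this comment. It also assumes a fragment that only uses the w: namespace; on a real document.xml, ElementTree may rewrite other namespace prefixes, so this should not be treated as a repair tool.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Keep the conventional "w" prefix when serializing.
ET.register_namespace("w", W_NS)


def cr_to_br(document_xml: str) -> str:
    """Return the XML with every <w:cr/> renamed to a plain <w:br/>.

    A <w:br/> without type or clear attributes is the kind of break
    that <w:cr/> is described as equivalent to.
    """
    root = ET.fromstring(document_xml)
    # Collect first, then rename, so the tree is not changed mid-iteration.
    for element in list(root.iter(f"{{{W_NS}}}cr")):
        element.tag = f"{{{W_NS}}}br"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:tbl><w:tr><w:tc><w:p>'
        "<w:r><w:t>before</w:t><w:cr/><w:t>after</w:t></w:r>"
        "</w:p></w:tc></w:tr></w:tbl></w:body></w:document>"
    )
    print(cr_to_br(sample))
```

The table, row, cell and paragraph elements are left untouched; only the element name of the break changes, which matches the idea that the cell content stays in one paragraph.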
László Németh committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=f63a60f56156e4ac17887e6c96d15fb865a2a8eb

tdf#118691 DOCX import: fix table loss caused by <w:cr>

It will be available in 6.2.0.

The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
László Németh committed a patch related to this issue.
It has been pushed to "libreoffice-6-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=8693f6fa799c43304741f465c23e827c3ceafd9d&h=libreoffice-6-1

tdf#118691 DOCX import: fix table loss caused by <w:cr>

It will be available in 6.1.2.

The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
*** Bug 116889 has been marked as a duplicate of this bug. ***