Created attachment 137470 [details] The reduced example document I'm not sure how to reproduce the behavior in the attached document (it's a minimized version of a 250 pages company doc made by a far away department), but it opens in a terribly broken way in LO 5.4.2. - Table cells have an overly huge indentation, causing the originally one page table overflow to 9 pages instead of 1. - The paragraphs after the table get a lettered numbering style, with a lot of letters appearing before them. They did not have a numbering style in Word. Also if I remove the page break from the first page, none of these happen. Version: 5.4.2.2 Build ID: 22b09f6418e8c2d508a9eaf86b2399209b0990f4 CPU threads: 4; OS: Windows 6.2; UI render: default; Locale: hu-HU (hu_HU); Calc: group
Created attachment 137471 [details] The table part in Word 2013
Created attachment 137472 [details] The table in LO 5.4
Created attachment 137473 [details] The text part in Word 2013
Created attachment 137474 [details] The text part in LO 5.4
Created attachment 137475 [details] The same file without the page break on the first page
Created attachment 137476 [details] The file without the page break on the first page in LO 5.4
Lokking at your image from comment 2 it looks like, that there is also a numbering in the table itself (I can't see it in image from comment 1). Could this cause the indent?
Reproducible in the most recent daily dbgutil build as today. Version: 6.0.0.0.alpha1+ Build ID: 2416d69ad4fa26b65d5b05a8575ac96af6b2c9a9 CPU threads: 4; OS: Linux 4.13; UI render: default; VCL: gtk3; Locale: zh-CN (zh_CN.UTF-8); Calc: group Set to NEW.
Bisected to LO 4.4 commit 29a91e4739496e1db5995fdece6a3ffa89e42774 Author: Mark Hung. Commit Date: Tue Nov 18 11:24:19 2014 +0100 Fix outline numbering for ooxml import filter. https://cgit.freedesktop.org/libreoffice/core/commit/?id=29a91e4739496e1db5995fdece6a3ffa89e42774 but even though the problem disappears when this is reverted (the WW8 style name is Cmsor4 which didn't match Heading4, but the converted name matches "Heading 4"), it has just uncovered an existing problem. The table problem can also be resolved by setting the infamous bRemove = false. Somehow, the Heading 4 paragraph numbering from the sectPr of the first section's deleted paragraph is being applied to the rest of the document.
(In reply to Justin L from comment #9) > but even though the problem disappears when this is reverted (the WW8 style > name is Cmsor4 which didn't match Heading4, "Heading 4" is "Címsor 4" in Hungarian, which is saved without the accented í character and the space.
Created attachment 144180 [details] numbering.docx: replaced Cmsor with Heading for 2, 3, and 4 (In reply to Gabor Kelemen from comment #10) > "Heading 4" is "Címsor 4" in Hungarian, which is saved without the accented > í character and the space. And this is good proof that Mark's commit is valid because the code logic is dependant on the use of English. Doing a bibisect based on this modified-to-use-english-style-names document, I run into a one-day range of broken builds in LO 4.3. Combining the results of 43all and 43max, I get the 20 commit range of https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=76fe205d7e0fe0a73616453209d8094cab9ce79f..a6e19ea7a8885dcbdb1d8b2f56586373595ba8aa of which this one stands out (although reverting it does not resolve the problem, so likely this one also just exposes an existing problem...) >author Michael Stahl 2014-03-06 00:20:00 +0100 >commit 2b78f2cd7b9e4bab0f3b3b9119238f36a1bbc7b2 >rhbz#988516: DOCX import: fix context stack when importing header/footer > >When a header/footer substream is parsed, a ParagraphGroup is started, >but not ended; so the properties of the last paragraph in the >header/footer are applied to a paragraph in the body. > >The obvious fix to add a call to endParagraphGroup() at the end of w:p >element breaks table cells. So add a call to endParagraphGroup() at >the end of the "hdr"/"ftr" element.> None of this points in a clear direction yet. I had to change Cmsor -> Heading for all three - just doing one at a time didn't work. However, the bug is less dependent on Cmsor2 (since I didn't change that in document.xml). Reference: (note these are bibisect commits, not code commits) first bad 43all bibisect commit: 56a3b3c781fc2eb55f46641d89a866a91119a8a3 first bad 43max bibisect commit: 1dad09c0413728618600e94167f71b6fdda422e3 last good 43max bibisect commit: bd286ce1d9d478895e4a9aacf3469b659917f5ff
(In reply to Justin L from comment #11) >rhbz#988516: DOCX import: fix context stack when importing header/footer This is the same commit that caused the similar problem fixed in bug 97417 Here is what is happening. 1.) Heading 4 is assigned as the paragraph style for the sectPr empty paragraph, which will be bRemove'd. bRemove tries to delete PROP_NUMBERING_RULES to prevent run-on numbering (bug 97417), but there is nothing to delete. 2.) finish paragraph finds no explicit NUMBERING_RULES, so it looks to the style, and Heading 4 of course is a numbering style in LO. So, now the NUMBERING_RULE property that bRemove tried to avoid has been secretly added.
Created attachment 144201 [details] tdf113608_runAwayNumbering.docx: minimized unit test
@Justin: I pushed a possible fix for this issue: https://gerrit.libreoffice.org/#/c/59117/, I will add your minimized unit test to it tomorrow, if it's ok for you. Thanks, László
László Németh committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=4b432a8c77ae4b5e8305ca69953bc202e7614664 tdf#113608 DOCX import: don't use numbering of removed paragraphs It will be available in 6.2.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Verified in Version: 6.2.0.0.alpha0+ Build ID: 401cba4c20fbc930f034168872642428d7459218 CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; Locale: ca-ES (ca_ES.UTF-8); Calc: threaded @László Németh, Thanks for fixing this!!
@Justin, Xisco: Thanks for your help and feedback!