Created attachment 161529 [details]
Comparison MSO 2010 and LibreOffice 7.0 master

Steps to reproduce:
1. Open the attached document (either the DOC or the DOCX document).

-> The first line breaks in the middle. It should reach the end of the paragraph. See the comparison image.

Reproduced in
Version: 7.0.0.0.alpha1+
Build ID: 82894d85147840f1f587e9530b12f0058f2ef2c3
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded

[Bug found by office-interoperability-tools]
Created attachment 161530 [details] DOCX file
Created attachment 161531 [details] DOC file
I've bisected it with bibisect-linux64-6.0 and it points to:

author Eike Rathke <erack@redhat.com> 2017-11-17 11:03:45 +0100
committer Eike Rathke <erack@redhat.com> 2017-11-20 19:28:10 +0100
commit 9206a08ada00e8762c4a634f242bd566028964bb (patch)
tree eaa317ce6717d44f75c077a6db147b0ebd4994b7
parent a8687041c46b3fe93a76faa0a4a65e7069ef5e9d (diff)

Upgrade to ICU 60.1

So might Writer be interpreting a Unicode character as a line break?

@Justin, I thought you might be interested in this issue...
It is not being read in as a line break. (There is no linebreak character indicated with reveal formatting.) Add more spaces, and it will jump back up to the top line.
Created attachment 161544 [details]
Untitled 1234b.odt: copy/paste of the text into a new ODT to (presumably) avoid compat flags.

I don't think this is related to MS formats.
I confirm it with
Version: 7.0.0.0.beta1 (x64)
Build ID: 94f789cbb33335b4a511c319542c7bdc31ff3b3c
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win
Locale: de-DE (de_DE); UI: en-GB
Calc: CL

and Word 2016
Created attachment 163441 [details]
semicolonedNonBreakingWhitespace_133607.odt: cleanroom demonstration

This seems to be related specifically to the semi-colons (discovered through trial and error). Apparently they have a special meaning when they follow whitespace.

Reproducible steps:
1.) Type any sentence in Writer just one word longer than one line, so that it wraps to the next line.
2.) Starting from the last word, add a semi-colon in front of it. Notice that the previous word is now pulled down in front of it.
3.) Repeat.

If you DELETE a semi-colon, the text will not re-flow backwards, but if you save/re-open, then the text will re-flow backwards.

This is probably intentional behaviour. I'd guess that if it is not intentional, then it is an ICU bug and NOTOURBUG. @Eike might be able to provide more knowledgeable insight.
Tested after yesterday's

author Eike Rathke on 2020-11-17 16:33:33 +0100
commit 8335c8c20765d4f167d9b48e6a2757864a3bc7fd
Update to ICU 68.1

and still the same thing. A space followed by a semi-colon is treated as a keep-with-next-word flag.
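To make the observed rule concrete, here is a toy model of it in Python. This is only a sketch of the behaviour described in this comment, not ICU's actual line-break implementation: `break_opportunities` and `wrap` are hypothetical helper names, and plain character counts stand in for real text layout.

```python
def break_opportunities(text, keep_semicolon_rule=True):
    """Return the indices at which a new line is allowed to start.

    A break is allowed at the first non-space character after a space,
    except (when the rule is on) if that character is a semi-colon.
    """
    ops = []
    for i in range(1, len(text)):
        if text[i - 1] == " " and text[i] != " ":
            if keep_semicolon_rule and text[i] == ";":
                continue  # " ;" glues the previous word to the next one
            ops.append(i)
    return ops


def wrap(text, width, keep_semicolon_rule=True):
    """Greedy wrap: fill each line up to the last allowed break that fits."""
    ops = break_opportunities(text, keep_semicolon_rule)
    lines, start = [], 0
    while len(text) - start > width:
        later = [p for p in ops if p > start]
        if not later:
            break  # no break allowed any more; the rest overflows
        fitting = [p for p in later if len(text[start:p].rstrip()) <= width]
        cut = fitting[-1] if fitting else later[0]
        lines.append(text[start:cut].rstrip())
        start = cut
    lines.append(text[start:].rstrip())
    return lines


# Hypothetical sample text; 12 columns stands in for the page width.
print(wrap("aaa bbb ccc ;ddd", 12, keep_semicolon_rule=False))
# ['aaa bbb ccc', ';ddd']
print(wrap("aaa bbb ccc ;ddd", 12))
# ['aaa bbb', 'ccc ;ddd']
```

With the rule switched on, "ccc" can no longer stay on the first line because no break is allowed between it and ";ddd", which matches the first line ending early in the attached documents.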
repro 7.3+ with new ICU 70.1.