Bug 123285 - DATA LOSS: Writer AutoFormat "Delete spaces and tabs at end and start of line" dropping characters from text with built-in paragraph styles
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 6.2.0.3 release (earliest affected)
Hardware: All All
Importance: high major
Assignee: Michael Stahl (CIB)
URL:
Whiteboard: target:6.3.0 target:6.2.4
Keywords: bibisected, bisected, regression
Depends on:
Blocks: AutoCorrect-Complete
Reported: 2019-02-09 10:39 UTC by John L. ten Wolde
Modified: 2019-05-09 06:17 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Before and After screenshots of the breakage annotated using Draw. (240.50 KB, image/png)
2019-02-09 10:39 UTC, John L. ten Wolde

Description John L. ten Wolde 2019-02-09 10:39:51 UTC
Created attachment 149039
Before and After screenshots of the breakage annotated using Draw.

Hi all.  I upgraded from 6.1.4.2 to 6.2.0.3 last evening and am now seeing some truly odd behavior.  My install is from the Linux x86_64 RPM batch, but I very much doubt that the issue I'm experiencing is OS-specific.


BACKGROUND:

My current use case involves manuscripts and screenplays in traditional "old-school" typewriter format, thus mandating 10-pitch monospace fonts with two spaces between sentences and after colons, etc.  The key point here is that I always keep the AutoCorrect Options' "Ignore double spaces" setting unchecked.


PROBLEM:

Since the move to 6.2.0.3, I've been noticing that if a sentence boundary within a paragraph falls at a line break (i.e. one line ends with a <PERIOD><SPACE><SPACE> combination and the following sentence is pushed to the next line), my next Carriage Return (pressing ENTER) causes the two spaces at the boundary to be reduced to one space and the characters at the beginning of all subsequent lines to be replaced by spaces.  The paragraph then reflows with the text mangled.  Confusing, right?  Hopefully the attached screenshots will clarify what I'm describing.


EXPERIMENTATION:  

I haven't been able to determine what's causing it, but I have discovered that the issue only plagues the most fundamental Paragraph Styles bundled with Writer.  I've so far confirmed the problem to affect the "Caption", "Heading", "First Line Indent", "Hanging Indent", and "Text Body Indent" styles.  Others will probably manifest the issue as well, but I haven't tested them all.  "Default Style" and "Text Body" are unaffected!

Based on the above, I wondered if paragraph indentation was causing the problem.  Nope.  I fiddled with the properties (indentation, line spacing, etc.) of the various out-of-the-box styles and could neither induce the bug in those unaffected, nor prevent it from occurring in those that were.  I created a number of custom styles at various points along the style tree and none of them displayed the problem.  This was weirdly true even if I "cloned" a malfunctioning one.  For example, I created an unmodified child of "First Line Indent" and it did *not* inherit the issue from its parent.


STEPS TO REPRODUCE:

1.  Open a new blank document.  Some steps below assume a new instance of the Default (A4) template.  You'll likely also want to Toggle Formatting Marks (CTRL+F10) to visible.

2.  Confirm that Tools -> AutoCorrect Options... -> Options Tab -> "Ignore double spaces" is unchecked.

3.  Populate the document with enough text to produce a paragraph several lines long and ensure that at least one line breaks (that is to say, *ends*) on two space characters.  In my screenshots I used the following:

---
Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.
---


4.  Set the paragraph style to "First Line Indent".  If you copy-pasted my text from Step 3, set the font for all text to Liberation Serif (if it isn't already) and change the font size to 14.  This should cause a sentence boundary to fall at the break between the 4th and 5th lines, with two space characters at the end of the 4th.  If you didn't use my text, use whatever font and size you please (it doesn't seem to matter) but, again, make sure at least one line ends with two spaces.

5.  Move to the end of the paragraph and press ENTER.
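
As a rough aid for Step 3, here is a minimal sketch for finding which lines of a paragraph would end on two spaces.  It assumes a fixed-pitch font where every character is one column wide (Liberation Serif at size 14 is proportional, so this only approximates Step 4), and it uses a simple greedy wrap in which trailing spaces stay on the line they follow.  The names `build_paragraph`, `wrap_keep_spaces` and `lines_ending_in_double_space` are hypothetical helpers, not part of LibreOffice, and Writer's real line breaking may well differ.

```python
import re

SENTENCE = "Here is a short sentence demonstrating this very peculiar bug."


def build_paragraph(repeats: int = 9) -> str:
    """Join the sample sentence with two spaces between sentences."""
    return "  ".join([SENTENCE] * repeats)


def wrap_keep_spaces(text: str, width: int) -> list[str]:
    """Greedy wrap for a fixed-pitch font (an assumption of this sketch).

    A word moves to the next line when it would exceed `width` columns.
    Spaces following a word are kept on that word's line, so no character
    of the input is dropped.
    """
    lines: list[str] = []
    current = ""
    for word, spaces in re.findall(r"(\S+)(\s*)", text):
        if current and len(current) + len(word) > width:
            lines.append(current)
            current = ""
        current += word + spaces
    if current:
        lines.append(current)
    return lines


def lines_ending_in_double_space(lines: list[str]) -> list[int]:
    """Return the 1-based numbers of lines that end with two spaces."""
    return [n for n, line in enumerate(lines, start=1) if line.endswith("  ")]


if __name__ == "__main__":
    wrapped = wrap_keep_spaces(build_paragraph(), width=62)
    print(lines_ending_in_double_space(wrapped))
```

Any line number this reports is a candidate spot for the breakage once ENTER is pressed in Step 5; try a few `width` values to mimic different page or font settings.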


CONCLUSION:

I fully expect the paragraph to have become mangled.  If no one else can confirm or reproduce this (mis)behavior I'll scream and pull my hair out.  Well, no, not really.  Instead I'll revert to 6.1.5.x in the meanwhile but still be very, very annoyed.

Seriously though, I'm tempted to mark this bug as CRITICAL because data loss can occur if a user fails to notice the characters disappearing, saves the document, and then closes it.  If they are lost, putting all those missing characters back one-by-one, after-the-fact, across multiple paragraphs in a large document would be an absolute nightmare!  Note that CTRL+Z *will* undo the damage if it's caught in time while the document is still open.

Thanks in advance to all those who look into this matter, and thanks in general to everyone for all your continued hard work.
Comment 1 John L. ten Wolde 2019-02-10 21:33:59 UTC
Having gone back and (re)tested the old, familiar behavior with 6.1.4.2 and 6.1.5.2, I now see that two (or more) spaces at the end of a line were already being truncated to one in those versions. I likely never noticed this because the corruption of other characters wasn't occurring.

I didn't state an expected behavior in my initial report, but obviously it would be that, when the "Ignore double spaces" setting is unchecked, *all* characters input by the user (recurrent white-space or otherwise) should be preserved, intact, within a paragraph, regardless of their position on a line.
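
The expectation above can be phrased as a simple check: the paragraph text after pressing ENTER should be identical to the text before it.  Below is a minimal sketch using Python's standard `difflib`, assuming the two versions of the paragraph have been copied out as plain strings.  `report_changes` and `is_preserved` are hypothetical helper names, and nothing here talks to Writer itself.

```python
import difflib


def report_changes(before: str, after: str) -> list[tuple[str, str, str]]:
    """Return (operation, old_text, new_text) for every span that differs."""
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    return [
        (tag, before[i1:i2], after[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def is_preserved(before: str, after: str) -> bool:
    """True when no character (white-space or otherwise) was changed."""
    return not report_changes(before, after)


if __name__ == "__main__":
    original = "peculiar bug.  Here is a short sentence"
    # Hypothetical mangled result: one space dropped at the sentence boundary.
    mangled = "peculiar bug. Here is a short sentence"
    for change in report_changes(original, mangled):
        print(change)
```

Run against a saved copy of a document's text, a check like this could help locate dropped characters without proofreading every line by eye.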
Comment 2 John L. ten Wolde 2019-03-08 21:45:26 UTC
Problem persists in 6.2.1.2.
Comment 3 Dieter Praas 2019-03-10 20:47:05 UTC
I confirm this with

Version: 6.3.0.0.alpha0+ (x64)
Build ID: 91cdf22b88a4f7bec243c8fb187627e766d3294c
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2019-03-08_00:38:10
Locale: en-US (de_DE); UI-Language: en-US
Calc: threaded

but not with

Version: 6.1.5.2 (x64)
Build ID: 90f8dcf33c87b3705e78202e3df5142b201bd805
CPU threads: 4; OS: Windows 10.0; UI render: default; 
Locale: de-DE (de_DE); Calc: group threaded
Comment 4 Kevin Suo 2019-05-01 10:24:52 UTC
Bibisected using the bibisect-linux-64-6.2 repo:

1a75d0939f3460074cc5d63464ea9cb32c8f1d53 is the first bad commit
commit 1a75d0939f3460074cc5d63464ea9cb32c8f1d53
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Fri Dec 7 19:25:34 2018 +0100

    source sha:180e5f515c9cd21fb8057c797a480eca7d9ed260

author Michael Stahl <Michael.Stahl@cib.de> 2018-11-27 18:28:41 +0100
committer Thorsten Behrens <Thorsten.Behrens@CIB.de> 2018-12-07 13:08:58 +0100
commit 180e5f515c9cd21fb8057c797a480eca7d9ed260
sw_redlinehide_4a: SwAutoFormat iterates frames, not nodes

Adding Michael Stahl to CC; would you please take a look?

git bisect log
# bad: [8b2cf76950e32ea46ff293ce75177841ad920e38] source sha:1149d20ce9f8682b58f98d3fa3bf289fc5974087
# good: [f741463dfe1900d3acf87b538c0c043e42bc523d] source sha:3a801799536e6870f2fb111b1cc00b9575a35a39
git bisect start 'master' 'oldest'
# good: [f2a6a57cb2d0fbe45c9cfafdcf33a816c6ced63f] source sha:da8617d69a7b27a3eeb3f26e207ddf1b4de3eeb3
git bisect good f2a6a57cb2d0fbe45c9cfafdcf33a816c6ced63f
# good: [6ed11ecb665d289c0981da1f10c42877a870e5a2] source sha:b44b9c4519794d159b154a9713c10da1155a5198
git bisect good 6ed11ecb665d289c0981da1f10c42877a870e5a2
# good: [b86f40dd1acc3d1baa3ffb3c15ac9f07eb53da66] source sha:f918e71d4e615fcc4527051a6e7f6bb4768d1269
git bisect good b86f40dd1acc3d1baa3ffb3c15ac9f07eb53da66
# bad: [9d66837c12695174f070b7b8484f4f7ea949237b] source sha:c873c0ad7bd0ebb5561adea33c8fdec3e49d234a
git bisect bad 9d66837c12695174f070b7b8484f4f7ea949237b
# good: [6a023218c43a758e50f9f268c38f6c924ba83ce9] source sha:c7fcf48ee7dffcff51d8bc9f41cfe55a71b56955
git bisect good 6a023218c43a758e50f9f268c38f6c924ba83ce9
# bad: [43309aec2caa53c2cd3f5e456cfc0db7da4894a6] source sha:56cbc21e0a5837726f4a68c311b68433ad5064d1
git bisect bad 43309aec2caa53c2cd3f5e456cfc0db7da4894a6
# bad: [3e7b1f719630225efec64c884f84c5432ce590ff] source sha:a0a8c3a1be9f69f89cbdfbb037261ca6ab64e3af
git bisect bad 3e7b1f719630225efec64c884f84c5432ce590ff
# bad: [4d529de17b0f77be5e23b99f612683bb30872e99] source sha:3b03604d1bb48fc1c1337307d0ba259dca9fbf1e
git bisect bad 4d529de17b0f77be5e23b99f612683bb30872e99
# bad: [0c8ef240b12c7202293478a5a81ab6ebcebbb5e2] source sha:1d3a07415eda3014d67d7c56466a8ad1d0ec51d9
git bisect bad 0c8ef240b12c7202293478a5a81ab6ebcebbb5e2
# bad: [1d98e7c9050ad4a7dbbfbf5742912cba990b3315] source sha:938f8a6b387828b8c18819184c47a5245bdfac8a
git bisect bad 1d98e7c9050ad4a7dbbfbf5742912cba990b3315
# good: [3ea0b3c4c565497a2db1fa168b96b85509f283d1] source sha:2d8454e0829244e7b78c94e48e0fffe1c8139122
git bisect good 3ea0b3c4c565497a2db1fa168b96b85509f283d1
# bad: [1a75d0939f3460074cc5d63464ea9cb32c8f1d53] source sha:180e5f515c9cd21fb8057c797a480eca7d9ed260
git bisect bad 1a75d0939f3460074cc5d63464ea9cb32c8f1d53
# good: [ffec9e5fdd9a577667bdd045fbbf82d2bfa45ea9] source sha:fd6e92793fa5baf1469d2dad89ff12f9ad656986
git bisect good ffec9e5fdd9a577667bdd045fbbf82d2bfa45ea9
# first bad commit: [1a75d0939f3460074cc5d63464ea9cb32c8f1d53] source sha:180e5f515c9cd21fb8057c797a480eca7d9ed260
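
For anyone who wants to map the bibisect build commits in a log like the one above back to source commits, here is a minimal sketch that parses the `# good:` / `# bad:` / `# first bad commit:` comment lines.  It assumes the line layout shown in this log (build commit in square brackets, followed by `source sha:`); `parse_bisect_log` and `first_bad_source` are hypothetical helper names, and `SAMPLE_LOG` is just the last few lines of the log quoted here.

```python
import re
from typing import Optional

# Matches the comment lines that "git bisect log" emits in a bibisect repo,
# e.g. "# bad: [<build sha>] source sha:<source sha>".
BISECT_LINE = re.compile(
    r"^# (?P<verdict>good|bad|first bad commit): "
    r"\[(?P<build>[0-9a-f]+)\] source sha:(?P<source>[0-9a-f]+)$"
)

SAMPLE_LOG = """\
# bad: [1a75d0939f3460074cc5d63464ea9cb32c8f1d53] source sha:180e5f515c9cd21fb8057c797a480eca7d9ed260
git bisect bad 1a75d0939f3460074cc5d63464ea9cb32c8f1d53
# good: [ffec9e5fdd9a577667bdd045fbbf82d2bfa45ea9] source sha:fd6e92793fa5baf1469d2dad89ff12f9ad656986
git bisect good ffec9e5fdd9a577667bdd045fbbf82d2bfa45ea9
# first bad commit: [1a75d0939f3460074cc5d63464ea9cb32c8f1d53] source sha:180e5f515c9cd21fb8057c797a480eca7d9ed260
"""


def parse_bisect_log(log: str) -> list[dict[str, str]]:
    """Return one dict (verdict, build, source) per annotated log line."""
    entries = []
    for line in log.splitlines():
        match = BISECT_LINE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def first_bad_source(log: str) -> Optional[str]:
    """Return the source sha of the first bad commit, or None if absent."""
    for entry in parse_bisect_log(log):
        if entry["verdict"] == "first bad commit":
            return entry["source"]
    return None


if __name__ == "__main__":
    print(first_bad_source(SAMPLE_LOG))
```

The source sha is the one to look up in the core repository; the build sha only exists in the bibisect repository.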
Comment 5 Kevin Suo 2019-05-01 10:37:13 UTC
Short Steps to Reproduce:

1. Make sure that Tools -> AutoCorrect Options... -> Options -> "Ignore double spaces" is unchecked, and Tools -> AutoCorrect -> When Typing is checked.

2. Open the attached test odt file, and hit enter at the end of the paragraph.

Current Result:
Some characters are deleted in the 4th and 5th lines.
Comment 6 Xisco Faulí 2019-05-02 08:36:09 UTC
Adding Cc: to Michael Stahl
Comment 7 Commit Notification 2019-05-03 08:24:01 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/b69bc0facc6e0fbc2006125e656b82a7c2556203%5E%21

tdf#123285 sw_redlinehide: fix SwAutoFormat::DelMoreLinesBlanks()

It will be available in 6.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 8 Michael Stahl (CIB) 2019-05-03 08:30:20 UTC
fixed on master
Comment 9 Commit Notification 2019-05-03 13:28:43 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-6-2":

https://git.libreoffice.org/core/+/b6d8403847c02c95b2b8570cffb68fd57a999ef4%5E%21

tdf#123285 sw_redlinehide: fix SwAutoFormat::DelMoreLinesBlanks()

It will be available in 6.2.4.

Comment 10 John L. ten Wolde 2019-05-08 20:42:58 UTC
Hi. Original reporter here. Checked with 6.2.4.1 and yes, the behaviour appears to be back to normal. Thank you, gentlemen. Fine work. I greatly appreciate it.
Comment 11 Dieter Praas 2019-05-09 06:17:43 UTC
(In reply to John L. ten Wolde from comment #10)
> Hi. Original reporter here. Checked with 6.2.4.1 and yes, the behaviour
> appears to be back to normal. Thank you, gentlemen. Fine work. I greatly
> appreciate it.

=> VERIFIED FIXED