Bug 123285 - DATA LOSS: Writer's bundled paragraph styles dropping characters from text.
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 6.2.0.3 release
Hardware: All All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisectRequest, regression
Depends on:
Blocks: AutoCorrect-Complete
Reported: 2019-02-09 10:39 UTC by John L. ten Wolde
Modified: 2019-03-10 20:47 UTC (History)
CC List: 2 users

See Also:
Crash report or crash signature:


Attachments
Before and After screenshots of the breakage annotated using Draw. (240.50 KB, image/png)
2019-02-09 10:39 UTC, John L. ten Wolde

Description John L. ten Wolde 2019-02-09 10:39:51 UTC
Created attachment 149039
Before and After screenshots of the breakage annotated using Draw.

Hi all.  I upgraded from 6.1.4.2 to 6.2.0.3 last evening and am now seeing some truly odd behavior.  My install is from the Linux x86_64 RPM batch, but I very much doubt that the issue I'm experiencing is OS-specific.


BACKGROUND:

My current use case involves manuscripts and screenplays in traditional "old-school" typewriter format, thus mandating 10-pitch monospace fonts with two spaces between sentences and after colons, etc.  The key point here is that I always keep the AutoCorrect Options' "Ignore double spaces" setting unchecked.


PROBLEM:

Since the move to 6.2.0.3, I've noticed that if a sentence division within a paragraph falls across a line break (i.e. one line ends with a <PERIOD><SPACE><SPACE> combination and the following sentence is pushed to the next line), my next Carriage Return causes the two spaces at the division to be reduced to one space and the characters at the beginning of all subsequent lines to be replaced by spaces.  The paragraph then reflows with the text mangled.  Confusing, right?  Hopefully the attached screenshots will clarify what I'm describing.


EXPERIMENTATION:  

I haven't been able to determine what's causing it, but I have discovered that the issue only plagues the most fundamental Paragraph Styles bundled with Writer.  I've so far confirmed the problem to affect the "Caption", "Heading", "First Line Indent", "Hanging Indent", and "Text Body Indent" styles.  Others will probably manifest the issue as well, but I haven't tested them all.  "Default Style" and "Text Body" are unaffected!

Based on the above, I wondered if paragraph indentation was causing the problem.  Nope.  I fiddled with the properties (indentation, line spacing, etc.) of the various out-of-the-box styles and could neither induce the bug in those unaffected nor prevent it from occurring in those that were.  I created a number of custom styles at various points along the style tree and none of them displayed the problem.  This was weirdly true even if I "cloned" a malfunctioning one.  For example, I created an unmodified child of "First Line Indent" and it did *not* inherit the issue from its parent.


STEPS TO REPRODUCE:

1.  Open a new blank document.  Some steps below assume a new instance of the Default (A4) template.  You'll likely also want to toggle Formatting Marks (CTRL+F10) so that they are visible.

2.  Confirm that Tools -> AutoCorrect Options... -> Options Tab -> "Ignore double spaces" is unchecked.

3.  Populate the document with enough text to produce a paragraph several lines long and ensure that at least one line breaks (that is to say, *ends*) on two space characters.  In my screenshots I used the following:

---
Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.  Here is a short sentence demonstrating this very peculiar bug.
---


4.  Set the paragraph style to "First Line Indent".  If you copy-pasted my text from Step 3, set the font for all text to Liberation Serif (if it isn't already) and change the font size to 14.  This should cause a two-sentence division to break across the 4th and 5th lines with two space characters at the end of the 4th.  If you didn't use my text, use whatever font and size you please (it doesn't seem to matter) but, again, make sure at least one line ends with two spaces.

5.  Move to the end of the paragraph and press ENTER.
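
For anyone who would rather not copy-paste from the bug page (where browsers may collapse the double spaces), the sample paragraph from Step 3 can be regenerated with a few lines of Python.  This is only a convenience sketch: the helper names and the repeat count of nine are my own assumptions, not part of Writer, so adjust them if your copy of the sample differs.

```python
# Sketch only: rebuilds the Step 3 sample text so it can be pasted into Writer.
# SENTENCE is taken from the report; REPEATS and both helper names are assumptions.
SENTENCE = "Here is a short sentence demonstrating this very peculiar bug."
REPEATS = 9  # assumed number of repetitions in the sample paragraph


def build_sample(sentence: str = SENTENCE, repeats: int = REPEATS) -> str:
    """Join the sentences with two spaces, typewriter style."""
    return "  ".join([sentence] * repeats)


def double_space_divisions(text: str) -> int:
    """Count sentence divisions written as <PERIOD><SPACE><SPACE>."""
    return text.count(".  ")


if __name__ == "__main__":
    sample = build_sample()
    print(sample)
    print(double_space_divisions(sample), "double-spaced sentence divisions")
```

The count is handy as a sanity check before Step 4: if it is lower than expected after pasting, the double spaces were already lost on the way into the document rather than by the bug itself.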


CONCLUSION:

I fully expect the paragraph to have become mangled.  If no one else can confirm or reproduce this (mis)behavior I'll scream and pull my hair out.  Well, no, not really.  Instead I'll revert to 6.1.5.x in the meantime but still be very, very annoyed.

Seriously though, I'm tempted to mark this bug as CRITICAL because data loss can occur if a user fails to notice the characters disappearing, saves the document, and then closes it.  If they are lost, putting all those missing characters back one-by-one, after-the-fact, across multiple paragraphs in a large document would be an absolute nightmare!  Note that CTRL+Z *will* undo the damage if it's caught in time while the document is still open.

Thanks in advance to all those who look into this matter, and thanks in general to everyone for all your continued hard work.
Comment 1 John L. ten Wolde 2019-02-10 21:33:59 UTC
Having gone back and (re)tested the old, familiar behavior with 6.1.4.2 and 6.1.5.2, I now see that two (or more) spaces at the end of a line were already being truncated to one in those versions. I likely never noticed this because the corruption of other characters wasn't occurring.

I didn't state an expected behavior in my initial report, but obviously it would be that, when the "Ignore double spaces" setting is unchecked, *all* characters input by the user (recurrent white-space or otherwise) should be preserved, intact, within a paragraph, regardless of their position on a line.
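
To make that expectation checkable, here is a rough Python sketch that compares a paragraph's text before and after pressing ENTER.  It assumes you have already captured both strings yourself (for example by copying the paragraph out of Writer); the function names are hypothetical, and the damaged string in the usage lines is made up for illustration, not real output from the bug.

```python
from collections import Counter


def missing_characters(before: str, after: str) -> Counter:
    """Visible (non-whitespace) characters present in `before` but gone from `after`."""
    visible_before = Counter(ch for ch in before if not ch.isspace())
    visible_after = Counter(ch for ch in after if not ch.isspace())
    # Counter subtraction keeps only the characters whose count dropped.
    return visible_before - visible_after


def double_spaces_lost(before: str, after: str) -> int:
    """How many <PERIOD><SPACE><SPACE> divisions disappeared (never negative)."""
    return max(0, before.count(".  ") - after.count(".  "))


if __name__ == "__main__":
    original = "peculiar bug.  Here is a short sentence"
    damaged = "peculiar bug.  ere is a short sentence"  # hypothetical damage
    print(missing_characters(original, damaged))
    print(double_spaces_lost("bug.  Here", "bug. Here"))
```

An empty result from the first function together with a zero from the second would mean the paragraph satisfies the expected behavior above; anything else means characters the user typed were not preserved.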
Comment 2 John L. ten Wolde 2019-03-08 21:45:26 UTC
Problem persists in 6.2.1.2.
Comment 3 Dieter Praas 2019-03-10 20:47:05 UTC
I confirm this with

Version: 6.3.0.0.alpha0+ (x64)
Build ID: 91cdf22b88a4f7bec243c8fb187627e766d3294c
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2019-03-08_00:38:10
Locale: en-US (de_DE); UI-Language: en-US
Calc: threaded

but not with

Version: 6.1.5.2 (x64)
Build ID: 90f8dcf33c87b3705e78202e3df5142b201bd805
CPU threads: 4; OS: Windows 10.0; UI render: default; 
Locale: de-DE (de_DE); Calc: group threaded