Bug 122837 - LibreOffice covertly/silently butchers my text, breaks WYSIWYG principle when pasting as unformatted
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): unspecified
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-01-20 20:24 UTC by Zsolt
Modified: 2019-01-28 13:26 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
Examples of files with pasted text where the bug happens (one was pasted unformatted) (633 bytes, application/x-7z-compressed)
2019-01-20 23:12 UTC, Zsolt
html-file with two spaces (8.98 KB, application/vnd.oasis.opendocument.text)
2019-01-28 13:20 UTC, Dieter

Description Zsolt 2019-01-20 20:24:47 UTC
Description:
I experience this super annoying bug. I even tried to disable auto-correct thinking that was the culprit.

In an HTML file, whenever I have two (or more) spaces between non-whitespace text, LibreOffice removes all but one of them, without notice or any indication that it is doing so.
Even if adding the second space is the only change to the document, and LibreOffice prompts me to save on exit, the change is not retained. It's always rejected silently, which is horrible: it shows one thing and something else happens.
(Actually, it always reduces the number of spaces to one, no matter how many I have.)

Example: "ffmpeg -i  -acodec [...]"
I leave two spaces after -i, where the input file of the command line goes, so that I can drag and drop the file there after pasting the command line into a command window. There's no way in hell the two spaces are retained in the saved file. Whenever I open the file, only one space is there.

As far as I know this happened with all versions of Libreoffice I ever used.

Steps to Reproduce:
1

Actual Results:
2

Expected Results:
3


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Zsolt 2019-01-20 20:29:25 UTC
I didn't write steps, because I already explained everything I could.
Comment 2 Dieter 2019-01-20 21:22:51 UTC
I couldn't reproduce it with the following steps:

1. Open writer
2. Type "ffmpeg -i  -acodec [...]" (two spaces after -i)
3. Save as html file
4. Close and reopen the file

Actual result: two spaces after -i

Zsolt, please correct my steps, if they're wrong

Version: 6.1.4.2 (x64)
Build-ID: 9d0f32d1f0b509096fd65e0d4bec26ddd1938fd3
CPU threads: 4; OS: Windows 10.0; UI render: default; 
Locale: de-DE (de_DE); Calc: group threaded
Comment 3 Zsolt 2019-01-20 23:12:52 UTC
Created attachment 148458 [details]
Examples of files with pasted text where the bug happens (one was pasted unformatted)

(In reply to Dieter Praas from comment #2)
> I couldn't reproduce it with the following steps:
> 
> 1. Open writer
> 2. Type "ffmpeg -i  -acodec [...]" (two spaces after -i)
> 3. Save as html file
> 4. Close and reopen the file
> 
> Actual result: two spaces after -i
> 
> Zsolt, please correct my steps, if they're wrong
> 
> Version: 6.1.4.2 (x64)
> Build-ID: 9d0f32d1f0b509096fd65e0d4bec26ddd1938fd3
> CPU-Threads: 4; BS: Windows 10.0; UI-Render: Standard; 
> Gebietsschema: de-DE (de_DE); Calc: group threaded

Me neither, interesting. However, if I copy-paste the text from my document, the bug happens. Even if I paste as plain text, which is even more mysterious.
I'll attach sample files.

Another annoying aspect I forgot about earlier is that I'm prompted to save the document even when I don't modify it and only copy text. This is also reproducible in the samples by copying the line.
Comment 4 Durgapriyanka 2019-01-22 17:48:33 UTC
Thank you for reporting this bug. I can reproduce this bug in

Version: 6.3.0.0.alpha0+
Build ID: 3c964980da07892a02d5ac721d80558c459532d0
CPU threads: 2; OS: Windows 6.1; UI render: default; VCL: win; 
TinderBox: Win-x86@42, Branch:master, Time: 2018-12-12_02:07:45
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded

and in

Version: 6.1.3.2
Build ID: 86daf60bf00efa86ad547e59e09d6bb77c699acb
CPU threads: 2; OS: Windows 6.1; UI render: default; 
Locale: en-US (en_US); Calc: group threaded
Comment 5 Zsolt 2019-01-22 17:59:35 UTC
(In reply to Durgapriyanka from comment #4)
> Thank you for reporting this bug. I can reproduce this bug in
> 
> Version: 6.3.0.0.alpha0+
> Build ID: 3c964980da07892a02d5ac721d80558c459532d0
> CPU threads: 2; OS: Windows 6.1; UI render: default; VCL: win; 
> TinderBox: Win-x86@42, Branch:master, Time: 2018-12-12_02:07:45
> Locale: en-US (en_US); UI-Language: en-US
> Calc: threaded
> 
> and in
> 
> Version: 6.1.3.2
> Build ID: 86daf60bf00efa86ad547e59e09d6bb77c699acb
> CPU threads: 2; OS: Windows 6.1; UI render: default; 
> Locale: en-US (en_US); Calc: group threaded

Great! I guess this can be set to NEW instead of UNCONFIRMED.
Comment 6 Xisco Faulí 2019-01-23 12:34:09 UTC
(In reply to Zsolt from comment #3)
> Created attachment 148458 [details]
> Examples of files with pasted text where the bug happens (one was pasted
> unformatted)
> 
> (In reply to Dieter Praas from comment #2)
> > I couldn't reproduce it with the following steps:
> > 
> > 1. Open writer
> > 2. Type "ffmpeg -i  -acodec [...]" (two spaces after -i)
> > 3. Save as html file
> > 4. Close and reopen the file
> > 
> > Actual result: two spaces after -i
> > 
> > Zsolt, please correct my steps, if they're wrong
> > 
> > Version: 6.1.4.2 (x64)
> > Build-ID: 9d0f32d1f0b509096fd65e0d4bec26ddd1938fd3
> > CPU-Threads: 4; BS: Windows 10.0; UI-Render: Standard; 
> > Gebietsschema: de-DE (de_DE); Calc: group threaded
> 
> Me neither, interesting. However, if I copy-paste the text from my document,
> the bug happens. Even if I paste as plain text, which is even more mysterious.
> I'll attach sample files.

Which document do you mean? In both attached documents, there's no double space after the -i. Please clarify.
Comment 7 Zsolt 2019-01-23 15:26:39 UTC
(In reply to Xisco Faulí from comment #6)
> 
> Which document do you mean? In both attached documents, there's no double
> space after the -i. Please clarify.

Of course there isn't. Because it's impossible to save two spaces. You can only add them and try to save, then observe that the spaces aren't there when you re-open the document.

For the other part of the issue just copy the entire line of the document and see that you get prompted to save when you try to close it, for apparently no reason.
Comment 8 Buovjaga 2019-01-26 20:33:41 UTC
It is not LibreOffice butchering your text - it is HTML! You need to use non-breaking spaces. Type the two spaces with: Ctrl-Shift-space
You will notice they appear as grey boxes in LibreOffice.
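
For text that is generated or cleaned up outside Writer, the same idea can be sketched in a few lines of Python. This is only an illustration, not anything LibreOffice does internally; the function name protect_space_runs is made up for this example. It keeps the first space of each run and turns the remaining ones into U+00A0 NO-BREAK SPACE:

```python
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def protect_space_runs(text: str) -> str:
    """Keep the first space of every run of 2+ spaces, make the rest NBSP.

    Hypothetical helper for illustration only.
    """
    def fix(match: re.Match) -> str:
        run = match.group(0)
        return " " + NBSP * (len(run) - 1)

    return re.sub(r" {2,}", fix, text)


line = "ffmpeg -i  -acodec [...]"
protected = protect_space_runs(line)
# Show the NBSP as an HTML entity so it is visible in the output.
print(protected.replace(NBSP, "&nbsp;"))  # ffmpeg -i &nbsp;-acodec [...]
```

Note that a no-break space is a different character from a regular space, so a command line copied out of such a document may need the no-break spaces converted back before a shell treats them as separators.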
Comment 9 Zsolt 2019-01-27 01:36:54 UTC
(In reply to Buovjaga from comment #8)
> It is not LibreOffice butchering your text - it is HTML! You need to use
> non-breaking spaces. Type the two spaces with: Ctrl-Shift-space
> You will notice they appear as grey boxes in LibreOffice.

Even if this is true, it doesn't explain LibO asking to save when no changes to the document occurred.
Also, how do you explain the spaces being retained when you do the same in a new document?
Comment 10 Buovjaga 2019-01-27 12:59:39 UTC
(In reply to Zsolt from comment #9)
> (In reply to Buovjaga from comment #8)
> > It is not LibreOffice butchering your text - it is HTML! You need to use
> > non-breaking spaces. Type the two spaces with: Ctrl-Shift-space
> > You will notice they appear as grey boxes in LibreOffice.
> 
> Even if this is true, it doesn't explain LibO asking to save when no
> changes to the document occurred.
> Also, how do you explain the spaces being retained when you do the same in a
> new document?

Ok, I took a closer look at this.

LibreOffice (confusingly) has two different methods to save html: File - Save as AND File - Export... (XHTML). They produce different results (yes, silly).

If you go via Save as, both spaces are saved as regular spaces, which means viewing the text in a browser will *render them* as a single space. The two spaces are retained inside the .html markup, but browsers will butcher the rendering as they please.
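
One way to tell the two cases apart (spaces lost in the file versus spaces collapsed at display time) is to look at the raw markup instead of the rendered view. A minimal Python sketch, with a hypothetical helper find_space_runs and a made-up sample string standing in for the saved file:

```python
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def find_space_runs(markup: str):
    """Return (offset, [character names]) for each run of 2+ space-like chars.

    Hypothetical helper: only literal SPACE and NO-BREAK SPACE characters are
    considered, not entities such as "&nbsp;" written out in the markup.
    """
    runs = []
    for match in re.finditer(r"[ \u00a0]{2,}", markup):
        names = ["NBSP" if ch == NBSP else "SPACE" for ch in match.group(0)]
        runs.append((match.start(), names))
    return runs


# Assumed sample; a real check would read the saved .html file instead.
sample = "<p>ffmpeg -i  -acodec [...]</p>"
for offset, names in find_space_runs(sample):
    print(offset, names)
```

A real saved file will probably also report indentation between tags, so only the runs inside the paragraph text are of interest.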

If you go via Export..., the first space will be saved as a regular space, but the second one will be saved as the Unicode character NO-BREAK SPACE: http://www.fileformat.info/info/unicode/char/00a0/index.htm
So the method of Export... is a bit sneaky, as it does not use the html markup &nbsp; as one would expect. The browser does respect the NO-BREAK SPACE character and renders the result as two spaces.
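
The difference between the two outputs can be imitated with a rough Python model of how a browser collapses whitespace in ordinary text (CSS white-space: normal). It is a simplification with assumed names (render_normal_flow and the two sample strings), not browser or LibreOffice code. The point is that U+00A0 is not among the characters that get collapsed:

```python
import re


def render_normal_flow(markup_text: str) -> str:
    """Collapse runs of collapsible whitespace to a single space.

    The character class is spelled out on purpose: Python's \\s would also
    match U+00A0, which browsers do not collapse.
    """
    return re.sub(r"[ \t\n\r\f]+", " ", markup_text)


save_as = "ffmpeg -i  -acodec"       # two regular spaces (Save as)
export = "ffmpeg -i \u00a0-acodec"   # regular space + NO-BREAK SPACE (Export)

print(repr(render_normal_flow(save_as)))  # 'ffmpeg -i -acodec'
print(render_normal_flow(export) == export)  # True: nothing was collapsed
```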
Comment 11 Zsolt 2019-01-27 17:49:50 UTC
(In reply to Buovjaga from comment #10)
> Ok, I took a closer look at this.
> 
> LibreOffice (confusingly) has two different methods to save html: File -
> Save as AND File - Export... (XHTML). They produce different results (yes,
> silly).
> 
> If you go via Save as, both spaces are saved as regular spaces, which means
> viewing the text in a browser will *render them* as a single space. The two
> spaces are retained inside the .html markup, but browsers will butcher the
> rendering as they please.

Hmmm... in Comment 2 Dieter Praas used simple save as, yet LibreOffice showed the extra spaces, so why did that happen?

The spaces were retained for me too when I tried then, but now I realize it was only because I pasted the text instead of typing it.
If I paste the (plain?) text of my example from the bug description (without the quotes) and save normally, the spaces are shown, not only by LibreOffice but also by the browser.
(Looking at the files, it looks pretty different when I paste from here: the text is in some sort of <pre> tag instead of <p>, so I guess some sort of formatting is copied and that helps.)


> 
> If you go via Export..., the first space will be saved as a regular space,
> but the second one will be saved as the Unicode character NO-BREAK SPACE:
> http://www.fileformat.info/info/unicode/char/00a0/index.htm
> So the method of Export... is a bit sneaky, as it does not use the html
> markup &nbsp; as one would expect. The browser does respect the NO-BREAK
> SPACE character and renders the result as two spaces.


And why do you think the uncalled-for save prompt appears on exit when I copy text in the example files?
Nothing is changed yet I'm prompted to save. This doesn't seem right at all.
Comment 12 Zsolt 2019-01-27 17:50:25 UTC
(In reply to Dieter Praas from comment #2)
> I couldn't reproduce it with the following steps:
> 
> 1. Open writer
> 2. Type "ffmpeg -i  -acodec [...]" (two spaces after -i)
> 3. Save as html file
> 4. Close and reopen the file
> 
> Actual result: two spaces after -i
> 

Dieter, if you're still around can you still reproduce Libreoffice showing two normal spaces for a normal saved html file that's reopened?
Did you also paste text instead of actually typing?
Comment 13 Buovjaga 2019-01-27 18:08:38 UTC
(In reply to Zsolt from comment #11)
> Hmmm... in Comment 2 Dieter Praas used simple save as, yet LibreOffice
> showed the extra spaces, so why did that happen?
> 
> The spaces were retained for me too when I tried then, but now I realize it
> was only because I pasted the text instead of typing it.
> If I paste the (plain?) text of my example from the bug description (without
> the quotes) and save normally, the spaces are shown, not only by
> LibreOffice but also by the browser.
> (Looking at the files, it looks pretty different when I paste from here: the
> text is in some sort of <pre> tag instead of <p>, so I guess some sort
> of formatting is copied and that helps.)

Yes, <pre> elements are different: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/pre
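
A small Python sketch of that difference, using the standard html.parser module. The VisibleText class and the visible_text function are hypothetical names for this illustration, and the whitespace handling is a simplified model of browser rendering, not Writer's HTML import code:

```python
import re
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text, collapsing whitespace everywhere except inside <pre>."""

    def __init__(self):
        super().__init__()
        self.pre_depth = 0  # > 0 while inside at least one <pre> element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth:
            self.pre_depth -= 1

    def handle_data(self, data):
        if self.pre_depth == 0:
            # Ordinary text: runs of whitespace collapse to one space.
            data = re.sub(r"[ \t\n\r\f]+", " ", data)
        self.parts.append(data)


def visible_text(markup: str) -> str:
    parser = VisibleText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


print(repr(visible_text("<p>ffmpeg -i  -acodec</p>")))      # one space kept
print(repr(visible_text("<pre>ffmpeg -i  -acodec</pre>")))  # both spaces kept
```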

> And why do you think the uncalled for save prompt on exit appears when I
> copy in the example files?
> Nothing is changed yet I'm prompted to save. This doesn't seem right at all.

I do not reproduce the save prompt, if I copy from libo-space-bug.html or libo-space-bug-unformatted.html
Comment 14 Dieter 2019-01-27 21:26:19 UTC
(In reply to Zsolt from comment #12)
> (In reply to Dieter Praas from comment #2)
> > I couldn't reproduce it with the following steps:
> > 
> > 1. Open writer
> > 2. Type "ffmpeg -i  -acodec [...]" (two spaces after -i)
> > 3. Save as html file
> > 4. Close and reopen the file
> > 
> > Actual result: two spaces after -i
> > 
> 
> Dieter, if you're still around can you still reproduce Libreoffice showing
> two normal spaces for a normal saved html file that's reopened?
> Did you also paste text instead of actually typing?

I couldn't reproduce it when copying the text and pasting it with Ctrl+V.
I could reproduce it when copying and pasting as unformatted text.

I haven't followed the discussion in detail, so I can't say whether this is a bug (I would say yes) or not.
Comment 15 Buovjaga 2019-01-28 12:01:11 UTC
(In reply to Dieter Praas from comment #14)
> I couldn't reproduce it when copying the text and pasting it with Ctrl+V.
> I could reproduce it when copying and pasting as unformatted text.
> 
> I haven't followed the discussion in detail, so I can't say whether this is a
> bug (I would say yes) or not.

Just to be clear on what you did, can you confirm that you
1. Typed or pasted as unformatted text to a new Writer document: ffmpeg -i  -acodec
2. Did Save as - HTML
3. Did File - Reload or close, open again (after this it renders with a single space in Writer for me)
4. Copied the text
5. Pasted with Ctrl-V to a new Writer document (result: single space for me)
6. Pasted with Ctrl-Alt-Shift-V to a new Writer document (result: single space)

Arch Linux 64-bit
Version: 6.3.0.0.alpha0+
Build ID: 68bdea37d79793bc8dff4672c2d360be3554b041
CPU threads: 8; OS: Linux 4.20; UI render: default; VCL: gtk3; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 28 January 2019

Arch Linux 64-bit
Version: 6.1.4.2
Build ID: 6.1.4-4
CPU threads: 8; OS: Linux 4.20; UI render: default; VCL: gtk3_kde5; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group threaded
Comment 16 Dieter 2019-01-28 13:01:25 UTC
(In reply to Buovjaga from comment #15)
> Just to be clear on what you did, can you confirm that you
> 1. Typed or pasted as unformatted text to a new Writer document: ffmpeg -i 
> -acodec
> 2. Did Save as - HTML
> 3. Did File - Reload or close, open again (after this it renders with a
> single space in Writer for me)
> 4. Copied the text
> 5. Pasted with Ctrl-V to a new Writer document (result: single space for me)
> 6. Pasted with Ctrl-Alt-Shift-V to a new Writer document (result: single
> space)

For comment 15 I did only steps 1 - 3, but it is also possible to reproduce steps 4 - 6
Comment 17 Buovjaga 2019-01-28 13:05:37 UTC
(In reply to Dieter Praas from comment #16)
> (In reply to Buovjaga from comment #15)
> > Just to be clear on what you did, can you confirm that you
> > 1. Typed or pasted as unformatted text to a new Writer document: ffmpeg -i 
> > -acodec
> > 2. Did Save as - HTML
> > 3. Did File - Reload or close, open again (after this it renders with a
> > single space in Writer for me)
> > 4. Copied the text
> > 5. Pasted with Ctrl-V to a new Writer document (result: single space for me)
> > 6. Pasted with Ctrl-Alt-Shift-V to a new Writer document (result: single
> > space)
> 
> For comment 15 I did only steps 1 - 3, but it is also possible to reproduce
> steps 4 - 6

Can you then attach such an HTML file that shows 2 spaces in Writer?
Comment 18 Dieter 2019-01-28 13:20:16 UTC
Created attachment 148709 [details]
html-file with two spaces

file after step 3 in comment 2.
Comment 19 Buovjaga 2019-01-28 13:24:22 UTC
(In reply to Dieter Praas from comment #18)
> Created attachment 148709 [details]
> html-file with two spaces
> 
> file after step 3 in comment 2.

You attached an ODT file. Please attach the HTML file.
Comment 20 Buovjaga 2019-01-28 13:26:00 UTC
(In reply to Dieter Praas from comment #18)
> Created attachment 148709 [details]
> html-file with two spaces
> 
> file after step 3 in comment 2.

Ok, no need to attach, I see what you did there: you have changed the style to preformatted text. Obviously the 2 spaces will be rendered then. So still no bug seen.