Bug 57464 - RTF Copying across paragraphs creates broken ANSI content
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.0.0.0.alpha0+ Master
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: RTF-Paste RTF-Paragraph
Reported: 2012-11-23 20:52 UTC by Urmas
Modified: 2022-10-24 03:42 UTC

See Also:
Crash report or crash signature:


Attachments
Sample (8.13 KB, application/vnd.oasis.opendocument.text)
2012-12-12 06:25 UTC, Urmas
rtf tft copied - screenshot (143.13 KB, image/jpeg)
2013-02-02 19:57 UTC, headsup
RTF dumps from the Clipboard (1.79 KB, application/zip)
2013-02-02 23:21 UTC, Urmas

Description Urmas 2012-11-23 20:52:41 UTC
1. Copy a non-ASCII text spanning several paragraphs in Writer
2. Inspect the created RTF data.

Only the first paragraph has correct ANSI bytes written. Subsequent paragraphs have broken values, apparently produced by taking the low byte of each character's Unicode value.
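
For illustration, the suspected truncation can be sketched in Python. The code page cp1251 is an assumption (the report does not name one), and both helper names are hypothetical; this is not LibreOffice's exporter code.

```python
# Sketch only: contrasts proper ANSI bytes with low-byte truncation.
# Assumption: cp1251 stands in for the ANSI code page of the text.

def ansi_escapes(text, codepage="cp1251"):
    # RTF hex escapes built from the real code-page bytes.
    return "".join("\\'%02x" % b for b in text.encode(codepage))

def truncated_escapes(text):
    # RTF hex escapes built from only the low byte of each code point,
    # which is what the broken paragraphs appear to contain.
    return "".join("\\'%02x" % (ord(ch) & 0xFF) for ch in text)

sample = "Образец"
print(ansi_escapes(sample))       # proper bytes, all above 0x7f
print(truncated_escapes(sample))  # truncated bytes, all below 0x7f
```

With this sample the truncated values fall in the ASCII range, so a reader of the RTF would show Latin characters and digits instead of Cyrillic.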
Comment 1 Urmas 2012-11-23 21:14:56 UTC
Apparently, to reproduce it, the font of the copied text must have been changed since the last save (or since the new document was created). Text copied from saved and reopened documents gets the proper values.
Comment 2 Joel Madero 2012-12-11 16:47:20 UTC
Urmas: Can you attach a document with several paragraphs of non-ASCII text? That will help tremendously in triaging and fixing the bug.
Comment 3 Urmas 2012-12-12 06:25:29 UTC
Created attachment 71376
Sample

See the garbage \'3x-\'4x in copied RTF for the text with newly applied font.
Comment 4 Jorendc 2013-01-23 17:34:18 UTC
(In reply to comment #0)
> 1. Copy a non-ASCII text spanning several paragraphs in Writer
> 2. Inspect the created RTF data.
> 
> Only the first paragraph has correct ANSI bytes written. Subsequent
> paragraphs have broken values, apparently produced by taking a lower byte
> from the character's Unicode value.

Do you have an example of non-ASCII text so I can test?

Kind regards,
Joren
Comment 5 Urmas 2013-01-23 19:41:08 UTC
See attachment.
Comment 6 Jorendc 2013-01-23 19:54:12 UTC
(In reply to comment #3)
> Created attachment 71376
> Sample
> 
> See the garbage \'3x-\'4x in copied RTF for the text with newly applied font.

Can't see any garbage when I follow your steps.

Kind regards,
Joren
Comment 7 Urmas 2013-01-24 01:52:51 UTC
If the last two lines are changed, or only the second line is, this is copied:

азец текста
Образец 2
_1@075F

If all lines are changed, this is copied:

Образец текста
_1@075F 2
_1@075F 3
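
The pasted strings above are consistent with low-byte truncation, which a short Python sketch can check by putting the lost high byte back. It assumes the missing high byte is 0x04 (the Cyrillic block) and that the leading "_" stands in for the non-printable control character 0x1E; neither is stated in the thread, and the function name is hypothetical.

```python
# Sketch only: undo a suspected low-byte truncation of Cyrillic text.
# Assumption: every character originally had 0x04 as its high byte.
CYRILLIC_HIGH_BYTE = 0x0400

def restore(garbled):
    # OR the assumed high byte back onto each surviving low byte.
    return "".join(chr(CYRILLIC_HIGH_BYTE | ord(ch)) for ch in garbled)

# "\x1e" replaces the "_" shown in the pasted text (an assumption).
print(restore("\x1e1@075F"))
```

If the assumptions hold, the garbled word maps back to the first word of the sample text.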
Comment 8 Joel Madero 2013-02-02 19:49:47 UTC
Urmas - I apologize for putting this back in NEEDINFO but at this point we've had 3 developers and 4 QA people look at this and no one can reproduce with what's been given. If you can give VERY explicit steps to reproduce we'll look again.

One developer was guessing and checking at what was being said and at one point thought he saw something but then tried to do it again and failed. We've tried on multiple distributions and operating systems. We've also tested on multiple languages.

Please give the exact steps and reopen this bug. Thanks for your patience with this; we want to track down the issue, but first someone needs to reproduce it. With 7 of us trying and failing, we need more explicit steps.

As always, thanks for helping us make LibreOffice a better product for everyone.
Comment 9 headsup 2013-02-02 19:56:18 UTC
Urmas, 

I looked at this at Joel's request and tested on Windows 8 Pro, 32-bit (x86), with 4.0.0.2 and 4.0.0.3. I was not able to replicate the issue, possibly because I did not understand exactly how to replicate it. I selected all the text, changed the font (to Arial) and then copied (Ctrl+V) several times. I have attached a screenshot.

Let us know if there are different steps needed to replicate the issue. Please test on at least 4.0.0.2.

Thanks.
Comment 10 headsup 2013-02-02 19:57:31 UTC
Created attachment 74104
rtf tft copied - screenshot
Comment 11 Urmas 2013-02-02 23:21:07 UTC
Created attachment 74108
RTF dumps from the Clipboard

I can reproduce it just by changing the font, for example to Lucida Console as in this example. It was originally detected when pasting text into Writer from other apps, but I cannot reproduce that reliably.

"before" is copied from a new document, "after" after that document was saved and re-opened.
Comment 12 Miklos Vajna 2013-02-03 16:18:53 UTC
Hi Urmas,

I tried to reproduce this, but couldn't find the exact steps and didn't figure them out myself. I opened the document, copied the last two lines, pasted as RTF, and all was fine.

What else should I *exactly* (please, step by step) do to reproduce this?

Thanks,

Miklos
Comment 13 QA Administrators 2013-09-24 01:58:55 UTC Comment hidden (obsolete)
Comment 14 Urmas 2013-09-24 04:29:50 UTC
As I said: Enter several paragraphs, select them, change font, copy.

The new font will be exported with "\fcharset1", and the ANSI characters after the first paragraph will be Unicode values truncated to 1 byte.
Comment 15 QA Administrators 2015-04-19 03:23:04 UTC Comment hidden (obsolete)
Comment 16 QA Administrators 2016-09-20 09:24:33 UTC Comment hidden (obsolete)
Comment 17 Yousuf Philips (jay) (retired) 2017-07-09 19:30:13 UTC
Setting to NEEDINFO, as nobody has confirmed this, clear steps to reproduce it aren't available, and I'm unable to reproduce it with the steps in comment 1.
Comment 18 Urmas 2017-07-09 20:51:24 UTC
The necessary steps to reproduce it have been given.
The bug still persists.
If no one has found the time in the 5 years since it was reported, it is not my problem.
Comment 19 Yousuf Philips (jay) (retired) 2017-07-10 16:41:41 UTC
(In reply to Urmas from comment #18)
> The necessary steps to reproduce it have been given.
> The bug still persists.
> If no one has found the time in the 5 years since it was reported, it is not
> my problem.

If nobody is able to reproduce it with the steps you provided, please provide better steps, and don't set it to NEW unless others can reproduce it.
Comment 20 Buovjaga 2017-07-10 19:42:03 UTC
I did this:

1. Open attachment 71376
2. Select all and copy
3. Inspect clipboard with http://www.nirsoft.net/utils/inside_clipboard.html (I guess you could also do it with xclip)
4. Select all and change font to Lucida Console
5. Copy
6. Inspect clipboard, check the diff
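
The "check the diff" step can be sketched with Python's standard difflib, assuming the two clipboard dumps have already been saved as text. The function name and the before.rtf/after.rtf labels are hypothetical.

```python
import difflib

def changed_lines(before, after):
    # Unified diff with no context lines, so only real changes remain.
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before.rtf", tofile="after.rtf", lineterm="", n=0))
    # Skip the two file-header lines, then keep added/removed lines only.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]

# Tiny stand-in dumps; real input would be the two RTF clipboard dumps.
before_dump = "\\par first\n\\u1054\\'3f\n"
after_dump = "\\par first\n\\u1054\\'84\\'4f\n"
for line in changed_lines(before_dump, after_dump):
    print(line)
```

Lines that are identical in both dumps are dropped, which keeps the long paragraph-formatting runs out of the way when only the character escapes differ.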

It appears "the garbage \'3x-\'4x" is manifesting. I will show relevant lines.

Before font change:
\par \pard\plain \s0\widctlpar\hyphpar0\cf0\kerning1\dbch\af5\langfe2052\dbch\af8\afs24\alang1081\loch\f3\hich\af3\fs24\lang1035{\rtlch \ltrch\loch
\u1054\'3f\u1073\'3f\u1088\'3f\u1072\'3f\u1079\'3f\u1077\'3f\u1094\'3f 2}
\par \pard\plain \s0\widctlpar\hyphpar0\cf0\kerning1\dbch\af5\langfe2052\dbch\af8\afs24\alang1081\loch\f3\hich\af3\fs24\lang1035{\rtlch \ltrch\loch
\u1054\'3f\u1073\'3f\u1088\'3f\u1072\'3f\u1079\'3f\u1077\'3f\u1094\'3f 3}
}

After font change:
\par \pard\plain \s0\widctlpar\hyphpar0\cf0\kerning1\dbch\af6\langfe2052\dbch\af9\afs24\alang1081\loch\f3\hich\af3\fs24\lang1035{\rtlch \ltrch\loch\loch\f5\hich\af5
\uc2 \u1054\'84\'4f\u1073\'84\'71\u1088\'84\'82\u1072\'84\'70\u1079\'84\'78\u1077\'84\'75\u1094\'84\'88 2\uc1 }
\par \pard\plain \s0\widctlpar\hyphpar0\cf0\kerning1\dbch\af6\langfe2052\dbch\af9\afs24\alang1081\loch\f3\hich\af3\fs24\lang1035{\rtlch \ltrch\loch\loch\f5\hich\af5
\uc2 \u1054\'84\'4f\u1073\'84\'71\u1088\'84\'82\u1072\'84\'70\u1079\'84\'78\u1077\'84\'75\u1094\'84\'88 3\uc1 }
}
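
To compare the two dumps character by character, the `\uN` escapes and their `\'hh` fallback bytes can be decoded with a small Python sketch. It is a simplification: it ignores the `\ucN` byte count and takes every `\'hh` that directly follows a `\uN` as its fallback. The codec names passed in are assumptions; in particular, Shift-JIS for the second dump is a guess based on the byte values, not something stated in this report.

```python
import re

# One "\uN" escape followed by zero or more "\'hh" fallback bytes.
ESCAPE = re.compile(r"\\u(-?\d+)((?:\\'[0-9a-fA-F]{2})*)")
HEX_BYTE = re.compile(r"\\'([0-9a-fA-F]{2})")

def decode_run(run, fallback_codec):
    """Return (text from the \\uN values, text from the fallback bytes)."""
    unicode_text, fallback_text = [], []
    for code, fallback in ESCAPE.findall(run):
        value = int(code)
        # RTF stores \uN as a signed 16-bit number.
        unicode_text.append(chr(value + 65536 if value < 0 else value))
        raw = bytes(int(h, 16) for h in HEX_BYTE.findall(fallback))
        fallback_text.append(raw.decode(fallback_codec, errors="replace"))
    return "".join(unicode_text), "".join(fallback_text)

before = r"\u1054\'3f\u1073\'3f\u1088\'3f\u1072\'3f\u1079\'3f\u1077\'3f\u1094\'3f 2"
after = (r"\uc2 \u1054\'84\'4f\u1073\'84\'71\u1088\'84\'82\u1072\'84\'70"
         r"\u1079\'84\'78\u1077\'84\'75\u1094\'84\'88 2\uc1 ")

print(decode_run(before, "cp1252"))    # assumed codec for the 1-byte fallback
print(decode_run(after, "shift_jis"))  # assumed codec for the 2-byte fallback
```

In both dumps the `\uN` values spell the same word; only the fallback bytes differ, being question marks before the font change and two-byte sequences after it.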

Version: 6.0.0.0.alpha0+ (x64)
Build ID: e0f67add2ec56706ce06a03572535266f21c0303
CPU threads: 4; OS: Windows 6.19; UI render: default; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2017-06-27_23:04:56
Locale: fi-FI (fi_FI); Calc: group
Comment 21 QA Administrators 2018-10-23 02:49:00 UTC Comment hidden (obsolete)
Comment 22 QA Administrators 2020-10-23 04:15:03 UTC Comment hidden (obsolete)
Comment 23 QA Administrators 2022-10-24 03:42:18 UTC
Dear Urmas,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug