Bug 133077 - Pasting from Zoom only pastes first paragraph of multi-paragraph message
Summary: Pasting from Zoom only pastes first paragraph of multi-paragraph message
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.2 all versions
Hardware: All Windows (All)
Importance: medium normal
Assignee: Mike Kaganski
URL: https://forumooo.ru/index.php/topic,8130
Whiteboard: target:7.0.0
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-16 06:06 UTC by Mike Kaganski
Modified: 2020-05-16 07:57 UTC (History)
CC List: 0 users

See Also:
Crash report or crash signature:
Regression By:


Attachments
InsideClipboard capture of a multi-paragraph message copied from Zoom on Windows (1.46 KB, application/octet-stream)
2020-05-16 06:06 UTC, Mike Kaganski
An InsideClipboard capture of long (>2048 characters) text with three paragraphs separated by CRs (8.55 KB, application/octet-stream)
2020-05-16 07:02 UTC, Mike Kaganski

Description Mike Kaganski 2020-05-16 06:06:51 UTC
Created attachment 160884
InsideClipboard capture of a multi-paragraph message copied from Zoom on Windows

The attachment contains a multi-paragraph message copied to the Windows clipboard from a Zoom chat and captured using InsideClipboard:

ausmessen	измерить
scharf	острый
das Maßband	рулетка
die Tapete	обоина
berichten	сообщать, докладывать
tapezieren	оклеивать обоями
abreissen	обрывать
Stromleitung	электропроводка
überprüfen	проверять
Pfötchen	лапки
der Hirsch	олень
einstellen	нанять
gerade dabei	как раз собираться

When pasted into Writer, only the first part arrives (everything before the line starting with "scharf"), while pasting into Word, Notepad, or Chrome (as above) produces the full content. Word makes each line a separate *paragraph* rather than inserting line breaks.

The clipboard contains the text (in CF_UNICODETEXT, CF_TEXT, and CF_OEMTEXT formats) with lines separated by CRs (not by CRLFs or LFs). Zoom likely does this to distinguish breaks inside a message from breaks between messages, which would be separated by CRLFs. LO (on Windows) seems to drop everything from a stand-alone CR up to the following CRLF.
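
To check which separators a captured clipboard text actually uses, the breaks can be counted by kind. This is a minimal Python sketch, not LibreOffice code: the function name count_line_breaks is hypothetical, and the sample string only imitates the first lines of the Zoom message.

```python
import re

def count_line_breaks(text):
    """Count CRLF pairs, stand-alone CRs and stand-alone LFs in text."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    kinds = {"\r\n": "CRLF", "\r": "CR", "\n": "LF"}
    # The alternation tries CRLF first, so a pair is never counted
    # as one CR plus one LF.
    for match in re.finditer(r"\r\n|\r|\n", text):
        counts[kinds[match.group()]] += 1
    return counts

# Hypothetical sample: three lines separated by stand-alone CRs.
sample = "ausmessen\tизмерить\rscharf\tострый\rdas Maßband\tрулетка"
print(count_line_breaks(sample))
```

For content like the Zoom message, only the CR counter is expected to be non-zero.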

We should probably treat stand-alone CRs as CRLFs, to match how other Windows applications handle them.
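
The intended behavior can be sketched as a split that treats CRLF, a stand-alone CR, and a stand-alone LF as the same paragraph break. This is only an illustration in Python under that assumption; split_paragraphs is a hypothetical name and has nothing to do with the actual Writer import code.

```python
import re

def split_paragraphs(text):
    """Split text into paragraphs; CRLF, lone CR and lone LF all break."""
    # CRLF is listed first so the pair yields one break, not two.
    return re.split(r"\r\n|\r|\n", text)

# Hypothetical sample with mixed separators.
print(split_paragraphs("ausmessen\rscharf\r\ndas Maßband\ndie Tapete"))
```

An explicit pattern is used instead of str.splitlines() because the latter also breaks on several other characters (form feed, U+2028, and so on), which is more than this report asks for.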

Tested with Version: 7.0.0.0.alpha1+ (x64)
Build ID: 8a227630f1b4c8d592f33231977febb944be6a8e
CPU threads: 12; OS: Windows 10.0 Build 18363; UI render: default; VCL: win; 
Locale: ru-RU (ru_RU); UI: en-US
Calc: CL

Ref: https://forumooo.ru/index.php/topic,8130
Comment 1 Mike Kaganski 2020-05-16 07:02:12 UTC
Created attachment 160886
An InsideClipboard capture of long (>2048 characters) text with three paragraphs separated by CRs

This specially crafted clipboard content has three lines separated by CRs: the first line is short, the length of the second was chosen so that reading the clipboard contents into the internal buffer (2048 Unicode characters) ends exactly at its trailing CR, and the last line is also short. The test demonstrates that a CR on the buffer boundary is treated as a paragraph break, while a CR in the middle of the buffer is handled incorrectly.
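
A rough way to reproduce the shape of this test, and the handling one would expect, is sketched below in Python. It is not the code of SwASCIIParser::ReadChars and not the committed fix. The buffer size of 2048 comes from the description above; the function names, the filler text, and the chunking scheme are assumptions made for illustration.

```python
def iter_chunks(text, size):
    """Yield consecutive slices of text, each at most size characters."""
    for start in range(0, len(text), size):
        yield text[start:start + size]

def split_paragraphs_chunked(text, buffer_size=2048):
    """Split on CRLF, lone CR and lone LF while reading in fixed chunks."""
    paragraphs = []
    current = []        # characters of the paragraph being built
    pending_cr = False  # previous chunk ended with a CR
    for chunk in iter_chunks(text, buffer_size):
        i = 0
        if pending_cr and chunk.startswith("\n"):
            i = 1  # this LF completes a CRLF split across the boundary
        pending_cr = False
        while i < len(chunk):
            ch = chunk[i]
            if ch == "\r":
                paragraphs.append("".join(current))
                current = []
                if i + 1 == len(chunk):
                    pending_cr = True  # the next chunk may start with LF
                elif chunk[i + 1] == "\n":
                    i += 1  # consume the LF of a CRLF pair
            elif ch == "\n":
                paragraphs.append("".join(current))
                current = []
            else:
                current.append(ch)
            i += 1
    paragraphs.append("".join(current))
    return paragraphs

# Build text whose second CR is the last character of the first chunk:
# 5 ("first") + 1 (CR) + 2041 (filler) + 1 (CR) = 2048 characters.
crafted = "first" + "\r" + "x" * 2041 + "\r" + "third"
result = split_paragraphs_chunked(crafted)
print(len(result), [len(p) for p in result])
```

With this approach a CR yields the same paragraph break whether it falls in the middle of a chunk or on the boundary, and a CRLF split across two chunks still counts as a single break.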

The problematic code is in SwASCIIParser::ReadChars (the switch case handling 0x0d).
Comment 2 Commit Notification 2020-05-16 07:57:50 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/f33d2d8b5b7ae022eebfa3e22deac71351b3f4e1

tdf#133077: fix lone CR handling in plain text clipboard on Windows

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.