Bug 151211 - EDITING paste tab separated unformatted text, unquoted field with embedded quote character gets all following data up to next quote or end of line appended
Summary: EDITING paste tab separated unformatted text, unquoted field with embedded qu...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
4.3 all versions
Hardware: All All
: medium normal
Assignee: Eike Rathke
URL:
Whiteboard: target:7.5.0 target:7.4.3
Keywords:
Depends on:
Blocks:
 
Reported: 2022-09-28 10:09 UTC by Maximilian Kohler
Modified: 2022-10-02 18:45 UTC (History)
1 user (show)

See Also:
Crash report or crash signature:
Regression By:


Attachments
text import settings (45.02 KB, image/jpeg)
2022-09-30 12:29 UTC, Maximilian Kohler
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Maximilian Kohler 2022-09-28 10:09:55 UTC
The data was received with Google forms and stored automatically in google sheets. I downloaded the sheets and try to paste unformatted into Libre. Everything after Column N gets added to N. This only happens for specific entries (not sure which/why). 

This doesn't happen in either Excel or Google sheets. This is unique to libreoffice calc. 

I can copy the row directly from google sheets or excel and paste it (not unformatted, just regular ctrl+v) into libre fine. 

It has nothing to do with libre's interpretation of "paste unformatted". IE: I can paste the data into Notepad to remove formatting, then copy and paste (normal ctrl+v) into libre and the same thing happens. 

Here's an example of the problematic text. Simply copy and paste this into libre calc. It should end at CT, but ends at N. I've anonymized the data, it's not private/identifying. 


7/19/2022 11:31:09		I certify that the foregoing is true, correct, and complete. And if I am selected I could be asked for medical reports to confirm my statements. I understand that the accuracy, truthfulness, and completeness of my answers is important for patient safety.	Michael Peter	mp1726@gmail.com	TikTok		Student	Type 2, Type 3	18-25	USA	Ohio	Male	5'9"	170	Not applicable (skip to next question)	Around 15%	I have taken antibiotics more than 1 year ago	I've only taken antibiotics a few times in my whole life	Almost always the same type	Once per day	I run and do bodyweight exercises daily. In highschool I ran XC, Swam and played Lacrosse. I Played D3 2 years ago but quit last year because it was to much of a time requirement and interfered with school. I still stay active and like to go on 8 mile runs every week to stay fit.	College	Daily or close to daily	Excellent	Vaginal birth	Not sure	Rarely or never	Rarely or never	Rarely or never	No problems	I eat lots of vegetables, fruits, and chicken. I eat sweets such as cake or candy maybe once a month. Never eat fast food and cook everything myself. I only drink water and average between half a gallon and a gallon a day (have been doing this for ~4-5 years from highschool).	No	No	No	No	Yes		Gates Mills	46640	No		No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	No	Yes	No	No	No	None	5-10	Never or very rarely	N/A	Under 30 minutes	Excellent	Rarely	No	No	No	No	No	No	No	No	Never	No	No	Less than once a year	Never	Braces, No issues or work done, Took off braces ~3 years ago. Never any cavities or anything.	No problems	Excellent	Average	Excellent	Sometimes	No	No	No	No	All healthy	White/Caucasian	12/18/2012		I understand.
Comment 1 Maximilian Kohler 2022-09-28 10:23:15 UTC
I think this is actually major or critical importance (but it won't let me change it) since there is no indication the data is not being pasted correctly. And if you're pasting over existing data then it all looks normal. I had to go back through my data and do tedious manual searches and corrections.
Comment 2 Eike Rathke 2022-09-28 10:32:25 UTC
Please *attach* a file with the sample text; as you can see line wrapping in a pasted comment breaks data so it can't be used to test this.
Comment 3 Eike Rathke 2022-09-28 10:37:45 UTC
> I downloaded the sheets and try to paste unformatted into Libre.
Downloaded how, in what format? Sounds like CSV? And from what application are you copying what you are trying to paste?
Comment 4 Maximilian Kohler 2022-09-29 10:35:58 UTC
No need to attach document. If you copy-paste directly from here into libre/excel it won't work, but if you paste into Notepad first, then it will work and you'll see the same behavior where it will paste fine into excel but not libre. 

>Downloaded how, in what format? Sounds like CSV? And from what application are you copying what you are trying to paste?

Downloading from google sheets, .xlsx or .csv both have the same result. I was unable to download into .ODS.

I'm copy-pasting a row, or multiple rows.
Comment 5 Eike Rathke 2022-09-29 15:55:10 UTC
Ok, if you think that the pasted paragraph above is *exactly* your original data..
And with that I can reproduce even copied from the browser here.
It contains apparently tabs as separators, and one field is 5'9" note the trailing " double quote character, which in CSV/TSV would a) have to be doubled and b) the field enclosed in extra double quotes; it should be "5'9""" here.
So first hand this is bad generated data.
Probably Calc is taking that as a start of a quote and looking for an end quote where there is none, hence all the remainder ends up in the same cell.
We are already working around all sorts of bad generated data (which actually may be the fault here), maybe this could be caught as well.
Comment 6 Eike Rathke 2022-09-29 19:40:26 UTC
So, the shortest reproducer of the above data is (I hope the tab between those two is preserved here)

5'9"	170
Comment 7 Eike Rathke 2022-09-29 20:59:43 UTC
To paste the raw data correctly, from the Paste Special (Shift+Ctrl+V) dialog for one line select 'Use text import dialog' (which comes up automatically for unformatted text of multiple lines) and in that dialog delete the 'String delimiter' " character.
Comment 8 Eike Rathke 2022-09-29 23:03:32 UTC
So this is actually not about CSV/TSV but the special handling of unformatted text pasted as TSV from clipboard, which is similar but a special case taking a different code path with different prerequisites.
Comment 9 Maximilian Kohler 2022-09-30 12:29:29 UTC
Created attachment 182752 [details]
text import settings

I confirm that "Paste Special dialog -> text import -> separated by tabs" works. Though it's tedious if we're needing to frequently use this, so it would be nice if "paste unformatted" detected it automatically like Excel and Google Sheets do. 

Also, see image. There's no way to "delete the 'String
delimiter' " character". 

I tested the "format quoted field as text" option, thinking that if I added double quotes around 170 for example, then it would import the quotes as well. But it didn't do that. Checked or not, it will not import double quotes around 170 (column O). 

Ah, I see that I CAN delete the string delimiter, and that fixes the 2nd issue mentioned above as well. That's very non-obvious since drop down menus are typically not editable. IMO there should be a "none" option in the drop down menu.
Comment 10 Eike Rathke 2022-10-01 21:08:35 UTC
It is a combobox, one can select values or edit the field, i.e. delete the character or type another one.
Comment 11 Commit Notification 2022-10-02 15:08:22 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/cf777cfcb22647b1f2d6ace307fbcc4f6d2cca30

Resolves: tdf#125110 tdf#151211 Disentangle the convoluted CSV/TSV-clip import

It will be available in 7.5.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Eike Rathke 2022-10-02 15:09:20 UTC
Pending review https://gerrit.libreoffice.org/c/core/+/140874 for 7-4
Comment 13 Commit Notification 2022-10-02 18:45:24 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-7-4":

https://git.libreoffice.org/core/commit/9af7a8d60596fc59f366a0c3e94489ff8fc106aa

Resolves: tdf#125110 tdf#151211 Disentangle the convoluted CSV/TSV-clip import

It will be available in 7.4.3.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.