Bug 136013 - FILEOPEN Importing tsv/csv with no string delimiter causes whitespace only trailing column to corrupt
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 4.0.0.3 release (earliest affected)
Hardware: All All
Importance: low normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, regression
Depends on:
Blocks: CSV-Import
Reported: 2020-08-22 12:33 UTC by Andrew Crowe
Modified: 2020-12-15 11:43 UTC

See Also:
Crash report or crash signature:


Attachments
CSV file that triggers issue (975 bytes, text/plain)
2020-08-22 12:36 UTC, Andrew Crowe
Screenshot of initial import dialog display (69.79 KB, image/png)
2020-08-22 12:38 UTC, Andrew Crowe
Screenshot of import dialog after changing settings (74.25 KB, image/png)
2020-08-22 12:38 UTC, Andrew Crowe
Screenshot after file opens in calc (173.93 KB, image/png)
2020-08-22 12:39 UTC, Andrew Crowe

Description Andrew Crowe 2020-08-22 12:33:21 UTC
Description:
When importing a TSV or CSV file with no string delimiter set, if the final column consists only of whitespace, corrupt data is added to that column.

If the final column is empty, the row loads correctly. The file also loads correctly if the string delimiter is set to any character, even one that does not appear in the document.

One interesting behaviour: the CSV import dialog initially shows no corruption in the preview, but the corruption appears as soon as any option is changed.

Reproduced on versions 5.4, 6.4 and 7.0.

Steps to Reproduce:
1. Open a CSV/TSV file that has no string delimiters and whose trailing column consists only of whitespace
2. Turn off string delimiters in the import dialog box
3. Click OK
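
For anyone without access to the attachment, a file of the shape described in step 1 can be generated with a short Python script. This is only a sketch: the attached file's actual contents are not reproduced here, so the row values, SAMPLE_ROWS, build_sample and write_sample are hypothetical names and placeholders chosen for illustration.

```python
from pathlib import Path

# Hypothetical sample rows: the attached file's real contents are not shown
# in this report, so these values are placeholders. The last field of each
# row holds only spaces.
SAMPLE_ROWS = [
    ("1", "alpha", " "),
    ("2", "beta", "   "),
]


def build_sample(rows=SAMPLE_ROWS, delimiter="\t"):
    """Join fields with the delimiter and no string delimiters (no quotes)."""
    return "".join(delimiter.join(row) + "\n" for row in rows)


def write_sample(path, delimiter="\t"):
    """Write the sample as bytes so trailing whitespace is kept unchanged."""
    data = build_sample(delimiter=delimiter)
    Path(path).write_bytes(data.encode("utf-8"))
    return data
```

Calling write_sample("sample.tsv") gives a TSV variant and write_sample("sample.csv", delimiter=",") a CSV variant. Whether these placeholder rows trigger the corruption in the same way as the attachment has not been verified.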

Actual Results:
The right-hand column contains corrupt data

Expected Results:
The right-hand column is blank
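
As a point of comparison for the expected result, the sketch below parses the same kind of input with Python's standard csv module with quoting disabled. This is not LibreOffice's importer and says nothing about where the bug lies. It only illustrates that, with no string delimiter, a whitespace-only trailing field should come back as whitespace and therefore display as blank. The function names and sample text are assumptions made for illustration.

```python
import csv
import io


def parse_without_string_delimiter(text, delimiter="\t"):
    """Parse delimited text with quote handling switched off (csv.QUOTE_NONE)."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,
        quoting=csv.QUOTE_NONE,
    )
    return list(reader)


def trailing_column_is_blank(rows):
    """Return True if the last field of every non-empty row is empty or whitespace."""
    return all(row[-1].strip() == "" for row in rows if row)


# Hypothetical input: two rows whose final field contains only spaces.
sample_text = "1\talpha\t \n2\tbeta\t   \n"
parsed = parse_without_string_delimiter(sample_text)
```

Here trailing_column_is_blank(parsed) holds, which matches the blank column expected in Calc.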


Reproducible: Always


User Profile Reset: Yes



Additional Info:
Version: 7.0.0.3 (x64)
Build ID: 8061b3e9204bef6b321a21033174034a5e2ea88e
CPU threads: 24; OS: Windows 10.0 Build 19041; UI render: Skia/Vulkan; VCL: win
Locale: en-GB (en_GB); UI: en-GB
Calc: CL
Comment 1 Andrew Crowe 2020-08-22 12:36:47 UTC
Created attachment 164559
CSV file that triggers issue
Comment 2 Andrew Crowe 2020-08-22 12:38:20 UTC
Created attachment 164560
Screenshot of initial import dialog display
Comment 3 Andrew Crowe 2020-08-22 12:38:55 UTC
Created attachment 164561
Screenshot of import dialog after changing settings
Comment 4 Andrew Crowe 2020-08-22 12:39:27 UTC
Created attachment 164562
Screenshot after file opens in calc
Comment 5 Justin L 2020-12-15 11:43:29 UTC
Confirmed. The key is to erase the double-quote in the string-delimiter box.

Seems to have worked in LO 3.6.
Bibisected with bibisect-linux-43all to get the range https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=a1ac2538e9b287444500618ab4d2f0f06c25cf34..19f4ebd8a54da0ae03b9cc8481613e5cd20ee1e7

Nothing in this range is an obvious culprit, but several suspicious commits involve ICU and libexttextcat.

Bad bibisect 43all commit: a67b874d60de1f1a44bef57a53a7b8a84db0ba58.