Bug 32703 - CSV import could ignore leading spaces if the field content without them is quoted.
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 3.3.0 RC1
Hardware: All All
Importance: medium enhancement
Assignee: Eike Rathke
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 38637 39868
Reported: 2010-12-28 09:18 UTC by Ken Ward
Modified: 2012-04-16 04:53 UTC (History)
0 users

See Also:
Crash report or crash signature:


Attachments
The import screen, showing the quoted string field being separated into three fields by embedded commas. (57.63 KB, image/jpeg)
2010-12-28 09:18 UTC, Ken Ward
Description Ken Ward 2010-12-28 09:18:27 UTC
Created attachment 41489 [details]
The import screen, showing the quoted string field being separated into three fields by embedded commas.

Found in:

LibreOffice 3.3.0 
OOO330m9 (Build:1)
libreoffice-build 3.2.99.2

CSV import into calc:

Options: 
See attachment libre_csv_bug1.jpg

Header line was:
"NAME", "ID", "VARIANT_NAME", "VARIANT_ID", "PARENT_ID", "INHERITS_FROM_ID", "TYPE", "DESCRIPTION", "MOD_DATE", "MOD_TIME", "REVISION", ...

Input line was:
"parm1", "82", "SFT2", "2", "58", "NA", "SPEC", "overridden in Part 4, LevelB2, SFT2", "20101228", "09:58:23", "3", ...
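
For reference, the effect can be reproduced outside Calc. The sketch below uses Python's standard csv module (not LibreOffice code) on a shortened version of the input line above; the variable names are illustrative only. With default settings the blank after each comma turns the following quote into an ordinary character, so the embedded commas split the field. Passing skipinitialspace=True gives the lenient reading that the report asks for.

```python
import csv
import io

# Shortened version of the reported input line (assumption: the elided
# trailing fields are not needed to show the effect).
line = '"parm1", "82", "SFT2", "overridden in Part 4, LevelB2, SFT2", "3"\n'

# Default dialect: a quote that follows a blank is not treated as an
# opening quote, so the commas inside the description split the field.
default_row = next(csv.reader(io.StringIO(line)))

# Lenient reading: blanks directly after the separator are skipped,
# so the quote is recognised and the description stays in one field.
lax_row = next(csv.reader(io.StringIO(line), skipinitialspace=True))

print(len(default_row), default_row)
print(len(lax_row), lax_row)
```

The default reader returns seven fields for this line, the lenient one returns the intended five.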
Comment 1 Kohei Yoshida 2011-01-21 07:04:37 UTC
Ken, can you still reproduce this in RC4?
Comment 2 Ken Ward 2011-01-21 07:17:42 UTC
I will download it and try.

Thanks!

-Ken

> https://bugs.freedesktop.org/show_bug.cgi?id=32703
>
> Kohei Yoshida<kyoshida@novell.com>  changed:
>
>             What    |Removed                     |Added
> ----------------------------------------------------------------------------
>    Status Whiteboard|                            |inforprovider:reporter
>             Keywords|                            |NEEDINFO
>            Component|Libreoffice                 |Spreadsheet
>
> --- Comment #1 from Kohei Yoshida<kyoshida@novell.com>  2011-01-21 07:04:37 PST ---
> Ken, can you still reproduce this in RC4?
>
Comment 3 Ken Ward 2011-01-21 07:44:31 UTC
Yes, the problem still exists in RC4.

Best regards,

-Ken Ward


Comment 4 Bennet Huber 2011-05-17 19:40:58 UTC
I've noticed this problem too: the CSV reader doesn't seem to respect quotes very well. It doesn't handle newlines in quoted strings either (it incorrectly interprets them as new rows). This is a slightly trickier problem, because I'm pretty sure newlines aren't a legal character within a cell value, so they should probably be either skipped over or replaced with some user-definable string.

Also, there is an RFC for the csv format that might be helpful:
http://tools.ietf.org/html/rfc4180
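
As a point of comparison, RFC 4180 allows line breaks inside a field as long as the field is enclosed in double quotes. The sketch below shows that reading with Python's standard csv module; it is an illustration of the RFC behaviour, not of Calc's importer, and the sample data is made up for the example.

```python
import csv
import io

# Made-up sample: the second record has a line break inside a quoted field.
data = 'id,note\n1,"first line\nsecond line"\n2,plain\n'

# The reader keeps consuming input lines until the quoted field is closed,
# so the embedded newline stays part of the field value.
rows = list(csv.reader(io.StringIO(data)))

for row in rows:
    print(row)
```

The reader yields three records, and the line break is preserved inside the second field of the second record.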
Comment 5 Eike Rathke 2011-08-16 16:53:47 UTC
First of all, a generator that puts leading blanks in front of quoted field content violates the CSV specification mentioned previously. However, some generators of that kind seem to exist, and being lax about this when importing CSV may be desirable.

Implemented in master http://cgit.freedesktop.org/libreoffice/core/commit/?id=acd31343d1a346f045a8145894c7e4451910cbf8

@ Bennet Huber: importing field content with newlines within a quoted field does work; it only failed when leading blanks were present (or when embedded quotes aren't properly escaped, but that is a different story).
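
The lenient rule described above (ignore blanks in front of a field only when what follows is a quoted field) can be sketched as a small hand-written splitter. This is a hypothetical illustration in Python, not the code from the linked commit; the function name split_lax and its parameters are assumptions made for the example, and it handles a single record without embedded line breaks.

```python
def split_lax(line, sep=",", quote='"'):
    """Split one CSV record; blanks before an opening quote are ignored."""
    fields, i, n = [], 0, len(line)
    while True:
        # Look past leading blanks, but only act on them if a quote follows.
        j = i
        while j < n and line[j] == " ":
            j += 1
        if j < n and line[j] == quote:
            # Quoted field: read up to the closing quote; "" is an escaped quote.
            j += 1
            buf = []
            while j < n:
                if line[j] == quote:
                    if j + 1 < n and line[j + 1] == quote:
                        buf.append(quote)
                        j += 2
                        continue
                    j += 1
                    break
                buf.append(line[j])
                j += 1
            # Text between the closing quote and the separator is kept as-is.
            k = line.find(sep, j)
            end = n if k == -1 else k
            fields.append("".join(buf) + line[j:end])
        else:
            # Unquoted field: leading blanks stay part of the value.
            k = line.find(sep, i)
            end = n if k == -1 else k
            fields.append(line[i:end])
        if end == n:
            return fields
        i = end + 1


print(split_lax('"a", "b, c", d'))
```

Unlike a blanket "strip leading whitespace" option, this keeps the leading blank of the unquoted last field while still recognising the quoted second field.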
Comment 6 Mircea 2012-04-13 16:12:54 UTC
The bug is still present in LibreOffice version 3.5.2.2. I think it was incorrectly marked as "fixed" -- from the description of the commit, the fix was for leading spaces in CSV files, not for the comma taking precedence over quotation marks.

Example of CSV file that leads to this bug:

="main_effect",="0.120",="0.090",="0.130",="0.112"
="",="(0.093)",="(0.095)",="(0.143)",="(0.138)"
="",="[-0.062,0.302]",="[-0.096,0.276]",="[-0.151,0.410]",="[-0.158,0.382]"

The third row should have exactly the same number of cells as the other rows, yet each cell containing an interval is split into two.
Comment 7 Eike Rathke 2012-04-14 15:18:38 UTC
1. Your example is different from what this bug originally was about: it
   does not contain spaces between the comma separator and a following
   quote.
2. Your example is not valid CSV data, see
   http://tools.ietf.org/html/rfc4180
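
To illustrate point 2: under RFC 4180 a field is only treated as quoted when it begins with a double quote, and double quotes may not appear inside unquoted fields. In the example from comment 6 every field begins with "=", so a parser that follows the RFC sees unquoted fields and splits at every comma. The sketch below shows this with Python's standard csv module, which is used here only as a stand-in for a conventional CSV reader and says nothing about Calc's internals.

```python
import csv
import io

# The three lines from comment 6, unchanged.
data = (
    '="main_effect",="0.120",="0.090",="0.130",="0.112"\n'
    '="",="(0.093)",="(0.095)",="(0.143)",="(0.138)"\n'
    '="",="[-0.062,0.302]",="[-0.096,0.276]",="[-0.151,0.410]",="[-0.158,0.382]"\n'
)

rows = list(csv.reader(io.StringIO(data)))

# Each field starts with '=', so the quotes are read as literal characters
# and the commas inside the brackets act as separators.
field_counts = [len(row) for row in rows]
print(field_counts)
print(rows[2][1], "|", rows[2][2])
```

The first two lines yield five fields each, while the third yields nine, matching the split reported in comment 6.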