Bug 55049 - FILEOPEN: CSV import - \" is not seen as an escaped quote, and hence interpreted as string delimiters
Summary: FILEOPEN: CSV import - \" is not seen as an escaped quote, and hence interpr...
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
3.6.1.2 release
Hardware: Other All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard: BSA
Keywords:
: 118007 (view as bug list)
Depends on:
Blocks: CSV-Import
  Show dependency treegraph
 
Reported: 2012-09-18 11:55 UTC by Sander Bol
Modified: 2019-10-15 07:13 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments
An example file containing a reduced test case. (18 bytes, text/csv)
2012-09-18 11:55 UTC, Sander Bol
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Sander Bol 2012-09-18 11:55:02 UTC
Created attachment 67324 [details]
An example file containing a reduced test case.

Problem description: 
Loading a CSV file that contains the field delimiter (ie: semicolon) within a text field (ie: double quotes) should ignore the field delimiter. It is placed within the text-delimiters to indicate it is part of the text string, and should therefor not be treated as a field delimiter.

We found this bug when trying to load a CSV-export of a database table containing data stored in PHP's serialized data format.

Steps to reproduce:
1. Create a CSV file with the following contents:
"test";"\";\"";EOF
2. Load the file into Calc.

Current behavior:
File imported as follows (with pipes used to illustrate column separators).
test|\|\""|EOF

Expected behavior:
File imported as follows (with pipes used to illustrate column separators).
test|";"|EOF

Platform (if different from the browser): 
              
Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1
Comment 1 Sören 2012-12-29 21:46:53 UTC
After testing it in
Version 4.0.0.0.beta2 (Build ID: 4104d660979c57e1160b5135634f732918460a0)
on Ubuntu 12.04 (x86)
I think the problem has a different reason, namely that \" is not seen as an escaped quote, and hence interpreted as string delimiters. A CSV-file containing
"test";";t";EOF
gives 
test|;t|EOF
as expected, so the semicolon inside the quotes is indeed interpreted as part of the string, and not as an separator.

As I'm not aware of how CSV-import should handle escaped special characters, I'm unsure if this is a bug.
Comment 2 bfoman (inactive) 2013-03-22 10:32:05 UTC
Changed Summary to reflect real issue in the report.
Comment 3 gmolleda 2013-04-05 05:57:19 UTC
Confirmed error in LibreOffice Calc 4.0.1.2

Using menu Insert - Sheet From File...

If you create with text editor the file with text:
1<TAB>"text with <NEWLINE ENTER>
a newline character in text before close"<TAB>3

Save text, I have a newline into text field, the text field is with "

Insert this file in calc and you will see that the <new line> into "..." change the row and break the text field.

The error:
   A                                           B               C
1  1                                           text with
2  a newline character in text before close"   3

when the correct behaviour must be:
   A             B                                                 C
1  1             text with                                         3
                 a newline character in text before close
2
Comment 4 niknah 2013-04-24 00:30:16 UTC
In most other spreadsheets double quotes in .csv files are saved as ""

http://en.wikipedia.org/wiki/Comma-separated_values  (basic rules section)
http://tools.ietf.org/html/rfc4180 (section 2.7)

At the moment I can't see anyway to open up .csv files with values in double quotes.  The only way I can think of is to convert them into unicode double quotes.
Comment 5 Alex Thurgood 2015-01-03 17:40:47 UTC Comment hidden (no-value)
Comment 6 QA Administrators 2016-01-17 20:03:57 UTC Comment hidden (obsolete)
Comment 7 QA Administrators 2017-03-06 14:19:24 UTC Comment hidden (obsolete)
Comment 8 Buovjaga 2018-06-22 12:20:06 UTC
*** Bug 118007 has been marked as a duplicate of this bug. ***
Comment 9 QA Administrators 2019-06-23 02:51:46 UTC Comment hidden (obsolete)
Comment 10 Josep Lladonosa 2019-10-15 07:13:23 UTC
(In reply to Sander Bol from comment #0)
> Created attachment 67324 [details]
> An example file containing a reduced test case.
> 
> Problem description: 
> Loading a CSV file that contains the field delimiter (ie: semicolon) within
> a text field (ie: double quotes) should ignore the field delimiter. It is
> placed within the text-delimiters to indicate it is part of the text string,
> and should therefore not be treated as a field delimiter.
> 

I confirm that this bug is still present in LO 6.3.2.2.

Perhaps the whole module should be reviewed in order to take into consideration

https://www.rfc-editor.org/info/rfc4180

This would solve some of the comments here.