Bug 116921 - error importing CSV containing ;""some text"";
Summary: error importing CSV containing ;""some text"";
Status: CLOSED NOTOURBUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.6.0.4 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected
Depends on:
Blocks: CSV-Import
Reported: 2018-04-10 12:25 UTC by ced
Modified: 2019-02-05 16:03 UTC

See Also:
Crash report or crash signature:
Regression By:


Attachments
file CSV not correctly imported (70 bytes, text/plain)
2018-04-10 12:29 UTC, ced

Description ced 2018-04-10 12:25:09 UTC
Description:
Importing a CSV file containing ;""some text""; with ; as the field separator and " as the text delimiter produces an erroneous column split.


Steps to Reproduce:
1. Open this file as CSV:

abcd;abcd;abcd;abcd;
abcd;""abcd"";abcd;abcd;
abcd;abcd;abcd;abcd;

2. Set the field separator to ; and the text delimiter to "

3. Press OK.


Actual Results:  
abcd	abcd	abcd	abcd
abcd	"""abcd"";abcd;abcd;
abcd;abcd;abcd;abcd;
"		


Expected Results:
abcd	abcd	abcd	abcd
abcd	"abcd"	abcd	abcd
abcd	abcd	abcd	abcd




Reproducible: Always


User Profile Reset: No



Additional Info:
Even the preview is wrong.




User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0
Comment 1 ced 2018-04-10 12:29:50 UTC
Created attachment 141260
file CSV not correctly imported
Comment 2 Jacques Guilleron 2018-04-10 16:15:38 UTC
Hi ced,

It works fine if you add a third " after what you consider to be the text. Is that right?
Comment 3 ced 2018-04-10 17:32:31 UTC
Hi Jacques,

Yes, with
abc;""abc""";abc;
it works fine.

If you try to import
abc;''abc'';abc;
(two single quotes, with ' as the text delimiter)

and copy the cells from Calc to Notepad, you get

abc	"'abc';abc;
"

It looks as if, when it finds two delimiters, it wraps all subsequent text in "

Thank you for the fantastic work you are doing.

    Emanuele
Comment 4 ced 2018-09-06 08:08:14 UTC
In the new version 6.0.6.2 the bug is still present.
Adding a third " by hand after what you consider to be the text is not a solution.
Comment 5 Xisco Faulí 2018-09-13 10:35:25 UTC
I can reproduce it in

Version: 6.2.0.0.alpha0+
Build ID: 2da435922f9c1fcf52eb0c1eb3d6f73581e9f793
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); Calc: threaded

and

Version 4.1.0.0.alpha0+ (Build ID: efca6f15609322f62a35619619a6d5fe5c9bd5a)

but not in

LibreOffice 3.5.0 
Build ID: d6cde0

The expected behaviour should be:
abcd	abcd	abcd	abcd
abcd	abcd""	abcd	abcd
abcd	abcd	abcd	abcd
Comment 6 ced 2018-09-13 11:26:20 UTC
I have only tried on the Windows platform, 32/64 bit,

in Calc 5.X and currently on Version: 6.0.6.2
Build ID: 0c292870b25a325b5ed35f6b45599d2ea4458e77
CPU threads: 4; OS: Windows 10.0; UI render: GL; 
Locale: it-IT (it_IT); Calc: group

The bug is always reproducible.

.csv file:
abc;""abc"";abc

Actual Results:  
abc	"abc";abc 

Expected Results:
abc	"abc"	abc
Comment 7 Aron Budea 2018-09-24 23:14:44 UTC
Bibisected to the following range using repo bibisect-43all:
https://cgit.freedesktop.org/libreoffice/core/log/?qt=range&q=1856186951a70a0bcac4e0c3632ca4afe68c05e3..d31997559adac6f03d932cb6c5819149c38c1398

Could be the following commit:
https://cgit.freedesktop.org/libreoffice/core/commit/?id=7928b651965f747b02593d2a9fc73fac7c86dbf5
author		Eike Rathke <erack@redhat.com>	2012-04-14 18:57:31 +0200
committer	Eike Rathke <erack@redhat.com>	2012-04-14 18:57:31 +0200

resolved fdo#48621 better handling of broken CSV files
Comment 8 Eike Rathke 2019-02-05 16:03:18 UTC
abcd;""abcd"";abcd

is malformed CSV if " is used as quote character (so-called String delimiter in the import dialog).

Proper CSV data for the expected

abcd  "abcd"  abcd

case would be

abcd;"""abcd""";abcd

See also https://tools.ietf.org/html/rfc4180 and http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm
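
The quoting rule can be checked with any strict CSV parser. The following is a minimal sketch using Python's standard csv module, not LibreOffice's importer; the helper name parse_line is made up for illustration. In strict mode the well-formed line parses to the expected three fields, while the line from the report is rejected.

```python
import csv


def parse_line(line, strict=False):
    """Parse one line using ; as field separator and " as quote character.

    parse_line is a hypothetical helper for this illustration only.
    """
    reader = csv.reader([line], delimiter=";", quotechar='"', strict=strict)
    return next(reader)


well_formed = 'abcd;"""abcd""";abcd'   # quoted field with doubled inner quotes
malformed = 'abcd;""abcd"";abcd'       # the line from the report

# The outer quotes delimit the field; each doubled "" inside is one literal ".
fields = parse_line(well_formed, strict=True)

# In the malformed line the field opens with ", the second " closes it
# again, and the parser then meets 'a' where it expects a separator.
try:
    parse_line(malformed, strict=True)
    malformed_rejected = False
except csv.Error:
    malformed_rejected = True

print(fields)
print("malformed line rejected:", malformed_rejected)
```

Non-strict parsers have to guess what such input means, which is why different tools and versions can give different results for the same broken line.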

This is also not a regression; the old import (abcd"") was just a different but equally unexpected result.
Instead, fix the generator that writes this broken CSV data.
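
As an illustration of what a fixed generator could look like, here is a minimal sketch that assumes the data is produced by a Python script (the reporter's actual generator is not known; write_rows is a hypothetical helper). Letting a CSV library do the quoting, rather than concatenating strings by hand, doubles embedded quote characters automatically.

```python
import csv
import io


def write_rows(rows):
    """Serialize rows with ; as field separator and " as quote character.

    write_rows is a hypothetical helper for this illustration only.
    """
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=";",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()


rows = [
    ["abcd", "abcd", "abcd"],
    ["abcd", '"abcd"', "abcd"],  # second field contains literal quotes
]

output = write_rows(rows)
print(output, end="")

# Reading the text back with the same settings returns the original values.
round_trip = list(csv.reader(io.StringIO(output), delimiter=";", quotechar='"'))
```

The second row is written as abcd;"""abcd""";abcd, which is the well-formed line given above.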