Bug 120630 - Importing CSV
Summary: Importing CSV
Status: RESOLVED WONTFIX
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
6.2.0.0.alpha0+
Hardware: x86-64 (AMD64) All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-10-15 18:29 UTC by tes
Modified: 2018-10-16 19:09 UTC (History)
2 users (show)

See Also:
Crash report or crash signature:


Attachments
examples and sysinfo (34.58 KB, image/png)
2018-10-16 14:39 UTC, tes
Details
example1 (90 bytes, text/csv)
2018-10-16 14:41 UTC, tes
Details
example2 (78 bytes, text/csv)
2018-10-16 14:42 UTC, tes
Details
example3 (78 bytes, text/csv)
2018-10-16 14:42 UTC, tes
Details
Versioninfo (46.96 KB, image/png)
2018-10-16 14:43 UTC, tes
Details
ImportDlg (34.58 KB, image/png)
2018-10-16 14:43 UTC, tes
Details
Video showing how can the import can be made keeping all text (660.03 KB, video/mp4)
2018-10-16 16:11 UTC, BogdanB
Details
screenshot comparing excel 2016 and lo 6.1.3.1 csv import (214.48 KB, application/pdf)
2018-10-16 17:56 UTC, Oliver Brinzing
Details

Note You need to log in before you can comment on or make changes to this bug.
Description tes 2018-10-15 18:29:03 UTC
Description:
On csv-import LO interprets specific cells as numeric (and remove leading 0), while other cells in same col are text. 

Actual Results:
Import CSV with col PW and 2 lines. PW-line1=affenzahn, PW-line2=00654
Result after import is affenzahn, 654

Expected Results:
expected: affenzahn, 00654


Reproducible: Always


User Profile Reset: No



Additional Info:
do not play with results.
Comment 1 Mike Kaganski 2018-10-15 18:43:42 UTC
This is not a bug. CSV is a plain-text format for data interchange. If some value in it may be interpreted as number, and not quoted, then *by default* it should be treated as number, which happens here. It doesn't matter if previous line contained a text in that position: that night be a table header, or some intermediate summary.

If you need to have some column treated as text, just right-click the relevant column title *at CSV import dialog*, and select "Text".

Closing NOTABUG. Please feel free to reopen if you disagree; if so, then please describe your reasons.
Comment 2 tes 2018-10-16 12:51:39 UTC
1)
Sry, the imported values *are* quoted.  

2)
Who invented your default behavior?

3)
In my opinion importing should never change values by default ("0054320" -> 54320)

4)
see also RFC 4180
Comment 3 Mike Kaganski 2018-10-16 13:55:21 UTC
(In reply to tes from comment #2)
> 1)
> Sry, the imported values *are* quoted.  

This is what should have been stated from the start; please provide a sample to test; and also mention which settings are active in the CSV import dialog. Mention your version information (as per Help->About LibreOffice). Your initial description haven't mention that, and values in examples were unquoted.

Provided that information, we could try to understand if that is wrong behavior, but I don't get your following question:

> 2)
> Who invented your default behavior?

Well - *if* you argue with me on the grounds that I declared it correct to convert *quoted text* looking like numbers - then you are arguing with what has never been declared here. If, however, you tell me that you have doubts in logic behind converting *unquoted* numbers to numbers - then I simply fail to understand your logic. Without treating unquoted numbers as numbers, you would simply have *all* data from CSV as text - because, well, initially that's all text.

RFC 4180 does not specify any treatment of data inside the records, it only tells about syntax.
Comment 4 tes 2018-10-16 14:39:19 UTC
Created attachment 145754 [details]
examples and sysinfo
Comment 5 tes 2018-10-16 14:41:41 UTC
Created attachment 145755 [details]
example1
Comment 6 tes 2018-10-16 14:42:13 UTC
Created attachment 145756 [details]
example2
Comment 7 tes 2018-10-16 14:42:38 UTC
Created attachment 145757 [details]
example3
Comment 8 tes 2018-10-16 14:43:16 UTC
Created attachment 145758 [details]
Versioninfo
Comment 9 tes 2018-10-16 14:43:43 UTC
Created attachment 145759 [details]
ImportDlg
Comment 10 tes 2018-10-16 14:46:26 UTC
(In reply to Mike Kaganski from comment #3)
> (In reply to tes from comment #2)
> > 1)
> > Sry, the imported values *are* quoted.  
> 
> This is what should have been stated from the start; please provide a sample
> to test; and also mention which settings are active in the CSV import
> dialog. Mention your version information (as per Help->About LibreOffice).
> Your initial description haven't mention that, and values in examples were
> unquoted.

Sry for my inaccuracy.
 
> ...Without treating unquoted numbers as numbers,
> you would simply have *all* data from CSV as text - because, well, initially
> that's all text.

In my opinion the best default setting. 

thank you for your attention
Comment 11 BogdanB 2018-10-16 16:11:53 UTC
Created attachment 145764 [details]
Video showing how can the import can be made keeping all text

In this video you can see that Libre Office offer you the chance to keep the second column as text.

Please see the video. 

So this is not a bug.
Comment 12 tes 2018-10-16 16:27:01 UTC
(In reply to BogdanB from comment #11)
> Created attachment 145764 [details]
> Video showing how can the import can be made keeping all text
> 
> In this video you can see that Libre Office offer you the chance to keep the
> second column as text.
> 
> Please see the video. 
> 
> So this is not a bug.

Dear Bogdan,
thanks for your effort, creating the video. I already knew this "solution". I'm speaking about the default handling. The other way around is a shoe out of it. In my opinion the "default" has to be *text*. *NO* automatic interpreting/formating of data to import.
Comment 13 tes 2018-10-16 17:04:14 UTC
Never change data *automatic* on importing by a glance in the crystal. CSV *IS* plain text! Nothing else. Use XML, DBF or something else if you want formated columns.
Comment 14 Oliver Brinzing 2018-10-16 17:56:21 UTC
Created attachment 145769 [details]
screenshot comparing excel 2016 and lo 6.1.3.1 csv import

attached screenshot shows result for excel and lo csv import.

to make excel import csv data, i had to replace comma , with semicolon ;
Comment 15 tes 2018-10-16 18:50:15 UTC
(In reply to Oliver Brinzing from comment #14)
> Created attachment 145769 [details]
> screenshot comparing excel 2016 and lo 6.1.3.1 csv import
> 
> attached screenshot shows result for excel and lo csv import.
> 
> to make excel import csv data, i had to replace comma , with semicolon ;

Dear Oliver,
very nice. Additionally MS decided ; is better than ,. Bullshit on top. The MS behavior seems to be the same compared to LO. Both systems changes content automatically on loading where they don't have to do.
Comment 16 Mike Kaganski 2018-10-16 19:03:25 UTC
Please use "Format quoted field as text" checkbox to treat quoted values as text. See https://help.libreoffice.org/latest/en-US/text/shared/00/00000208.html#hd_id314847422.

(In reply to tes from comment #10)
> > ...Without treating unquoted numbers as numbers,
> > you would simply have *all* data from CSV as text - because, well, initially
> > that's all text.
> 
> In my opinion the best default setting. 

The best *default* setting is the one that works best for most cases for non-techy majority of users. Fir them, opening a CSV and being unable to sum values would be unexpected. Given the wast variety of in-the-wild CSV flavors, with huge inconsistencies and nuances, we provide defaults targeted at that group, and versatile dialog with telling labels. People who can read RFCs are expected to be able to guess the meaning of "Format quoted field as text".
Comment 17 Mike Kaganski 2018-10-16 19:05:00 UTC
I close this as WONTFIX, because I have described the reason behind the default behavior. We *won't* change the default in a way that will harm more people than those who would benefit.
Comment 18 Mike Kaganski 2018-10-16 19:09:18 UTC Comment hidden (spam)