Bug 130745 - FILEOPEN: Calc not honoring import text dialog column type for empty cells
Summary: FILEOPEN: Calc not honoring import text dialog column type for empty cells
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
Inherited From OOo
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: CSV-Dialog
  Show dependency treegraph
 
Reported: 2020-02-17 20:30 UTC by rscragun@gmail.com
Modified: 2022-02-23 05:38 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:
Regression By:


Attachments
test csv import file (68 bytes, application/vnd.ms-excel)
2020-02-18 17:38 UTC, Oliver Brinzing
Details

Note You need to log in before you can comment on or make changes to this bug.
Description rscragun@gmail.com 2020-02-17 20:30:03 UTC
Description:
Calc does not honor the column type selected in the text import dialog/wizard for empty cells.

Steps to Reproduce:
1. Paste the following into a text file and save as csv (indents are not part of the file and are included only to offset the text from the rest of the post):
    Column 1, Column 2
    Text 1,1234
    Text 2,1973-07-20
    Text 3,
    Text 4,

2. Open the file with LO Calc (in Windows, right click on the file, select Open with > LibreOffice)
3. In the "Fields" section of the text import wizard/dialog, select the top-left box in the sample spreadsheet to select all rows and columns.
4. For "Column type", select "Text".
5. Hit "OK".
6. Check the formats of the cells.

Actual Results:
All cells with contents will be text type. All cells without contents will be numeric type.

Expected Results:
All cells will be text type. This is particularly obvious for the empty cells in columns that have other non-empty cells. The fact that the wizard says that you are setting "column type" suggests that this is a column-specific setting rather than a cell-with-contents-specific setting.


Reproducible: Always


User Profile Reset: Yes



Additional Info:
Version: 6.3.4.2 (x64)
Build ID: 60da17e045e08f1793c57c00ba83cdfce946d0aa
CPU threads: 8; OS: Windows 10.0; UI render: GL; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded

Example of why this matters:
The only reason I use Calc rather than Excel is that loading a CSV with dates into Calc without messing them up is much easier. But if empty cells get a different format from what is expected, there will be errors when I later type dates into those cells.
Comment 1 Oliver Brinzing 2020-02-18 17:37:58 UTC
(In reply to rscragun@gmail.com from comment #0)
> Actual Results:
> All cells with contents will be text type. All cells without contents will
> be numeric type.

confirming, but not sure if this can be seen as a bug or an enhancement request.

Maybe an option, e.g:
[ ] Format empty cells with choosen column type format
could be added ?
Comment 2 Oliver Brinzing 2020-02-18 17:38:31 UTC
Created attachment 157986 [details]
test csv import file
Comment 3 rscragun@gmail.com 2020-02-19 02:54:47 UTC
The fact that the dialog says "column type" suggests that it is a bug. The behavior is unexpected and can lead to data loss for people who expect that they are setting a column type.
Comment 4 QA Administrators 2022-02-19 03:43:00 UTC Comment hidden (obsolete)
Comment 5 Roman Kuznetsov 2022-02-22 20:55:29 UTC
Interesting. In

Version: 7.4.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: b438854a0b5148880e455cbeeff14d4e3d825711
CPU threads: 4; OS: Windows 6.1 Service Pack 1 Build 7601; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: en-US
Calc: CL Jumbo

I got a general Numeric type for all imported cells.

The Calc guide says us that "Text" in that case means Calc just imports the data from a CSV file as text but not as date or currency. And it's all. I don't see there that cells with data in Calc after import should have a text format.

Mike, what do you think about this bug report? May be it's really an enhancement?
Comment 6 Mike Kaganski 2022-02-22 21:17:10 UTC
(In reply to rscragun@gmail.com from comment #3)
> The fact that the dialog says "column type" suggests that it is a bug.

No. The dialog helps Calc to recognize the type of data in CSV, not format the sheet. So the dialog talks about type of *CSV* column (the data), not Calc column (the formatting).

Compare e.g. the other "column format" value, which is Date (DMY). Marking a column that way absolutely doesn't mean that the column will be formatted to show day, month and year in that order. It only means that the original data format should be expected like that. Then Calc can recognize the dates, and after that, it will represent them as current locale requires (which may be e.g. YMD or MDY). And if the column marked as DMY will happen to have a plain number, it will not be displayed as corresponding serial date.

I suppose that it would be harmful to implement such a proposal, since it would try to mix (and thus dilute, make less clear) the distinction between import data  format and resulting sheet formatting.

IMO->WF
Comment 7 rscragun@gmail.com 2022-02-23 02:04:27 UTC
If type of data is column-specific, and format is matched to data type, then shouldn’t the entire column be given the same format? 

Regardless of the philosophy behind settings in the dialog, there is a clear usability problem: someone who loads in data from a CSV file and then tries to edit those data in Calc will sometimes end up with corrupted data. There are many examples online of people asking how to avoid having Calc set formatting to dates when opening a CSV, and the responses always say to use this setting in this dialog. Here is an example: https://ask.libreoffice.org/t/how-do-i-format-a-column-of-calendar-dates-in-calc/55289. It appears, then, that preventing formatting a column as date is the expected behavior of setting the type to text.
Comment 8 Mike Kaganski 2022-02-23 05:38:34 UTC
(In reply to rscragun@gmail.com from comment #7)
> There are many examples online of people asking how to avoid having Calc set
> formatting to dates when opening a CSV, and the responses always say to use
> this setting in this dialog. Here is an example:
> https://ask.libreoffice.org/t/how-do-i-format-a-column-of-calendar-dates-in-
> calc/55289. It appears, then, that preventing formatting a column as date is
> the expected behavior of setting the type to text.

It is completely wrong interpretation. It confuses *formatting* (visual representation of data in a cell in Calc, defined by applied format string) with *data type* (the internal data stored in a cell, which may be a formula, a number, a text). The import dialog defines how Calc converts CSV text in specific fields into its data (i.e., it affects the resulting *data type*); but formatting is orthogonal, and is only used to show *converted*, already recognized data (such as, when a date is recognized, and converted to a serial date - i.e., a number of days since epoch, - it is also *formatted* in the Calc cell as a date, to not show numbers like 44562 where user expects 2022-01-01).

The cited question does not ask about "how to avoid having Calc set formatting to dates when opening a CSV". In addition to wanting something opposite, the main thing there - again - is not wanting any *formatting*, but wanting correct *data type import*, even though the user clearly also confuses the concepts.