Bug 155848 - CSV import improperly import floats as text prefixed by a hidden apostrophe
Summary: CSV import improperly import floats as text prefixed by a hidden apostrophe
Status: RESOLVED INVALID
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: All All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-06-15 09:18 UTC by Martin Mokrejs
Modified: 2023-08-11 14:34 UTC (History)
1 user (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Mokrejs 2023-06-15 09:18:13 UTC
Hi,
  the CSV import is terribly broken and Libreoffice Calc too on its own.

The fields containing float numbers get imported as '0.65 instead of just 0.65.

Because the leading apostrophe is NOT show in the table but ONLY on the EDITIng line it took me ages to realize it. Instead I was just blindly to set the Format -> Cells -> Numbers to some othe Category that current "Number". It did not help because the leading apostrophe is in place. It should have been REMOVED by this action.

Once somebody told me about the Data -> Text to columns I could FIX the table importred via the broken import procedure but only column by colum (one cannot select multiple columns or cells spanning more than one column). One can do:

Data ->  Text to columsn -> Set the Other delimiter to the ' tick and click OK
Set the cell content to Number via Format -> Cells -> Numbers .

Then, you can use =SUM() or =AVG() functions over those cells to continue what used to work in the past.

SUM() and AVG() function could be smart enough to act on text fields and try to interpret the contents as numbers.

I do not see a point why during CSV import the contents should NOT be tried to be set to Number whenever possible. Or the import procedure should ASK user or provide a table for data format review upon import, emphasizing cells which are currently defined as TEXT but could have been Number.

Either way, if sombody goes to Format -> Cells -> Numbers it should remove those offending ticks or at least give an error message. It is a pity whole column because typically on first line we have column descriptions which are TEXT, right?

Ability to perform Data -> Text to columns trickery over multiple cells spanning more than one comlumn would be very helpful.



Yes, the thousands separator ',' got interpreted regardles of my locale as a decimal dot so 12,600 (twelve thousand 6 hundred) turned into 12,6. But that is another story.


Thank youf for any efforts to fix these.


Version: 7.5.4.2 (X86_64) / LibreOffice Community
Build ID: 50(Build:2)
CPU threads: 4; OS: Linux 6.1; UI render: default; VCL: gtk3
Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
Gentoo official package
Calc: threaded
Comment 1 Mike Kaganski 2023-06-15 09:32:38 UTC
If 0.65 appeared in a CSV, and got imported as text, and shows '0.65 in the edit line, all this means:

1. That the locale defined in the import dialog uses comma as decimal separator;
2. And that the locale set in your spreadsheet cells uses dot as decimal separator.

In essence, it is a user error of not setting the CSV import settings properly in the dialog, and not a bug. Of course, it's just an educated guess, in the absence of a sample CSV.

You should either set the locale in the import dialog to e.g. en-US (if all the data in the CSV uses those conventions), or you may select "USA" per-column there in the dialog, if the data happens to be mixed.
Comment 2 Eike Rathke (retired, only occasionally showing up) 2023-06-15 13:33:39 UTC
And for converting text to numeric (or date(+time)) after import see
https://wiki.documentfoundation.org/Faq/Calc/How_to_convert_number_text_to_numeric_data

(In reply to Martin Mokrejs from comment #0)
> Yes, the thousands separator ',' got interpreted regardles of my locale as a
> decimal dot so 12,600 (twelve thousand 6 hundred) turned into 12,6. But that
> is another story.
> Locale: cs-CZ (cs_CZ.UTF-8); UI: en-US
That pretty much suggest that the CSV import locale probably was also cs-CZ which has ',' comma decimal separator. Pay attention to the dialog and its settings.
Comment 3 Werner Tietz 2023-06-15 18:12:07 UTC

The purpose of a csv-import-DIALOG is NOT to press the "ok" button as soon as possible, but to choose the IMPORT options carefully!!