Bug 165987 - "Text Import" dialog help: bad wording about "Column type" function
Summary: "Text Import" dialog help: bad wording about "Column type" function
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Documentation
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:25.8.0
Keywords:
Depends on:
Blocks:
 
Reported: 2025-03-31 14:05 UTC by Mike Kaganski
Modified: 2025-04-17 15:30 UTC

See Also:
Crash report or crash signature:


Description Mike Kaganski 2025-03-31 14:05:22 UTC
https://help.libreoffice.org/25.2/en-US/text/shared/00/00000208.html

The section about Column type is misleading. The function itself is not very intuitive, and the help must do everything it can to clarify the crucial detail:

** this is about how the data is originally stored in the text file (CSV), not how it should be formatted when the import finishes **

Specific quotes that need fixing:

> Choose a column in the preview window and select the data type to be applied the
> imported data

What does "data type to be applied the imported data" mean? Was "to" accidentally omitted? And what is the verb "apply" intended to convey? Maybe it's me being a non-native speaker, but I feel that we should not speak in terms of "applying" a data type, but rather of *assuming* that data type for the actual data in the text file.

> Date (DMY)      Applies a date format (Day, Month, Year) to the imported data in
>                 a column.

Oh! Here we see a VERY BAD (in this context) term: "format". People already have a hard time differentiating the ideas of data type vs. data format; and here, they are deliberately confused even more. The meaning of this (I don't have a good wording; my intention is to explain it to those who can design a better wording, once they grasp the idea): *assume* that the column in the CSV is a date, having the "day" part first, then "month", then "year". This option fits when the CSV has in this column things like "31.03.25" or "31/3/2025". Note that details such as the number of digits or the specific separator are not as important as the order of the parts. Note also that this setting tells the program *how to treat the parts*, meaning that for a date like "3/4/25" in the CSV, it will know that "3" is a day and "4" is a month, not the other way round. Note also that it doesn't tell Calc how to *show* the resulting "March 31st, 2025" or "April 3rd, 2025" in the Calc cell (i.e., this setting is **NOT** about the formatting of the imported data in the cells on screen!!!).

So - in a word - the idea of "applying a format" is completely wrong here; instead of applying a format, we are assuming the kind (or the structure) of existing data.

Of course, the same applies to the "Date (MDY)" and "Date (YMD)" there.
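
To make the distinction concrete, here is a minimal Python sketch of the idea. It is only an illustration, not Calc's actual import code: the function name `parse_date`, the accepted separators, and the rule that maps two-digit years to 20xx are all assumptions made for this example. The point is that the part order decides how the text is *read*, while the way the result is *shown* is chosen separately.

```python
import re
from datetime import date


def parse_date(text: str, order: str) -> date:
    """Read `text` as a date whose parts appear in `order` ("DMY", "MDY" or "YMD").

    Hypothetical helper for illustration only; it does not mirror Calc's parser.
    """
    if sorted(order) != ["D", "M", "Y"]:
        raise ValueError(f"unsupported part order: {order!r}")
    # The separator and the number of digits do not matter, only the order.
    parts = [int(p) for p in re.split(r"[./-]", text.strip())]
    if len(parts) != 3:
        raise ValueError(f"expected three date parts in {text!r}")
    fields = dict(zip(order, parts))
    year = fields["Y"]
    if year < 100:
        year += 2000  # simplification assumed for this sketch
    return date(year, fields["M"], fields["D"])


# Same text, different assumption about the source data, different date.
as_dmy = parse_date("3/4/25", "DMY")   # day 3, month 4
as_mdy = parse_date("3/4/25", "MDY")   # month 3, day 4

# How the value is displayed is a separate, later decision.
print(as_dmy.isoformat())
print(as_mdy.strftime("%d.%m.%Y"))
```

In this sketch, the `order` argument plays the role of the "Date (DMY)", "Date (MDY)" and "Date (YMD)" column types, and the `isoformat()` / `strftime()` calls stand in for cell formatting, which the column type does not control.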

> US English     Numbers formatted in US English are searched for and included
>                regardless of the system language. A number format is not
>                applied. If there are no US English entries, the Standard
>                format is applied.

So many mentions of "format" again! Of course, we can think about how the data is *formatted* in the CSV itself; but then we must be VERY careful to make a clear distinction, so as not to accidentally create the slightest doubt that we could mean cell formatting in the import *result*. Or better - deliberately avoid the term "format" on this page altogether. Specifically, here we mean "assume that the column in the CSV has numbers in English (US) standard notation, in particular, having a dot as the decimal separator". This is useful for locales where the comma is the decimal separator; and "system language" is a bad term here, for *two* reasons: first, it's not "language" that is in play here, but *locale* (see bug 138748); and second, we contrast this setting not with the *system* settings, but with the *Locale* setting above in the *same document*; so this reads as "the dialog defines above that locale X should be used when reading the selected CSV in general; but for this column in particular, assume that the locale is English (US)". This is because this locale is used very often, even in data that otherwise follows other locale conventions.
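
The per-column override described above can be sketched in Python as follows. This is a hedged illustration, not how Calc implements it: `parse_number`, `import_cell`, and the two separator tables are hypothetical names, and the dialog-level locale is assumed to be one that uses a comma as decimal separator and a dot for grouping.

```python
def parse_number(text: str, decimal_sep: str, group_sep: str) -> float:
    """Read `text` as a number written with the given separators."""
    cleaned = text.strip().replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


# Assumed conventions for this sketch.
DIALOG_LOCALE = {"decimal_sep": ",", "group_sep": "."}  # chosen in "Locale" above
US_ENGLISH = {"decimal_sep": ".", "group_sep": ","}     # per-column override


def import_cell(text: str, column_type: str = "Standard") -> float:
    """Pick the conventions used to read one cell of the text file."""
    conventions = US_ENGLISH if column_type == "US English" else DIALOG_LOCALE
    return parse_number(text, **conventions)


# A column following the dialog-level locale.
print(import_cell("1.234,56"))
# A column in the same file that uses US notation instead.
print(import_cell("1,234.56", column_type="US English"))
```

Both calls describe how the text in the file is written; neither says anything about the number format of the resulting cells.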
Comment 1 m_a_riosv 2025-03-31 20:55:54 UTC
+1
Comment 2 Olivier Hallot 2025-04-17 10:26:10 UTC
My suggestions, looking at the "Text" entry and using import verb...

"Date (DMY): The imported data is assumed as Day, Month and Year."
(similar for other dates)


For the US English 
"The imported numbers are interpreted using the US English locale, irrespective of the system's locale settings, and no specific number format will be applied. If there are no entries in the US English locale, the Standard format will be utilized."
Comment 3 Mike Kaganski 2025-04-17 10:34:54 UTC
(In reply to Olivier Hallot from comment #2)
> "Date (DMY): The imported data is assumed as Day, Month and Year."
> (similar for other dates)

Great! Maybe just append the "(in this order)" to your nice text?

> For the US English 
> "The imported numbers are interpreted using the US English locale,
> irrespective of the system's locale settings, and no specific number format
> will be applied. If there are no entries in the US English locale, the
> Standard format will be utilized."

Oh. I would *both* omit everything about format, *and* about system locale. Or better change the "system" locale into "locale chosen in the "Locale" combobox in the dialog".

So maybe:

"The imported numbers are interpreted using the US English locale, irrespective of the locale selected in the "Locale" combobox above."

and nothing more in this section?
Comment 4 Commit Notification 2025-04-17 15:30:34 UTC
Olivier Hallot committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/help/commit/6a7e577f2c99d8f3f6cf3e85973ff0c69cc10014

tdf#165987 fix wording about "Column type" in CSV import.