Bug 52236

Summary: CSV import/Export filter option for decimal point character
Product: LibreOffice Reporter: Luís Neves <luis.neves>
Component: CalcAssignee: Not Assigned <libreoffice-bugs>
Status: RESOLVED WORKSFORME    
Severity: enhancement CC: jmadero.dev, mwisnicki, stunts
Priority: medium    
Version: unspecified   
Hardware: All   
OS: All   
Whiteboard:
Crash report or crash signature: Regression By:
Attachments: A CSV file with results from the EnergyPlus simulation software
Commas As Separator -- no additional steps
Result of Additional Steps Described by Reporter
CSV import dialog in QtiPlot that includes decimal point selection
decimal_test.ods
decimal_test_en_GB.csv
decimal_test_lt_LT.csv

Description Luís Neves 2012-07-18 15:00:18 UTC
A very useful enhancement to LibreOffice calc would be to have an option to specify the digital point character. When importing/exporting data from/to other software packages, namely simulation software, it is often necessary to execute a cumbersome procedure when our system local uses a different character, as in the Portuguese case, where we use the comma.

As an example, I usually have to import data as text, then I change all (with search-replace) the commas to semi-colons, then I change all the dots to commas and only then I can use the "text-to-columns" option to have the data on a useful form. 

I think this enhancement would be useful to several other [latin] languages as well.

Best regards,

Luís Neves
Comment 1 Joel Madero 2012-11-21 20:58:18 UTC
Is there any way that you can upload a document (simple) that has an exported csv file? Just so QA team can take a look at it and see if there is a better import method already available. Marking as NEEDINFO, if you can't upload the document just reopen as UNCONFIRMED and we'll see what we can do. If you do upload, please also reopen as UNCONFIRMED. Thanks!
Comment 2 QA Administrators 2013-07-18 06:16:09 UTC
Dear Bug Submitter,

Please read the entire message in its entirety before continuing - also please respond directly to FDO when replying - do not reply via email.

This bug has been in NEEDINFO status with no change for at least 6 months. Please provide the requested information as soon as possible and mark the bug as UNCONFIRMED. Due to regular bug tracker maintenance, if the bug is still in NEEDINFO status with no change in 30 days the QA team will close the bug as INVALID due to lack of needed information.

For more information about our NEEDINFO policy please read the wiki located here: 
https://wiki.documentfoundation.org/QA/FDO/NEEDINFO

If you have already provided the requested information, please mark the bug as UNCONFIRMED so that the QA team knows that the bug is ready to be confirmed.


Thank you for helping us make LibreOffice even better for everyone!


Warm Regards,
QA Team
Comment 3 Luís Neves 2013-07-18 07:34:21 UTC
Created attachment 82572 [details]
A CSV file with results from the EnergyPlus simulation software

As many other other engineering software tools, the Energy Plus simulation software produces CSV text files as a way to export data to spreadsheet software. However, at least for portuguese users (and probably for many other locales that use the comma as decimal separator, this forces a somehow cumbersome procedure:

1st - Open as text without conversion

2nd - "Search-Replace" all commas with any other character, e.g., semi-colon

3rd - "Search-Replace" all dots with commas

4th - Convert text to columns using semi-colon as the column separator.

As I said before, it would be great to simply have the chance to define "dot" as the decimal separator when opening the file as it would save a lot of trouble. As it is the dots are understood as thousand separators and the commas as decimal separators and the results are unusable if we try to open files directly.


Best regards,

Luís Neves
Comment 4 retired 2013-07-18 09:33:06 UTC
Info provided, setting to UNCONFIRMED.
Comment 5 Joel Madero 2013-09-09 00:40:28 UTC
Well I do see a difference but I don't understand what the goal is. I attached two documents, one that I just opened with "," as the separator vs. opening the same file without a separator and following your steps.

The only thing I see is number format is changed, example:

1,247321E-01 vs. 0.1247321


and time format seems to change (which I think is associated with another issue that's been reported)

12:15:00 AM vs. 00:15:00


But these are all identical. Marking as NEEDINFO again - please compare the two cases and tell us what your goal is and why the "comma as separator" file is incorrect as I just don't see it.


Thanks!
Comment 6 Joel Madero 2013-09-09 00:41:06 UTC
Created attachment 85470 [details]
Commas As Separator -- no additional steps
Comment 7 Joel Madero 2013-09-09 00:41:33 UTC
Created attachment 85471 [details]
Result of Additional Steps Described by Reporter
Comment 8 Joel Madero 2013-09-09 01:59:59 UTC
Forgot to say - please mark the bug as UNCONFIRMED once you explain why file  Commas As Separator -- no additional steps is incorrect.


Thanks for your patience and help!
Comment 9 Domas Jokubauskis 2013-10-09 11:49:48 UTC
Created attachment 87333 [details]
CSV import dialog in QtiPlot that includes decimal point selection
Comment 10 Domas Jokubauskis 2013-10-09 12:14:03 UTC
The decimal point is dependant on locale not the language. Here in Lithuania we use "," for decimal point and "." for thousands separation. The problem with decimal points in CSV files arises when the CSV file is created using a computer on one locale and opened in a computer with another locale. 

Using LibreOffice 4.1.2.3 from Debian Testing I created example file "decimal_test.ods" that was saved as "decimal_test_en_GB.csv" in "en_GB.UTF-8" and as "decimal_test_lt_LT.csv" in "lt_LT.UTF8" locale (field separator is ";"). When "decimal_test_lt_LT.csv" is opened in "en_GB.UTF-8" locale the numbers with decimal points are imported as text and cannot be used in calculations without editing.

Please see attachment "CSV import dialog in QtiPlot that includes decimal point selection" for an example implementation of the decimal point character selection.
Comment 11 Domas Jokubauskis 2013-10-09 12:14:58 UTC
Created attachment 87334 [details]
decimal_test.ods
Comment 12 Domas Jokubauskis 2013-10-09 12:15:31 UTC
Created attachment 87335 [details]
decimal_test_en_GB.csv
Comment 13 Domas Jokubauskis 2013-10-09 12:15:49 UTC
Created attachment 87336 [details]
decimal_test_lt_LT.csv
Comment 14 Luís Neves 2013-10-09 12:27:48 UTC
Thanks Domas Jokubauskis,

That option of choosing the decimal character when opening the file is exactly what I wish to have in calc. The fact is that many engineering software programs produce ASCII files without consideration for the user LOCALE (because they are old pieces of software or because it's the standard output from programming languages like fortran). The example I attached is from the EnergyPlus software, a command line simulation tool for Energy in Buildings, which is very recent anyway.

Of course, a user with a US/UK locale will probably not understand the need because the decimal character is mantained.

Best regards,

Luís Neves
Comment 15 Marcin Wisnicki 2014-03-14 00:09:46 UTC
What more info is needed ? Current behaviour is useless for users of many non-english locales.

In addition to options please also add separate file type "Portable CSV" that will save CSV compliant with RFC4180.
Comment 16 Luís Neves 2014-03-14 15:49:25 UTC
I must say I'm a little disapointed with the lack of interest. I know I said it would be a desirable feature and not a bug or something abolutely necessary. In fact, the most  used spreadsheet software also lacks something like this, but well, it still have lots of other advantages and LO still needs to cativate users.
Comment 17 Joel Madero 2014-03-14 16:53:35 UTC
@Luis - we have several thousand bug reports to deal with - this is a community of volunteers. If you're interested in trying to submit a patch maybe we can find a mentor to assist.
Comment 18 Luís Neves 2014-06-16 15:07:37 UTC
Dear all,

I must apologise but only now I discovered that this problem no longer exists as we only need to specify the "language" on the file import dialog. Using "English (U.S:)" I can import easily txt files generated in software tools that do not use locale and the commas and thousand separators are correctly identified and changed in to the ones defined by my locale. I onnly wonder if this wouldn't be clearer with the label in the dialog being "LOCALE". I always regarded this as representing international character conversions.

Anyway, many thanks for the good work.

Luís Neves