Bug 97080 - FILEOPEN: Need broader filter for delimited text files ("Text CSV" only includes *.csv)
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 5.0.3.2 release (earliest affected)
Hardware: All All
Importance: lowest enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: difficultyMedium, easyHack, skillCpp
Depends on:
Blocks: CSV
Reported: 2016-01-12 18:54 UTC by Mike Ruskai
Modified: 2021-03-27 01:34 UTC

Description Mike Ruskai 2016-01-12 18:54:18 UTC
It's a bit of a PITA to open a delimited text file with Calc, as the "Text" filter always loads the document in Writer, and the "Text CSV" filter only lists *.csv files.

The filter should be called something more appropriate like "Delimited Text", and include *.txt as well as *.csv files (and perhaps more).
Comment 1 David Tardon 2016-01-20 12:00:43 UTC
(In reply to Mike Ruskai from comment #0)
> It's a bit of a PITA to open a delimited text file with Calc, as the "Text"
> filter always loads the document in Writer

That is not true. If the file is opened from an existing Calc window, it will be opened in Calc too.

> and the "Text CSV" filter only
> lists *.csv files.

That is because this filter is import/export, and 'csv' is the extension for export. I don't know if there can be more than one.

This filter uses 'Text' type, which is also used by Writer. So the detection code contains a few simple heuristics that try to guess if the file should be opened in Writer or Calc. I think a possible course of action is to define a separate type for it, say 'Delimited Text'.

Possible workarounds: rename the file before opening it; open the file from an existing Calc window.
Comment 2 David Tardon 2016-01-20 12:07:15 UTC
What is needed here:

1. Create a new filter type delimited_Text as a copy of filter/source/config/fragments/types/generic_Text.xcu.
2. Change filter/source/config/fragments/filters/Text___txt___csv__StarCalc_.xcu to use it.
3. Modify PlainTextFilterDetect::detect in filter/source/textfilterdetect/filterdetect.cxx to handle delimited_Text in addition to generic_Text, and to always set CALC_TEXT_FILTER in that case.
4. Check that the various use cases work as expected: e.g., open a .txt/.csv file from an existing Calc/Writer window while specifying a type, then without one, then directly from the Start Center, etc.
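
To make the change to the detection code concrete, here is a minimal C++ sketch of the decision it describes. It is not the actual PlainTextFilterDetect::detect code: the function chooseTextFilter, its heuristicSaysCalc parameter, the WRITER_TEXT_FILTER name, and the string values given to both constants are assumptions made for illustration. Only the names delimited_Text, generic_Text and CALC_TEXT_FILTER come from the comment above.

```cpp
#include <string>

// Stand-in values (assumed); the real constants are defined in
// filterdetect.cxx and may differ.
const std::string WRITER_TEXT_FILTER = "Text";
const std::string CALC_TEXT_FILTER = "Text - txt - csv (StarCalc)";

// Hypothetical helper: map a detected type name to a filter name.
// heuristicSaysCalc stands in for whatever the existing Writer-vs-Calc
// heuristics decide for a generic text file.
std::string chooseTextFilter(const std::string& typeName, bool heuristicSaysCalc) {
    if (typeName == "delimited_Text") {
        // Proposed behaviour: the dedicated type always goes to Calc.
        return CALC_TEXT_FILTER;
    }
    if (typeName == "generic_Text") {
        // Unchanged behaviour: let the heuristics pick the application.
        return heuristicSaysCalc ? CALC_TEXT_FILTER : WRITER_TEXT_FILTER;
    }
    // Not a text type handled by this sketch.
    return std::string();
}
```

With this shape, chooseTextFilter("delimited_Text", false) yields the Calc filter even when the heuristics would have picked Writer, which is the behaviour the comment asks for.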
Comment 3 Mike Ruskai 2016-01-20 14:11:47 UTC
(In reply to David Tardon from comment #1)
> (In reply to Mike Ruskai from comment #0)
> > It's a bit of a PITA to open a delimited text file with Calc, as the "Text"
> > filter always loads the document in Writer
> 
> That is not true. If the file is opened from an existing Calc window, it
> will be opened in Calc too.

I take it you're aware by now that this claim is not true.  Opening a delimited text file from Calc using the "Text" filter sends the file to Writer.

> > and the "Text CSV" filter only
> > lists *.csv files.
> 
> That is because this filter is import/export, and 'csv' is the extension for
> export. I don't know if there can be more than one.

Normal UI design uses a filter with all eligible extensions for opening a file, with a specific extension for saving a file.  The latter can either be through separate filters, or a filter with multiple extensions selected via a dropdown on the file dialog.

> This filter uses 'Text' type, which is also used by Writer. So the detection
> code contains a few simple heuristics that try to guess if the file should
> be opened in Writer or Calc. I think a possible course of action is to
> define a separate type for it, say 'Delimited Text'.

Such a separate filter is what I think makes the most sense.  It's really what "Text CSV" already is, save without the file extension limitation.  That filter is currently the only way to save a delimited text file of any kind.
Comment 4 Kohei Yoshida 2016-01-20 14:58:14 UTC
I'm the original author of the part of the detection code that handles generic text file type.

We need to be careful not to confuse format "types" and "filters".  A type only refers to a format type such as HTML, plain text etc, and is not tied to a specific application such as Writer or Calc.  A filter OTOH is tied to a single application.  We have one plain text filter for Writer and one plain text filter for Calc.  The one for Writer treats all plain text files equally, whereas the one for Calc treats all plain text files as CSV, regardless of extension.  OTOH, we only have one generic text type, and it is application agnostic.  During the detection phase, when a plain text "type" is detected, the detection code does its best to tie it to a specific filter, which in turn determines which application the file will be opened in.

Part of the confusion comes from the fact that we list all available "filters" as file "types" in the UI, which IMO is not correct.  As a result, if you specify the "Text" type in the UI, which really is the text filter for Writer, the file will be opened in Writer even if you do this from a Calc window, because what that "Text" type represents is the plain text "filter" for Writer.

Perhaps the right solution would be to change the UI label from "file types" to "file filters"?  Either way, we need to make the correct distinction between file types and file filters, and understand the differences between the two, before trying to come up with the right solution for this.
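
The type/filter distinction can be modelled in a few lines of C++. This is only an illustrative sketch: the FilterEntry struct, the table contents and the two helper functions are hypothetical and are not LibreOffice's configuration API. It shows one application-agnostic type (generic_Text) shared by two filters, each tied to a single application.

```cpp
#include <string>
#include <vector>

// Hypothetical model: a filter names the type it handles and the one
// application it belongs to. A type has no application of its own.
struct FilterEntry {
    std::string uiName;       // label shown in the file dialog
    std::string type;         // application-agnostic format type
    std::string application;  // the single application the filter is tied to
};

// Illustrative table only; the real registry is far larger.
const std::vector<FilterEntry>& filterTable() {
    static const std::vector<FilterEntry> table = {
        {"Text", "generic_Text", "Writer"},
        {"Text CSV", "generic_Text", "Calc"},
    };
    return table;
}

// Choosing a filter in the dialog fixes the application, regardless of
// which window the dialog was opened from.
std::string applicationForFilter(const std::string& uiName) {
    for (const FilterEntry& f : filterTable()) {
        if (f.uiName == uiName) return f.application;
    }
    return std::string();  // unknown filter
}

// A detected type alone is ambiguous: several filters may claim it.
std::vector<std::string> filtersForType(const std::string& type) {
    std::vector<std::string> names;
    for (const FilterEntry& f : filterTable()) {
        if (f.type == type) names.push_back(f.uiName);
    }
    return names;
}
```

In this model, applicationForFilter("Text") is always Writer, while filtersForType("generic_Text") returns both entries, which is why detection has to fall back on heuristics when only the type is known.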

Just my 2 cents.
Comment 5 Kohei Yoshida 2016-01-20 15:04:31 UTC
And I swear this UI label used to be "file filters" not "file types".... though my memory is a bit hazy.
Comment 6 jani 2016-01-20 15:07:53 UTC
Do not trust your memory, trust the version control system :-)
Comment 7 Kohei Yoshida 2016-01-20 15:12:04 UTC
(In reply to jan iversen from comment #6)
> Do not trust your memory, trust the version control system :-)

Absolutely right.  Would you mind looking into that? :-)  I might do it later when I find time but I'm a bit short on time at the moment.
Comment 9 Mike Kaganski 2017-01-20 21:51:19 UTC
One problem here is that when I enter a file mask into the filename field, I expect the selected import filter to remain the same, while the file list should now follow the entered mask. (E.g., if the selected import filter is CSV, it shows only CSV files, which is OK; now I enter *.txt into the filename field and press Enter, and I want to see a list of TXT files, but the import filter should stay CSV.)

What actually happens is that the import filter changes to Text files (txt). So the only way to actually open the csv-in-txt file is to type the file name by hand (or select it while the TXT filter is active, so that it's in the filename field, and then change the filter).

The above is on Windows.
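
The behaviour requested in this comment can be sketched as dialog state in which the import filter and the file mask are independent. This is a toy C++ model, not the actual file picker code: OpenDialogState, enterMask, visibleFiles and matchesMask are all hypothetical names, and the matcher only understands '*' and '?'.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Minimal wildcard match: '*' matches any run of characters, '?' one character.
bool matchesMask(const std::string& mask, const std::string& name) {
    std::size_t m = 0, n = 0, star = std::string::npos, mark = 0;
    while (n < name.size()) {
        if (m < mask.size() && mask[m] == '*') {
            star = m++;   // remember the star and try matching zero characters
            mark = n;
        } else if (m < mask.size() && (mask[m] == '?' || mask[m] == name[n])) {
            ++m;
            ++n;
        } else if (star != std::string::npos) {
            m = star + 1; // backtrack: let the last star absorb one more character
            n = ++mark;
        } else {
            return false;
        }
    }
    while (m < mask.size() && mask[m] == '*') ++m;
    return m == mask.size();
}

// Hypothetical dialog state: the filter decides how a file is imported,
// the mask decides which files are listed.
struct OpenDialogState {
    std::string importFilter;
    std::string mask;
};

// Typing a mask changes the listing only; importFilter is deliberately untouched.
void enterMask(OpenDialogState& state, const std::string& typedMask) {
    state.mask = typedMask;
}

std::vector<std::string> visibleFiles(const OpenDialogState& state,
                                      const std::vector<std::string>& directory) {
    std::vector<std::string> shown;
    for (const std::string& file : directory) {
        if (matchesMask(state.mask, file)) shown.push_back(file);
    }
    return shown;
}
```

Starting from a state with the CSV filter and the mask *.csv, calling enterMask(state, "*.txt") leaves the filter alone and makes visibleFiles list only the .txt entries, so a csv-in-txt file can be picked from the list and still be imported as CSV.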
Comment 13 tvallois 2017-07-17 08:52:16 UTC
Yes, I'm still working on the issue. I haven't had much time over the last 4 months (birth of my daughter :))
Comment 18 Mark Jeronimus 2017-11-22 15:20:21 UTC
This should not only include .csv and .txt! A practically unlimited number of file extensions can hold CSV or TSV data. To name a few that I have worked with myself: *.tsv, *.log, *.cal, *.lmp, *.sts, *.ies. Some just happen to contain delimited data; others are purpose-designed for a specific industry (e.g. .ies files are goniometric radiant flux measurements in CSV format).
Comment 35 Xisco Faulí 2019-06-10 14:59:52 UTC
Dear tvallois,
This bug has been in ASSIGNED status for more than 3 months without any
activity. Resetting it to NEW.
Please assign it back to yourself if you're still working on this.
Comment 36 Matt K 2021-03-24 00:36:39 UTC
Using version 7.1.1.2, there is a "Text documents" filter that seems to open the import dialog in Calc for .txt documents.  Also, other file formats can be filtered in the file name field (e.g. *.tsv) using the "All files" filter and still open with the import dialog in Calc.  Maybe just the default behavior for the "Text" filter should be changed to match the "Text documents" filter?

One thing I noticed is that the selected filter is ignored if the file to be opened is already open -- not sure if this is expected.
Comment 37 Mike Kaganski 2021-03-24 08:10:54 UTC
(In reply to Matt K from comment #36)

There are several possible approaches here. And the most user-friendly one seems to be to simply /add/ *.* (see comment 18) or some specific extensions (txt in comment 0; tsv in your comment 36) to the text filter extension list, so that the dialogs show all files. Another one (see comment 9) would require users to know that they may enter new mask - not a common knowledge. The one from comment 2 IMO would be a bit of overkill...
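
A rough C++ sketch of the first approach, assuming the filter carries a plain list of extensions and that a "*" entry stands for *.*. The helpers extensionOf and listedBy are hypothetical and only model which files the dialog would show; they say nothing about how the real filter configuration is stored.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lower-cased extension without the dot; empty if the name has none.
std::string extensionOf(const std::string& fileName) {
    const std::size_t dot = fileName.find_last_of('.');
    if (dot == std::string::npos || dot + 1 == fileName.size()) return std::string();
    std::string ext = fileName.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// True if a filter with this extension list would show the file.
// A "*" entry plays the role of *.* and lists every file.
bool listedBy(const std::vector<std::string>& extensions, const std::string& fileName) {
    if (std::find(extensions.begin(), extensions.end(), std::string("*")) != extensions.end())
        return true;
    const std::string ext = extensionOf(fileName);
    return !ext.empty() &&
           std::find(extensions.begin(), extensions.end(), ext) != extensions.end();
}
```

With the list {"csv"}, a file named data.txt is hidden; with {"csv", "txt", "tsv"} it is shown; with {"*"} even the industry-specific extensions from comment 18 appear, at the cost of listing files that are not delimited text at all.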
Comment 38 Matt K 2021-03-27 01:34:16 UTC
(In reply to Mike Kaganski from comment #37)
> (In reply to Matt K from comment #36)
> 
> There are several possible approaches here. And the most user-friendly one
> seems to be to simply /add/ *.* (see comment 18) or some specific extensions
> (txt in comment 0; tsv in your comment 36) to the text filter extension
> list, so that the dialogs show all files. Another one (see comment 9) would
> require users to know that they may enter new mask - not a common knowledge.
> The one from comment 2 IMO would be a bit of overkill...

Do you mean adding *.* to every text filter list (there are many, such as "Text documents" and "Text (StarWriter/Web)")?  Even then, I think the issues raised in comment 3 would remain. Should we also address them? (1) Opening a file from Calc with the "Text (*.txt)" filter sends the file to Writer. (2) A delimited text file can only be saved with the .csv extension, not with an arbitrary one.