Created attachment 203076
Text Import dialog LO Calc future release

LO Calc 25.8.1
Concern: Text Import dialog

Hi, this is a general feature enhancement that could cut down on other bug reports.

Challenge: convert text to CSV

It is common to copy tables from a text source (HTML, PDF, DOCX, etc.), and the Text Import dialog has no good way to handle them.

1/ TEXT TO COPY
For example, copy the following and paste it with CTRL+SHIFT+V to open the Text Import dialog:

NVIDIA 4,249.99 5.44 Info Tech
MICROSOFT CORP 3,577.70 4.58 Info Tech
APPLE 3,467.20 4.44 Info Tech
AMAZON.COM 2,188.03 2.80 Cons Discr
META PLATFORMS A 1,603.83 2.05 Comm Srvcs
BROADCOM 1,328.83 1.70 Info Tech
ALPHABET A 1,239.14 1.59 Comm Srvcs
ALPHABET C 1,049.09 1.34 Comm Srvcs
TESLA 967.84 1.24 Cons Discr
JPMORGAN CHASE & CO 837.67 1.07 Financials

2/ With detect you get something like this, which is not what you want:

NVIDIA;4,249.99;5.44;Info;Tech;;
MICROSOFT;CORP;3,577.70;4.58;Info;Tech;
APPLE;3,467.20;4.44;Info;Tech;;
AMAZON.COM;2,188.03;2.80;Cons;Discr;;
META;PLATFORMS;A;1,603.83;2.05;Comm;Srvcs
BROADCOM;1,328.83;1.70;Info;Tech;;
ALPHABET;A;1,239.14;1.59;Comm;Srvcs;
ALPHABET;C;1,049.09;1.34;Comm;Srvcs;
TESLA;967.84;1.24;Cons;Discr;;
JPMORGAN;CHASE;&;CO;837.67;1.07;Financials

3/ Your aim (which you cannot currently achieve, because numerical values are not treated as separators):

NVIDIA;4249.99;5.44;Info Tech
MICROSOFT CORP;3577.70;4.58;Info Tech
APPLE;3467.20;4.44;Info Tech
AMAZON.COM;2188.03;2.80;Cons Discr
META PLATFORMS A;1603.83;2.05;Comm Srvcs
BROADCOM;1328.83;1.70;Info Tech
ALPHABET A;1239.14;1.59;Comm Srvcs
ALPHABET C;1049.09;1.34;Comm Srvcs
TESLA;967.84;1.24;Cons Discr
JPMORGAN CHASE & CO;837.67;1.07;Financials

4/ SOLUTION
LO Calc already has what is needed, but it needs to be brought into the Text Import dialog.

Attached screenshots are:
1/ the Text Import dialog as in the current 25.8.1 release,
2/ a rough mock-up of a modified Text Import dialog proposed for a future release.
In this mock dialog, I propose to add, as a separator, new regExp fields with a pull-down list of choices such as numeric, dates, etc. This is similar to what already exists in Calc when you change the format of cells (CTRL+1) and choose the "Numbers" tab, where you can pick, for instance, dates as YYYY-MM-DD (ISO 8601) or whatever.

Additionally, I propose to indicate the consistency of the output by adding a first field with the number of columns LO Calc detects for each row. So in the case of:

NVIDIA;4,249.99;5.44;Info;Tech;;
MICROSOFT;CORP;3,577.70;4.58;Info;Tech;
APPLE;3,467.20;4.44;Info;Tech;;
AMAZON.COM;2,188.03;2.80;Cons;Discr;;
META;PLATFORMS;A;1,603.83;2.05;Comm;Srvcs
BROADCOM;1,328.83;1.70;Info;Tech;;
ALPHABET;A;1,239.14;1.59;Comm;Srvcs;
ALPHABET;C;1,049.09;1.34;Comm;Srvcs;
TESLA;967.84;1.24;Cons;Discr;;
JPMORGAN;CHASE;&;CO;837.67;1.07;Financials

the output would be:

4;NVIDIA;4249,99;5,44;Info;Tech;;
5;MICROSOFT;CORP;3577,7;4,58;Info;Tech;
4;APPLE;3467,2;4,44;Info;Tech;;
4;AMAZON.COM;2188,03;2,8;Cons;Discr;;
6;META;PLATFORMS;A;1603,83;2,05;Comm;Srvcs
4;BROADCOM;1328,83;1,7;Info;Tech;;
5;ALPHABET;A;1239,14;1,59;Comm;Srvcs;
5;ALPHABET;C;1049,09;1,34;Comm;Srvcs;
4;TESLA;967,84;1,24;Cons;Discr;;
7;JPMORGAN;CHASE;&;CO;837,67;1,07;Financials

Thank you
Created attachment 203077
Text Import dialog LO Calc 258
Created attachment 203078
Text Import example: the table you wish to copy and paste into Calc
Oops, I made a mistake in the output that adds the number of columns in each row as a first field. Here is the corrected version:

5;NVIDIA;4249,99;5,44;Info;Tech;;
6;MICROSOFT;CORP;3577,7;4,58;Info;Tech;
5;APPLE;3467,2;4,44;Info;Tech;;
5;AMAZON.COM;2188,03;2,8;Cons;Discr;;
7;META;PLATFORMS;A;1603,83;2,05;Comm;Srvcs
5;BROADCOM;1328,83;1,7;Info;Tech;;
6;ALPHABET;A;1239,14;1,59;Comm;Srvcs;
6;ALPHABET;C;1049,09;1,34;Comm;Srvcs;
5;TESLA;967,84;1,24;Cons;Discr;;
7;JPMORGAN;CHASE;&;CO;837,67;1,07;Financials
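For reference, the per-row count can be reproduced outside Calc. This is only a minimal sketch of the idea, not anything the Text Import dialog does today: the function name prefix_field_counts is hypothetical, and it assumes the count means the number of non-empty fields in each already-split row.

```python
def prefix_field_counts(lines, sep=";"):
    """Prefix each row with its number of non-empty fields.

    Hypothetical helper for illustration only; 'non-empty fields'
    is an assumed reading of 'number of cols detected per row'.
    """
    out = []
    for line in lines:
        fields = line.split(sep)
        # Trailing separators produce empty strings; do not count them.
        non_empty = sum(1 for field in fields if field != "")
        out.append(f"{non_empty}{sep}{line}")
    return out


if __name__ == "__main__":
    rows = [
        "NVIDIA;4,249.99;5.44;Info;Tech;;",
        "MICROSOFT;CORP;3,577.70;4.58;Info;Tech;",
        "META;PLATFORMS;A;1,603.83;2.05;Comm;Srvcs",
    ]
    for row in prefix_field_counts(rows):
        print(row)
```

Rows whose leading count differs from the others (5, 6, 7 here) are the ones where the chosen separator split a multi-word name or sector into several columns.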
Doing a \d digit match at word bounds \b (with or without decimal and thousands separators) as a column separator might be feasible, but IMHO dates are probably too complicated in general to parse into columns, although simple ISO 8601 formats might also be feasible.

But isn't this rather a niche user request? sed and awk, or perl, or python, provide the means to stream-convert into a meaningful CSV format, external to LibreOffice, for import into Calc or a Writer table. And most sources provide export choices to delimit both fields and text strings, suitable for ingestion without additional formatting.

Other than convenience, there is no real justification to shoehorn this into the Text Import dialog as a means to describe complex field delimiter(s).

IMHO interesting if a dev has interest in taking on the refactoring of our Text Import dialog to provide a delimiter for numbers between word bounds, and maybe also ISO dates. Otherwise => WF
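To illustrate the external route, here is a rough Python sketch that treats standalone numbers as field boundaries. It is an assumption-laden example, not a proposal for how LibreOffice should implement it: the names NUMBER and split_on_numbers are hypothetical, the pattern assumes a comma as thousands separator and a period as decimal separator, and it uses whitespace lookarounds rather than \b so that tokens such as AMAZON.COM are left alone.

```python
import re

# A standalone number: digits, optional ",ddd" thousands groups, optional
# ".dd" decimals, with whitespace (or line start/end) on both sides.
# Assumption: English-style separators, as in the sample data.
NUMBER = re.compile(r"(?<!\S)\d+(?:,\d{3})*(?:\.\d+)?(?!\S)")


def split_on_numbers(line):
    """Split a line into fields, using each standalone number as a boundary."""
    fields = []
    pos = 0
    for match in NUMBER.finditer(line):
        text = line[pos:match.start()].strip()
        if text:
            fields.append(text)  # text run before the number
        fields.append(match.group().replace(",", ""))  # drop thousands separators
        pos = match.end()
    tail = line[pos:].strip()
    if tail:
        fields.append(tail)  # text run after the last number
    return fields


if __name__ == "__main__":
    sample = [
        "NVIDIA 4,249.99 5.44 Info Tech",
        "META PLATFORMS A 1,603.83 2.05 Comm Srvcs",
        "JPMORGAN CHASE & CO 837.67 1.07 Financials",
    ]
    for line in sample:
        print(";".join(split_on_numbers(line)))
```

On the sample rows this yields four fields per row, for example NVIDIA;4249.99;5.44;Info Tech. A name that contains a standalone number would still be split at that number, so the approach is only as reliable as the data is regular.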
Quickly looking at some bugs from metabug 109239, I think:

https://bugs.documentfoundation.org/show_bug.cgi?id=103597
"I would rephrase the idea to allow regular expressions as delimiters. Doing so adds a lot of flexibility while the approach is "well known"."
The solution proposed here appears to solve this as well.

https://bugs.documentfoundation.org/show_bug.cgi?id=122422
Add regular expression filter option to Text Import window
The solution proposed here appears to solve this as well.

https://bugs.documentfoundation.org/show_bug.cgi?id=114199
The solution proposed here appears to solve this as well.

I believe many other bugs from metabug 109239 could be crossed off too. Wouldn't you like to make the work of developers and the community much easier?