160983 – RecalcOptimalRowHeightMode should be effective for CSV imports also

Summary: RecalcOptimalRowHeightMode should be effective for CSV imports also

Status:	RESOLVED INSUFFICIENTDATA

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	Calc
Version: (earliest affected)	24.8.0.0 alpha0+
Hardware:	All All

Importance:	medium enhancement
Assignee:	Not Assigned

URL:	https://www.reddit.com/r/libreoffice/...
Whiteboard:
Keywords:

Depends on:
Blocks:	CSV-Dialog

Reported:	2024-05-07 23:06 UTC by Craig Ruff
Modified:	2025-01-04 03:16 UTC
CC List:	5 users

See Also:	123026 124098
Crash report or crash signature:

Attachments
sample CSV with a lot of text in one cell (6.04 KB, text/csv) 2024-05-23 01:12 UTC, Stéphane Guillou (stragu)
Example CSV that exhibits the behavior when imported (727.44 KB, text/csv) 2024-05-23 14:47 UTC, Craig Ruff
Sampled file after import (64.25 KB, application/vnd.oasis.opendocument.spreadsheet) 2024-05-23 16:35 UTC, m_a_riosv
Screenshot import dialog. (71.98 KB, image/png) 2024-05-23 16:36 UTC, m_a_riosv

Description Craig Ruff 2024-05-07 23:06:20 UTC

Description:
Please extend the use of RecalcOptimalRowHeightMode to work for CSV imports.

Steps to Reproduce:
1. Import a CSV file

Actual Results:
Row height adjustment takes a very long time for large CSV files.

Expected Results:
Row height adjustment is user-configurable, just as it is when opening ODF and XLS files.


Reproducible: Always


User Profile Reset: No

Additional Info:
N/A

Comment 1 m_a_riosv 2024-05-08 21:23:45 UTC

How? CSV is imported as unformatted text, which is what it is.

Comment 2 Craig Ruff 2024-05-08 23:28:41 UTC

I'm not sure why you are asking how. When I import some CSV files with wide text columns, or text columns that include newline characters, Calc first imports the CSV file and then spends time "adjusting the row heights", according to the status message at the bottom of the window. Some rows of the CSV files end up being marked as 39.37" high, and scrolling in Calc is brain damaged such that you get large jumps and can't review those rows at all. Since there is no inherent height information in the CSV file, Calc is the only actor performing this behavior. It seems readily analogous to what is done when reading an ODF or XLS file.

Comment 3 m_a_riosv 2024-05-09 20:51:21 UTC

But these adjustments are not driven by formatting, only by text length and line breaks, which LO knows when importing.
There is no formatting in CSV text, such as bold.
The only option is to apply the style you need to the data, by modifying an existing style or creating a new one. You can even modify the default style.

It may be better to create a template just for that. First create a new file from that template, then use
 Menu/Sheet
 - Insert sheet
 - Insert sheet from file
 - External links
For the first two you can set it up as a link, so an update is possible without redoing the import.

Comment 4 Stéphane Guillou (stragu) 2024-05-23 01:11:03 UTC

This has been also requested here: https://www.reddit.com/r/libreoffice/comments/nj3oll/can_i_have_preset_for_row_height_and_column_width/

I couldn't reproduce the issue with ridiculous row heights when importing a CSV. Do you have an example file?

Comment 5 Stéphane Guillou (stragu) 2024-05-23 01:12:06 UTC

Created attachment 194285 [details]
sample CSV with a lot of text in one cell

Not reproduced with this file and:

Version: 24.8.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: 101b08fe1ec77ffe8c1a9b2b8f9f20884269a1ed
CPU threads: 8; OS: Linux 6.5; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: CL threaded

Comment 6 Craig Ruff 2024-05-23 03:40:56 UTC

I'll have to sanitize one and pare it down to reasonable dimensions, since they contain proprietary data.

Comment 7 Craig Ruff 2024-05-23 14:47:45 UTC

Created attachment 194309 [details]
Example CSV that exhibits the behavior when imported

CSV import options:
   From row: 1
   Separated by: comma
   String delimiter: double quote mark
   Format quoted field as text: yes

> scalc --version
LibreOffice 24.2.3.2 433d9c2ded56988e8a90e6b2e771ee4e6a5ab2ba
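
For anyone scripting the same import, the dialog settings listed above can be written as a CSV filter options token string. The following is a minimal sketch only: the helper name `csv_filter_options` is hypothetical, the token order follows the documented CSV filter options layout as I understand it, and the character set (76, UTF-8) is an assumption because the comment does not state one.

```python
# Hypothetical helper: builds a token string in the layout used by Calc's
# CSV filter options. Only the first seven tokens are produced here.

def csv_filter_options(separator=",", delimiter='"', charset=76,
                       first_row=1, quoted_as_text=True):
    tokens = [
        str(ord(separator)),   # field separator as a character code (44 = comma)
        str(ord(delimiter)),   # string delimiter as a character code (34 = ")
        str(charset),          # 76 = UTF-8 (assumed, not stated in the comment)
        str(first_row),        # "From row"
        "",                    # per-column formats left at their defaults
        "",                    # language left at its default
        "true" if quoted_as_text else "false",  # "Format quoted field as text"
    ]
    return ",".join(tokens)


options = csv_filter_options()
print(options)
```

Changing `separator` to `"\t"` would give the TSV equivalent under the same assumptions.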

Comment 8 m_a_riosv 2024-05-23 16:35:37 UTC

Created attachment 194312 [details]
Sampled file after import

No issue for me with:
Version: 24.2.3.2 (X86_64) / LibreOffice Community
Build ID: 433d9c2ded56988e8a90e6b2e771ee4e6a5ab2ba
CPU threads: 16; OS: Windows 10.0 Build 22631; UI render: Skia/Raster; VCL: win
Locale: es-ES (es_ES); UI: en-US
Calc: CL threaded

Comment 9 m_a_riosv 2024-05-23 16:36:09 UTC

Created attachment 194313 [details]
Screenshot import dialog.

Comment 10 Stéphane Guillou (stragu) 2024-05-24 04:48:04 UTC

I see the "calculating" message and the high-height cells (e.g. at row 54) when importing the sample file with only comma as a delimiter.

Version: 24.2.3.2 (X86_64) / LibreOffice Community
Build ID: 433d9c2ded56988e8a90e6b2e771ee4e6a5ab2ba
CPU threads: 8; OS: Linux 6.5; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: CL threaded

I can see how it can be frustrating for users wanting to edit simple formats like CSV/TSV, expecting LO to handle it as simply as possible.

Ideally, you'd expect a simple on/off setting in the import dialog, along the lines of "Adapt row height to fit contents"? Or something different?

Justin and Balazs, you recently worked on RecalcOptimalRowHeightMode, what do you think? 

Copying the UX/Design team in for opinion.

Comment 11 Heiko Tietze 2024-05-24 08:43:19 UTC

(In reply to Stéphane Guillou (stragu) from comment #10)
> I see the "calculating" message and the high-height cells (e.g. at row 54)
> when importing the sample file with only comma as a delimiter.
I don't, although it takes some milliseconds to process. Line 1197, aka log data point #2097, is more extreme, and I struggle with the use case of putting a novel into a single cell. The example encloses a long string in quotation marks, and RecalcOptimalRowHeightMode does its job as expected.

> I can see how it can be frustrating for users wanting to edit simple formats
> like CSV/TSV, expecting LO to handle it as simply as possible.
Sure, if CSV is kept simple. But the example is some kind of log file with an (X)ML format.

Comment 12 ady 2024-05-24 10:44:17 UTC

(In reply to Stéphane Guillou (stragu) from comment #10)
> (e.g. at row 54)

Cell B54 includes line-breaking characters. They should not be modified by the import process.

Either neither rows nor columns get any modification (they would not adapt at all, leaving both axes at default values only, no matter the contents), or they get some kind of adaptation to contents.

Some users won't like the first alternative, whereas others won't like the second possibility (and both groups will complain). I am not sure that simply changing the behavior would be considered as an improvement. I would agree to let users modify the respective sizes manually after importing, but other users won’t like that, especially newbies.

Providing choice (about one behavior or the other) for users might be seen as improvement, but that means adding some checkbox somewhere.
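
To make the B54 case concrete, here is a small sketch of how a CSV field can carry line breaks inside quotes and keep them through a round trip. The data is illustrative only and is not taken from the attached file; it uses Python's standard `csv` module rather than anything Calc-specific.

```python
import csv
import io

# Illustrative data only: row "54" holds a cell with embedded line breaks.
rows = [
    ["id", "message"],
    ["53", "single line"],
    ["54", "first line\nsecond line\nthird line"],
]

# Write the rows to an in-memory CSV; the writer quotes the multi-line field.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
text = buffer.getvalue()

# Read it back: the quoted line breaks are preserved as part of the cell.
parsed = list(csv.reader(io.StringIO(text)))
line_counts = [len(row[1].splitlines()) for row in parsed[1:]]
print(line_counts)
```

A spreadsheet importing such a file has to decide whether the second data row is shown at one line of height or at three, which is the choice discussed above.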

Comment 13 ady 2024-05-24 10:47:25 UTC

(In reply to Heiko Tietze from comment #11)

> I don't, although it takes some milliseconds to process.

Same here, but maybe the attachment is just a simplified case(?).

Comment 14 Justin L 2024-05-24 12:28:56 UTC

You definitely want to have row height automatically calculated for a CSV import, since CSV cannot possibly specify a height for any row. Otherwise any multiline data (like row 54) would be "hidden" - looking like single line data (like row 53).

CSV is always a transitional format. You load it once, then format it according to your liking and then export it into an appropriate format.

Other than avoiding a performance penalty, I can't imagine any reason not to want to have the row height be calculated at load time. The proper response to being irritated about the speed is to optimize row height calculation.

[That said, I did say in my XLSX commit "I can't think of a reason why this shouldn't apply to all formats". As a code pointer, see https://gerrit.libreoffice.org/c/core/+/164721 ]

Comment 15 Justin L 2024-05-24 15:42:14 UTC

(In reply to ady from comment #12) 
> Some users won't like the first alternative, whereas others won't like the
> second possibility (and both groups will complain). I am not sure that
> simply changing the behavior would be considered as an improvement.
OP is not asking for a change in the default behaviour.

RecalcOptimalRowHeightMode is an advanced configuration option that affects whether rows that do not mandate a specific height should have that height calculated at import time.
  -yes (default - no change in behaviour)
  -no (useful for ODS which I believe knows the last-used-height at import)
  -ask
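
The three modes above can be sketched as a small decision function. This is an illustrative model, not LibreOffice code: the names `RowHeightRecalcMode` and `should_recalc_row_height` are hypothetical, the numeric values are an assumption, and the per-row `ask_user` callback is a simplification of whatever prompt the real option shows.

```python
from enum import Enum


class RowHeightRecalcMode(Enum):
    # Numeric values are assumed for illustration only.
    ALWAYS = 0  # "yes" (default, no change in behaviour)
    NEVER = 1   # "no"
    ASK = 2     # "ask"


def should_recalc_row_height(mode, row_has_fixed_height, ask_user=lambda: True):
    """Return True if a row's optimal height should be computed at import."""
    if row_has_fixed_height:
        # Rows that mandate a specific height are left alone in every mode.
        return False
    if mode is RowHeightRecalcMode.ALWAYS:
        return True
    if mode is RowHeightRecalcMode.NEVER:
        return False
    # ASK: defer to the user's answer.
    return bool(ask_user())
```

Under this model a CSV import never has a row with a fixed height, so the mode alone would decide the outcome.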

P.S. I just checked ScDocShell::ConvertTo SC_TEXT_CSV_FILTER_NAME and it does NOT set bSetRowHeights to true. So doing what OP asks would not be as trivial as I thought it might be.

Still a WONTFIX from me. Only useful for formats that already know an appropriate height.

Comment 16 Eyal Rozenberg 2024-06-05 18:29:22 UTC

(In reply to Justin L from comment #14)
> You definitely want to have row height automatically calculated for a CSV
> import, since CSV cannot possibly specify a height for any row.

The conclusion does not follow from the premise. That is, when importing a CSV, I often want all rows to have the same height, regardless of the contents, even if some cells have multiple line breaks. It's often more convenient to browse the contents that way (especially when the columns you're interested in are not the multiline ones, and not even just then).

This is similar to how developers read commit logs and look at only the first line of the commit comment even if it's a multi-line comment.

Comment 17 Heiko Tietze 2024-06-06 06:54:12 UTC

We discussed the topic in the design meeting.

Apparently the newly introduced option RecalcOptimalRowHeightMode provides everything. "Apparently" because whether it was set to 0, 1, or 2, the imported example CSV was always recalculated (Windows, build from master). Balazs, can you please test this?

Missing documentation is tracked in bug 160179.

If the "ask" option was not introduced to cover this use case, opinions differ on whether an option in the UI is needed. Some believe it is a not-so-rare use case; others believe it would have a negative impact on usability for most users.

Comment 18 QA Administrators 2024-12-04 03:12:06 UTC

Dear Craig Ruff,

This bug has been in NEEDINFO status with no change for at least
6 months. Please provide the requested information as soon as
possible and mark the bug as UNCONFIRMED. Due to regular bug
tracker maintenance, if the bug is still in NEEDINFO status with
no change in 30 days the QA team will close the bug as INSUFFICIENTDATA
due to lack of needed information.

For more information about our NEEDINFO policy please read the
wiki located here:
https://wiki.documentfoundation.org/QA/Bugzilla/Fields/Status/NEEDINFO

If you have already provided the requested information, please
mark the bug as UNCONFIRMED so that the QA team knows that the
bug is ready to be confirmed.
 
Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-NeedInfo-Ping

Comment 19 QA Administrators 2025-01-04 03:16:42 UTC

Dear Craig Ruff,

Please read this message in its entirety before proceeding.

Your bug report is being closed as INSUFFICIENTDATA due to inactivity and
a lack of information which is needed in order to accurately
reproduce and confirm the problem. We encourage you to retest
your bug against the latest release. If the issue is still
present in the latest stable release, we need the following
information (please ignore any that you've already provided):

a) Provide details of your system including your operating
   system and the latest version of LibreOffice that you have
   confirmed the bug to be present

b) Provide easy to reproduce steps – the simpler the better

c) Provide any test case(s) which will help us confirm the problem

d) Provide screenshots of the problem if you think it might help

e) Read all comments and provide any requested information

Once all of this is done, please set the bug back to UNCONFIRMED
and we will attempt to reproduce the issue. Please do not:

a) respond via email 

b) update the version field in the bug or any of the other details
   on the top section of our bug tracker

Warm Regards,
QA Team

MassPing-NeedInfo-FollowUp