Load a smallish 230+MB .tsv file (154161x501); I will be creating Jumbo files; but need this to work before wasting time on the bigger files.. Except for Row 1 (6-digit YYMMDD date headers) and column A (ints), all cells are 2-character codes for which I want to apply background colors based on the contents. The conditions work fine on a test sheet of about 25x4 cells. Format->Conditional->Condition gives the dialog window as expected; but... if select the entire sheet first, I get the dislog window with the contents from the sheet, and calc goes 100% CPU. I've let it run for a long time and it never returns; has to be killed. When the full sheet of the 154161x501 range is selected, I never see the normal contents of the "Conditional Formatting for A1:xxxx" window. Also happens in version 7.4.0.3 which is hung as I write this report.
Some testing shows that it takes a while to get the dialog window to present the formatting panel: 5000x500 cells selected == 6 seconds 10000x500 cells selected -- 17 seconds 25000x500 cells selected -- 8 minutes 45 seconds No need for further tests as the processing time is clearly non-linear. Sigh... this kills my plans of using Calc. My largest sheets of this type are around 950,000x500.
I hate giving up... Here's a workaround in case that helps someone... Discovered that while a tad slow, though only a few seconds; I could select one column and apply 8 conditions to that column's 151K+ rows. Then, I edited the range from $B$2:$B$154161 to $B2:$SG$154161 which applied quickly across the entire range. I'll try this same workaround tomorrow on the 99 other sheets which are being created overnight (some expected to be 950Kx501)... Actually, the workaround should be even faster by applying the initial formatting to just a few cells, then edit the range... That said, it becomes obvious Calc loads the file, then applies the formatting before displaying the sheet -- that is quite slow too.
Would it be possible you attach a file with minimum size just to reproduce the pb? (by using this link https://bugs.documentfoundation.org/attachment.cgi?bugid=150712&action=enter) The goal is to retrieve a perf/Flamegraph so devs may pinpoint the bottleneck.
Created attachment 182114 [details] AA.tsv (zip'ed) Sure. The full 100 files got created overnight; they range in size: 5827354 Aug 31 02:56 EE/EE.tsv (3870 rows) 1793058309 Aug 31 02:41 CW/CW.tsv (1189571 rows) The one I started with is: 232277885 Aug 31 01:58 AA/AA.tsv (154161 rows) All files have 501 columns. I'm also including the final .ods file so you can see the result I need to produce on these 100 files; this was created using the previously mentioned workaround. The .ods file: - Loading is reasonable speed. - Adapt row height is slow compared to load. - The file takes right at 3 minutes to load. Generating a .pdf is also very slow; as I write this, it's taking about 7 minutes for 10% on the progress bar. Generating the .pdf as a single page if that works...
I was hoping to create a reasonable .pdf for my team to avoid the long load time. It took about 70 minutes to created the pdf of 132,174 pages with 48 rows by 12 columns. The most important part of this data is to be able to view the rows left to right; scrolling left to right. Scrolling top to bottom is barely useful. So it appears my only way to distribute these results is the .ods file itself. Calc may the wrong tool for these reports; but that's my challenge... There are more codes than the current colors depict; and it's already challenging for the other people to recall what the code means. It's not easy summarizing terabytes of data... Thanks for the 7.5 builds!
Created attachment 182121 [details] Flamegraph Here's a Flamegraph retrieved on pc Debian x86-64 with master sources updated today. Here what I did: - load the tsv file - select the whole sheet => start perf recording - Menu Format/Conditional/Condition => Wait about 30 secs (still not finished) => stop recording
Pierre: I understand that there are slowdowns at different moment but let's focus on the initial description, ie Menu Format/Conditional/Condition after having selected a whole sheet. Noel: any idea if this part may be optimized?
I gave a try with this patch: diff --git a/svx/source/dialog/fntctrl.cxx b/svx/source/dialog/fntctrl.cxx index 58b412f1e822..04af6deae0b6 100644 --- a/svx/source/dialog/fntctrl.cxx +++ b/svx/source/dialog/fntctrl.cxx @@ -124,15 +124,9 @@ void setFont(const SvxFont& rNewFont, SvxFont& rImplFont) */ bool CleanAndCheckEmpty(OUString& rText) { - bool bEmpty = true; - for (sal_Int32 i = 0; i < rText.getLength(); ++i) - { - if (0xa == rText[i] || 0xd == rText[i]) - rText = rText.replaceAt(i, 1, u" "); - else - bEmpty = false; - } - return bEmpty; + rText = rText.replaceAll("\xa", u" "); + rText = rText.replaceAll("\xd", u" "); + return rText.isEmpty(); } } // end anonymous namespace and I didn't wait for 30 secs to have the results
To confirm, this workaround is super fast (to help anyone needing a fast solution) to conditionally format an large|jumbo sheet: - using a previously formatted [sample] sheet: Copy a cell which has desired conditional formatting - On sheet needing format: - locate bottom-right of HUGE range (note cell) - return to top-left of range - select a few top-left cells to be formatted - Edit->Paste->Paste Special... Click Formats Only - Format->Conditional->Manage... - Selected range is highlighted - click Edit... - Range: (at bottom) Change to cover the huge range you need formatted (e;g., B2:SG320976 (yes, that's 160,488,000 cells) - Click OK - Click OK Formatting is applied instantly. A few more steps than the expected manner; but this is FAST w/o 100% CPU.
(In reply to Julien Nabet from comment #8) > I gave a try with this patch: Patch looks good to me
(In reply to Noel Grandin from comment #10) > (In reply to Julien Nabet from comment #8) > > I gave a try with this patch: > > Patch looks good to me Thank you for the feedback, I submitted this patch https://gerrit.libreoffice.org/c/core/+/139152 as you may have seen.
off-topic: just wonder if, more generally, some other locations containing a loop which includes replaceAt could be optimized.
Julien Nabet committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/076be52655f2cf4e331b344e0f6497f41b37fa8c tdf#150712: Format->Conditional->Condition 100% CPU It will be available in 7.5.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Julien Nabet committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/e4d9d522fec5770dd25f99384ae7a012636662cb Revert "tdf#150712: Format->Conditional->Condition 100% CPU" It will be available in 7.5.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
I reverted the previous patch since I had changed the logic
The critical function was created in a fix I had made for tdf#113657 (added in See also). I'll take a look if instead of space we can replace by empty string.
Reading git history, the space was already there, in fact it's there from: c6726f343e8a944fc9f1031ca39520c1527b05e6 INTEGRATION: CWS sw016 (1.14.106); FILE MERGED in 2003
Noel Grandin committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/bb7747f627fe7e909fca7264fd42fe6cb7aef0b5 tdf#150712 use more string_view It will be available in 7.5.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Julien Nabet committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/d8326fc81f7225047178e1b8a7605d32109fa08c tdf#150712: Format->Conditional->Condition 100% CPU (2nd try) It will be available in 7.5.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Noel Grandin committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/b24eceb5178fda17fcc4b20c5286e8d27c5ae474 tdf#150712 limit the dialog to only a sample of the data It will be available in 7.5.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Julien Nabet committed a patch related to this issue. It has been pushed to "libreoffice-7-4": https://git.libreoffice.org/core/commit/ab436b8cd80e100314832ffc2ef54a91d843859f tdf#150712: Format->Conditional->Condition 100% CPU (2nd try) It will be available in 7.4.2. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Noel Grandin committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/b392d0a25e229f305497ab5f9f084b71b24cc387 tdf#150712 reduce cost of SharedString::getString It will be available in 7.5.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
On pc Debian x86-64 with master sources updated today, the dialog appears right away. It's thanks to Noel's patch so unassign myself and put Noel instead. Thank you Noel! I cherry-picked for review on 7.4 branch these: https://gerrit.libreoffice.org/c/core/+/139186 https://gerrit.libreoffice.org/c/core/+/139210 Let's put this one to FIXED now.
Noel Grandin committed a patch related to this issue. It has been pushed to "libreoffice-7-4": https://git.libreoffice.org/core/commit/4cef3204e011b235b807d26ddb3c901690b633ff tdf#150712 limit the dialog to only a sample of the data It will be available in 7.4.2. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.