Consider the code > sub DataArray > const maxRow = 900000 > dim data(0 to maxRow) > for i = 0 to maxRow > if (i mod 10 = 0) Then data(i) = array("0") else data(i) = array(CStr(i)) > next i > range = ThisComponent.Sheets(0).getCellRangeByPosition(0,0,0,maxRow) > t = now() > range.setDataArray(data()) > MsgBox Format(now() - t, "[s] "" s""") > end sub Running in a new spreadsheet, setDataArray takes ~7-8 s on my system. But replacing `array("0")` with `array("")` makes it execute longer than ~800 s. So the empty strings in the data array slow down the function manyfold. Tested with Version: 6.3.4.1 (x64) Build ID: a21169d87339dfa44546f33d6d159e89881e9d92 CPU threads: 12; OS: Windows 10.0; UI render: GL; VCL: win; Locale: ru-RU (ru_RU); UI-Language: en-US Calc: threaded
Additional detail: replacing `array("0")` in the code sample with `array(" ")` - i.e., keeping it a non-empty string, but with a space instead of a zero character, makes the setDataArray function complete in ~30 s, which is also unexpected slowdown.
Saving to an ODS and reloading each test result gives these results: 1. Saving and reloading results of original code using `if (i mod 10 = 0) Then data(i) = array("0") else data(i) = array(CStr(i))`, the reload takes ~13 s 2. Saving and reloading results of test in comment 1 using `if (i mod 10 = 0) Then data(i) = array(" ") else data(i) = array(CStr(i))`, the reload takes ~35 s 3. Saving and reloading results of test in comment 0 using `if (i mod 10 = 0) Then data(i) = array("") else data(i) = array(CStr(i))`, the reload takes ~80 s So there is a good correlation between setDataArray performance and reload time for #1 and #2, with ~5 s constant overhead in reloading; but for #3 with empty strings, setDataArray performs about 10 times slower than reloading its resulting data from file.
confirm in Версия: 6.5.0.0.alpha0+ (x64) ID сборки: 89f0af144c18efafe2573801641689a1432c0cae Потоков ЦП: 4; ОС:Windows 10.0 Build 18362; Отрисовка ИП: по умолчанию; VCL: win; Локаль: ru-RU (ru_RU); Язык интерфейса: ru-RU Calc: threaded It takes 19s vs 491s on my machine (when replacing `array("0")` with `array("")`)
Julien, can you make your nice perfgraph here?
(In reply to Roman Kuznetsov from comment #4) > Julien, can you make your nice perfgraph here? No pb, I'll do it when get back home.
i can confirm: LO 3.6.7.2 (Build ID: e183d5b) and AOO 4.1.5: ~ 5 sec for "0" and "" with LO 4.4.7.2 and LO 6.1/6.3 branch: ~11 sec for "0" and ~350 sec for ""
Started between 4.0 and 4.1
Created attachment 156310 [details] perf flamegraph Here's a Flamegraph with maxRow=90000 and array("") (8seconds)
Ha :-) Our data structures are very heavily optimised for continuous runs of data of the same type. In this case, the first macro hits the heavily optimised path and runs very fast. The second macro introduces discontinuities (because every 10th element is a different type), which pushes us off the optimised path. In the first case, we end up with one block of MDDS data, and in the second case, we get a new block of MDDS data every 10th block, which adds up really quickly.
(In reply to Noel Grandin from comment #9) > Ha :-) > > > Our data structures are very heavily optimised for continuous runs of data > of the same type. > > In this case, the first macro hits the heavily optimised path and runs very > fast. > The second macro introduces discontinuities (because every 10th element is a > different type), which pushes us off the optimised path. > > In the first case, we end up with one block of MDDS data, and in the second > case, we get a new block of MDDS data every 10th block, which adds up really > quickly. Noel, do you mean that it's not a bug?
(In reply to Roman Kuznetsov from comment #10) > > Noel, do you mean that it's not a bug? Well in was never a bug in the sense that LO is broken, we are doing the right thing, just not doing it very quickly. It is possible that it can improved, but that would require substantial work, and it will never be as fast as the first macro.
Possibly empty cells could be treated as not having different type compared to their neighbours? Typical use of large tabular data (where the performance problem happens) is to have same type of data in column; and empty value could be thought as a special case of value of any data type. And in case of strings, this is especially true, because while we could argue that empty is not a valid number (but still it would be useful to have special "empty" number for efficiency), empty string is perfectly valid thing.
Dear Mike Kaganski, To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year. There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present. If you have time, please do the following: Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/ If the bug is present, please leave a comment that includes the information from Help - About LibreOffice. If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice. Please DO NOT Update the version field Reply via email (please reply directly on the bug tracker) Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case) If you want to do more to help you can test to see if your issue is a REGRESSION. To do so: 1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/ 2. Test your bug 3. Leave a comment with your results. 4a. If the bug was present with 3.3 - set version to 'inherited from OOo'; 4b. If the bug was not present in 3.3 - add 'regression' to keyword Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa Thank you for helping us make LibreOffice even better for everyone! Warm Regards, QA Team MassPing-UntouchedBug