Bug 77439 - Changes for #77300 & #75741 result in very slow changes to range names
Summary: Changes for #77300 & #75741 result in very slow changes to range names
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: x86-64 (AMD64) All
: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: perf
Depends on:
Blocks:
 
Reported: 2014-04-14 15:56 UTC by tim
Modified: 2015-12-15 11:05 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description tim 2014-04-14 15:56:51 UTC
Following on from fixes for #77300 & #75741, making changes to range names (insert, manage...) now works in the 4.2.4.0 daily builds from 12/04/14 onwards.

However, on a full sized complex spreadsheet, such changes now take minutes (about 2 minutes 10 seconds on one test) on a fairly quick machine (8 * 2.5GHz) running ubuntu 13.10.  I thought it had crashed/locked-up, but it did complete eventually.  One CPU was 100% for the whole time.

I strongly suggest an item in the release notes indicating that some may find such changes very much slower than before in complex sheets.

I can't supply a demo for this because on a small sheet without my data the performance is OK.  It becomes significant on a complex 90+ sheet with all my data, which is too private to supply.
Comment 1 Kohei Yoshida 2014-04-14 16:18:40 UTC
If at all possible, having a test document for this one would be great as well.
Comment 2 tim 2014-04-14 16:35:06 UTC
I currently can't see how I can do that with what I have, given that it's very personal financial data, so please don't expect one.

I think a comment has already been made on one of the two previous reports that the changed method of updating range names would be slow.
Comment 3 tim 2014-04-14 20:29:10 UTC
If I were to try and create a demonstration from scratch, can you give me some hints as to what could be taking so much time compared to the old method (that appeared to work in 4.1) and the method that failed to work in some versions of 4.2?

Is it just the number of cells with data in them, or cells with formula, or what?  Any ideas?
Comment 4 m_a_riosv 2014-04-15 23:02:40 UTC
Perhaps can be the cells/formulas with modified range names.

One thing you can try with a copy of the file, if it's the same issue changing any name or the issue is specially visible with a particular range name. Or deleting sheet by sheet those that haven't dependencies and trying.
Comment 5 tim 2014-04-16 15:57:37 UTC
As I reduce the number of charts and sheets, mostly unrelated to the range name I am changing, the time taken to make the change reduces.

Before this fix, and in earlier versions of Calc which didn't have the fault that has now been fixed, there was no significant delay when changing the range of a range name.  

I'm sorry I can't (yet) provide a demo, due to the very private nature of the data in my real sheets.  I have been trying, but without success so far.  

It ought to be possible to look at the change that has been made and get some sort of idea about why it might take a really noticeably long time with a complex set of sheets with many inter-relationships between the cells and with many range names.
Comment 6 tim 2014-04-16 19:42:46 UTC
Three things that the problem doesn't appear to be related to:

1) I turned off auto-recalculate.  It's still very slow (I had hopes of this, but to no avail)
2) I deleted all other range names (causing many cell errors).  It's still very slow.
3) I deleted around 30 sheets containing graphs of data from other sheets.  It's very slightly quicker (maybe 2 minutes instead of 2 minutes and several tens of seconds).

In all cases one cpu was 100% busy the whole time.

So as far as I can tell so far the problem is related to the total number of cells and sheets with data in them.
Comment 7 tim 2014-04-22 13:24:19 UTC
Kohei, can I ask (beg) that you look at the changes you made and try to work out why they would cause changes to range names take very, very much longer than they used to.

I have spent many, many hours trying to reproduce the problem on a new sheet without success, mainly because I have no idea at all what has changed in LO so I don't know what I need to recreate.

Prior to the bug being introduced in LO 4.2.x, modifications to range names were almost instant.  Now the bug is fixed they take over 2 minutes.  What has changed to make this happen?  It must be possible to at least make some educated guesses (as opposed to my ignorant ones).

I have tried to de-personalise my main data sheet without much success.  It's really rather difficult to do, for reasons I don't want to explain on a public website.  I will keep trying, but can't tell if I will be successful.

I am trying to assist in making LO better, but I need Help!
Comment 8 Kohei Yoshida 2014-04-22 14:22:44 UTC
So, the difficulty with tackling a performance issue is that each test document is unique, has a unique profile etc.  Your test document obviously has a trait that triggers the poor performance, and I don't know what it is.

Having said that, the operation only handles formula cells and range names, so if you can perhaps cleanse the original document and remove all cells but formulas, while leaving the formula cells intact, perhaps you could come up with a usable test document without exposing your confidential content.

I could of course try to create an artificially inflated document but such document may not share the same trait as your document.  So, at the very least we would need to know how the data (esp the formula cells) are laid out in your document in order to try to build a brand-new test document from scratch.
Comment 9 tim 2014-04-22 14:37:36 UTC
I'm trying, but it's taking forever.  I've tens of thousands of data items to be checked and erased.  This is non trivial. 

I'm a bit disappointed that there is no clue as to the problem from the nature of the change you made.  Surely there's something about it that can be deduced.  If method a takes seconds, and method b takes minutes, isn't there something about method b that is a little different? If I didn't know better I'd guess it is looking at each and every cell in the sheet and recalculating it.
Comment 10 Kohei Yoshida 2014-04-22 14:46:49 UTC
And the last point, this software is a huge, complex, unbelievably intricate network of hundreds of components meshed into one.  So, just by "looking at the change you made" won't fix the problem.  You seem to imply that "go back to the stuff you did and fix it!" is an appropriate thing to say, but it actually irks me to hear things like that.  Things aren't that simple.
Comment 11 tim 2014-04-22 14:55:18 UTC
Well I apologise if I've irritated you, that's not my intention.  

However, I am getting pretty frustrated myself.  I'm really trying to help LO be better and fit for all types of use, but I'm still stuck with almost no way out except the hours I've already put in, and who knows many many more attempting to demonstrate what I can see in front my eyes all the time.

I spent 35 years developing software, and checking performance issues, so although I don't know LO internals at all, I've some idea of complexity.  One of the things I don't like about newer design techniques is that it's extremely hard to know what is actually going on, since one just uses 'an object' without detailed knowledge as to what and how it does something, what else it uses, and 'unintended consequences'.  So I sympathise, but am still struggling to help.
Comment 12 m_a_riosv 2014-04-22 15:55:26 UTC
Hi Tim, you are claiming for a solution, all we want it, but if you can't help to find out it, in what you can and it's needed, how to do it?.

I think the only way for Kohei to know what is going on, it's to have a file where to see the issue.

IMO you can have the confidence for sending directly the file to Kohei (I hope he can accept this solution), so not doing it public, perhaps a so deep depersonalization it's no needed from your side.

If you don't, as you have mentioned there are a lot of graphs, try deleting them, beginning for those with larger data set.
Comment 13 tim 2014-04-23 07:35:38 UTC
I've sent a demo sheet to Kohei.
Comment 14 Kohei Yoshida 2014-05-05 13:40:42 UTC
I did try

1) open the file from Tim,
2) Ctrl-F3 to open the Name Manager dialog,
3) Change the last row of AcBalances from $H$36 to $H$37,
4) Hit OK.

There is a long pause, for sure, but the duration was about 3 to 5 seconds, which is nowhere near what was originally described (i.e. minutes).
Comment 15 tim 2014-05-05 15:05:36 UTC
I have just downloaded and installed the current nightly build (libreoffice-4-2~2014-05-03_06.40.47_LibreOfficeDev_4.2.5.0.0_Linux_x86-64_deb.tar.gz  )

No change - still takes 2 over 2 minutes on ubuntu, as it did under windows.  

I then downloaded the 2.4.4.1 pre-release for Windows, and that took a few seconds.  Strange.

So I bit the bullet, discarded the ubuntu ppa version of LO for now, and downloaded the generic linux 2.4.4.1 pre-release.

It is still taking two minutes.  I'll try the windows version again (on the same PC), but this is very odd.
Comment 16 tim 2014-05-05 15:18:55 UTC
I have repeated the tests using the sheet I sent you (TestCalc3) with 4.2.4.1 pre-release on Windows and Ubuntu (14.04).

On the Windows pre-release it is now OK (which it was not using the older nightly build).

On the Ubuntu pre-release it still takes minutes, locking up one CPU.

These run on the same computer.
Comment 17 tim 2014-05-05 15:20:30 UTC
Apologies for my typos in comment #15.  For  2.4.4.1 read 4.2.4.1 !
Comment 18 Kohei Yoshida 2014-05-05 19:48:35 UTC
Any extensions you may have installed that may be triggering the drag?
Comment 19 tim 2014-05-05 20:09:04 UTC
(In reply to comment #18)
> Any extensions you may have installed that may be triggering the drag?

I haven't changed any extensions from versions (prior to the original bug) that ran fine, to current.

Installed extensions are:

Alternative searching
English, French & Spanish dictionaries
MRI-UNO Object inspection tool (used for debugging macros)
Solver for non-linear programming
Wiki Publisher

Most of these are installed by default and can't be removed (only Alternative searching and MRI can be).  I disabled these two, but nothing changed.

This has gone from a problem common to Windows and Ubuntu in the nightly builds of 4.2.4.1, to a problem only affecting Ubuntu in the pre-release.  How strange is that?

Can I ask whether you are able to run the sheet against an earlier nightly build of 4.2.1.4 around April 14th, when the problem affected both OSs?  

Is there any trace or debug I can turn on that would show what a CPU is doing for > 2 minutes at 100%
Comment 20 m_a_riosv 2014-05-05 20:12:06 UTC
Please try resetting the profile in Ubuntu:
https://wiki.documentfoundation.org/UserProfile#GNU.2FLinux
sometimes solves strange issues.profi
Comment 21 tim 2014-05-05 20:33:28 UTC
I tried, it made no difference.  I thought it wouldn't because when running the nightly builds they used a new (empty) profile, but anything is worth a try.
Comment 22 tim 2014-05-05 21:00:08 UTC
The range in question covers a pivot table that links to a registered Base database (and thence to a MariaDB database).

When I run under windows I cannot link to the database since this runs under ubuntu on the same PC as an alternate boot.

So I have tried de-registering the database, and deleting the underlying pivot table, in an effort to make my system more like yours, and mine under Windows, to no effect as yet.  

I'll keep trying to think of what else might be different.
Comment 23 tim 2014-05-06 07:18:55 UTC
The spreadsheet is based on a fairly ancient Excel sheet.  I have a suspicion that there is some old baggage around inside it (eg there are placeholders for macros that no longer exist, but which cannot be deleted).

So I'm going to rebuild my spreadsheet from scratch by copying all the sheets and then changing the links to be local.

I'll report here as soon I know whether this has made a difference.
Comment 24 tim 2014-05-06 11:58:46 UTC
I copied all the sheets to a new ods, recreated all the pivot tables (since they did not copy over).  Many references were now links to the original as a result, so I changed all those to be local to the new file.

The problem has now gone away.  

So something in the sheet (and its derivative which I sent to Koshei) was causing delays in some, but not all environments.

Since no one else seems to be able to reproduce this, I of course accept that whatever the problem is, it is unlikely to be resolved, and will probably never happen to anyone else, since it may be that the sheet itself is corrupt in some way.

I've therefore closed this report as 'not a bug'.

I now have a working set of sheets, so am very happy :)

Thanks to all who have spent time on this.  Apologies if some of it seems to have been wasted.
Comment 25 Robinson Tryon (qubit) 2015-12-15 11:05:09 UTC
Migrating Whiteboard tags to Keywords: (perf)
[NinjaEdit]