Bug 30770 - xlsx import/export takes hours, while UI freezes
Summary: xlsx import/export takes hours, while UI freezes
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 4.0.1.2 release (earliest affected)
Hardware: All All
Importance: high major
Assignee: Tobias Lippert
URL:
Whiteboard: target:4.3.0
Keywords: perf
Duplicates: 56259 68519
Depends on:
Blocks:
 
Reported: 2010-10-11 07:39 UTC by Kohei Yoshida
Modified: 2015-12-15 11:38 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
Test case - XLSX file that takes 'forever' to open (493.74 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2012-02-15 00:31 UTC, Danilo Godec

Description Kohei Yoshida 2010-10-11 07:39:35 UTC
I have received a confidential document from a user that he says takes 2.5 hours to open.  I have the document on my local machine.
Comment 1 Daniil Bratashov 2010-10-22 13:28:46 UTC
Same problem with some .pptx presentations. They render correctly, and neither the files nor the images in them are particularly large, but they take too long to open.
Comment 2 Daniil Bratashov 2010-12-06 11:16:47 UTC
(In reply to comment #1)
> Same problem with some .pptx presentations. It is correctly rendered, it is not
> so large file and not so large images, but takes too long to open.

It seems that my case is connected with the graph content inside. All the files that take too long to open contain either an embedded Excel graph with a large data set or some non-trivial calculations on a reasonably sized data range.
Comment 3 Danilo Godec 2012-02-15 00:31:35 UTC
Created attachment 57069
Test case - XLSX file that takes 'forever' to open

When I try to open this XLSX file with LibreOffice 3.4.2.6 on OpenSuSE 11.4, it becomes unresponsive and its CPU consumption is 100%.

'strace' of the 'soffice.bin' process in that state gives just this:

poll([{fd=13, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=13, revents=POLLOUT}])
writev(13, [{"7\30\5\0{\347@\3z\347@\3\0\0\1\0\0\0\0\0008\315\5\0{\347@\3\4\0\10\0"..., 16376}, {NULL, 0}, {"", 0}], 3) = 16376
read(13, 0x70b494, 4096)                = -1 EAGAIN (Resource temporarily unavailable)

poll([{fd=13, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=13, revents=POLLOUT}])
writev(13, [{"7\30\5\0\337\350@\3\336\350@\3\0\0\1\0\0\0\0\0008\315\5\0\337\350@\3\4\0\10\0"..., 16376}, {NULL, 0}, {"", 0}], 3) = 16376
read(13, 0x70b494, 4096)                = -1 EAGAIN (Resource temporarily unavailable)

These patterns come up at intervals of about 0.5 seconds and seem to go on 'forever' (I waited for more than 15 minutes, but finally gave up and killed the 'soffice.bin' process).

I then opened the file using an older version of MS Excel (2003) and it too came up with warnings - first it said there are some 'formats that are not supported' and then it said 'too many cell formats' and finally it said that it had to remove some formatting. However, the data seems to be all there - there are 3 columns and 2468 rows of simple data - numbers and text, no formulas.

So I saved the file in 'old' XLS format and tried to open that in LibreOffice (after killing the previous 'hung' instance) - in the beginning the 'soffice.bin' again had 100% CPU consumption for about 5-10 minutes. In that time  'strace' showed this:

brk(0x4c4b000)                          = 0x4c4b000
brk(0x4c6c000)                          = 0x4c6c000
brk(0x4c8d000)                          = 0x4c8d000
brk(0x4cae000)                          = 0x4cae000
brk(0x4ccf000)                          = 0x4ccf000

After about 10 minutes, there was a burst of 'strace' activity and the LibreOffice 'tab' started showing the file name - however it still didn't open and 'strace' again showed those 'brk' lines.

So I waited another 5 minutes and then it finally opened. The first thing I did was to save the file as 'ODS' - but again, the whole thing ground to a halt and LibreOffice became unresponsive with 100% CPU usage.

I didn't have the patience to wait for the 'save' to finish...
Comment 4 ringe 2013-03-08 19:22:44 UTC
*** Bug 56394 has been marked as a duplicate of this bug. ***
Comment 5 ringe 2013-03-08 19:27:25 UTC
*** Bug 56259 has been marked as a duplicate of this bug. ***
Comment 6 ringe 2013-03-08 21:56:13 UTC
Changing title to a more descriptive one. I recommend checking out the duplicates also, for more details.

I don't know if bug 61721 is another duplicate or not, but it seems so.
Comment 7 ringe 2013-03-14 11:10:29 UTC
I configured LibreOffice for less memory usage:
http://oldpapyrus.wordpress.com/2012/06/28/reduce-libreoffice-memory-usage/

Then I deployed the settings to all users:
http://community.spiceworks.com/scripts/show/1859-configure-libreoffice-for-all-users

I have not heard any complaints after this. That doesn't mean the issue is solved. I just have not been able to reproduce it on the test cases I had.

The test case provided in this ticket is still valid for this bug.
Comment 8 Samuel Mehrbrodt (CIB) 2013-04-27 16:10:57 UTC
This Issue is being sponsored: http://www.freedomsponsors.org/core/issue/212/certain-xlsx-file-takes-25-hours-to-open

Whoever fixes this gets a nice bounty.
Comment 9 ign_christian 2013-08-26 01:57:00 UTC
*** Bug 68519 has been marked as a duplicate of this bug. ***
Comment 10 Zeki Bildirici 2014-01-01 12:07:23 UTC
Same with Version: 4.2.0.1
Build ID: 420m0(Build:1)

However, it opens instantly with Gnumeric 1.12.6 after an error screen says:
"
Encountered uninterpretable "ext" extension in namespace "{EB79DEF2-80B8-43e5-95BD-54CBDDF9020C}"
"

Can be opened with Calligra Sheets in 5-6 secs without any error warnings.

Best regards,
Zeki
Comment 11 Samuel Mehrbrodt (CIB) 2014-01-01 13:13:38 UTC
This might have to do with the styles.xml file inside the xlsx file, which is 6.8 MB in size.
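
For anyone who wants to check the same thing on their own file: an xlsx file is a ZIP package, so the size of each part can be read without opening the document in LibreOffice. The following is a minimal sketch, not part of the original report. It assumes the styles part lives at the conventional path xl/styles.xml, and the function names list_parts and count_cell_formats are made up for illustration.

```python
import re
import sys
import zipfile


def list_parts(xlsx_path):
    """Return (part name, uncompressed size in bytes) pairs, largest first."""
    with zipfile.ZipFile(xlsx_path) as package:
        parts = [(info.filename, info.file_size) for info in package.infolist()]
    return sorted(parts, key=lambda part: part[1], reverse=True)


def count_cell_formats(xlsx_path, styles_part="xl/styles.xml"):
    """Count <xf> elements in the styles part; return None if the part is absent.

    Assumption: the styles part is stored at xl/styles.xml. The count covers
    <xf> entries under both cellStyleXfs and cellXfs.
    """
    with zipfile.ZipFile(xlsx_path) as package:
        if styles_part not in package.namelist():
            return None
        xml = package.read(styles_part)
    # Match the element name "xf" exactly, with an optional namespace prefix.
    return len(re.findall(rb"<(?:\w+:)?xf[\s/>]", xml))


if __name__ == "__main__" and len(sys.argv) > 1:
    for name, size in list_parts(sys.argv[1])[:5]:
        print(f"{size:>12} {name}")
    print("xf entries:", count_cell_formats(sys.argv[1]))
```

A very large styles part together with a very high number of xf entries would be consistent with the 'too many cell formats' warning from Excel 2003 mentioned in comment 3, but this only measures the file; it does not show where the import time is actually spent.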
Comment 12 Commit Notification 2014-03-11 13:55:36 UTC
Tobias Lippert committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=0c17ccc493d0c7a80f37600dae76a09a119bef78

fdo#30770 - Speed up xslx import



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 13 Commit Notification 2014-03-13 11:19:49 UTC
Caolan McNamara committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=eee4c914aee9794125077d4ae7c6dd171b8fb223

Related: fdo#30770 fix rtf cut/paste crash



Comment 14 Samuel Mehrbrodt (CIB) 2014-03-28 07:48:12 UTC
I tested this with a recent master build and it opens in a few seconds. Nice work, Tobias.
You can claim your bounty here: http://www.freedomsponsors.org/core/issue/212/certain-xlsx-file-takes-25-hours-to-open
Comment 15 sergio.callegari 2014-03-28 17:38:38 UTC
Looking forward to testing it on http://www.elsevier.com/__data/assets/excel_doc/0003/148548/title_list.xlsx !

Note that opening is not the only problem; saving is one as well.

The above document actually opens with LibO 4.2.3 RC 2 (which I guess still lacks the fix). It takes about 3 minutes on a 4th-generation (Haswell) laptop with 16 GB RAM, but then it is impossible to save it as ods! I tried, and as I write this it has been running for 30 minutes at 100% CPU on the same machine, with the fan spinning hard and the save progress bar still at 0%.

I'll try the fixed version and, if needed, open a new bug for the saving part.
Comment 16 Samuel Mehrbrodt (CIB) 2014-03-28 18:12:35 UTC
Sergio, you can download a pre-release build of LibreOffice 4.3 here: http://dev-builds.libreoffice.org/daily/master/win-x86@39/current/

I tested your document from Comment 15 and it opens in less than 30 seconds.
However, saving seems to be a problem. You should open a new bug for that.
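
To get comparable numbers for load and save across builds, the round trip can be timed from the command line instead of by watching the UI. This is a rough sketch, not something used in this report. It assumes a soffice binary on the PATH and uses its --headless and --convert-to options; the helper names build_convert_command and time_command are made up for illustration. The measurement covers import and export together, so it cannot tell which of the two is slow.

```python
import subprocess
import sys
import time


def build_convert_command(document, outdir, target="ods", soffice="soffice"):
    """Build a headless conversion command (assumes soffice is on the PATH)."""
    return [soffice, "--headless", "--convert-to", target, "--outdir", outdir, document]


def time_command(command, timeout=None):
    """Run a command and return (exit code, elapsed wall-clock seconds).

    Raises subprocess.TimeoutExpired if the timeout (in seconds) is exceeded,
    which is useful for documents that would otherwise run for hours.
    """
    start = time.monotonic()
    completed = subprocess.run(command, timeout=timeout)
    return completed.returncode, time.monotonic() - start


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: python time_convert.py title_list.xlsx /tmp/out
    code, seconds = time_command(build_convert_command(sys.argv[1], sys.argv[2]))
    print(f"exit code {code}, {seconds:.1f} s")
```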
Comment 17 Tobias Lippert 2014-03-28 20:30:40 UTC
Hello Sergio,
if you create a bug for the xlsx export, feel free to assign it to me. I should be able to reuse most of the work from the import.
I will start working on it after I finish my current bug (which might take some time because I only have so much spare time, but I will get to it eventually).
Tobias
Comment 18 Robinson Tryon (qubit) 2015-12-15 11:38:59 UTC
Migrating Whiteboard tags to Keywords: (perf)
[NinjaEdit]