Bug 111461 - XLSX with (unseen/unused?) images takes forever to load and produces gigabytes of temp files
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 3.5.0 release
Hardware: All All
Importance: medium critical
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: perf, preBibisect, regression
Depends on:
Blocks: XLSX-Images
Reported: 2017-08-07 19:28 UTC by Damien Chambe
Modified: 2023-08-19 03:19 UTC
CC List: 9 users

See Also:
Crash report or crash signature:


Attachments
XLSX with pictures hangs libreoffice (205.57 KB, application/wps-office.xlsx)
2017-08-07 19:30 UTC, Damien Chambe
File cleaned with WPS Office by removing all hidden pictures in sheet 1 (12.59 KB, application/wps-office.xlsx)
2017-08-07 19:31 UTC, Damien Chambe
Original edited by MSO 2013 (244.09 KB, application/wps-office.xlsx)
2017-08-08 09:21 UTC, Damien Chambe
strace on loading original XLSX (481.21 KB, application/x-bzip)
2017-08-22 15:59 UTC, Damien Chambe
Flamegraph (76.75 KB, application/x-bzip)
2020-09-17 19:00 UTC, Julien Nabet
111461_bt.log: partial debug log showing many identical imports, with BackTrace. (29.88 KB, text/x-log)
2021-08-17 08:47 UTC, Justin L

Description Damien Chambe 2017-08-07 19:28:11 UTC
Description:
The attached hang.XLSX, produced by MS Office (version unknown, as a customer sent it by email), hangs LibreOffice.
In /tmp, a folder named lu188442iogk4.tmp is created and filled with many files such as lu188442ioglh.tmp. At least 6 GB were written (after that, my disk was full).

Opening the file with WPS Office, I found about a hundred images inserted in the first sheet, near cell O4, each with either a 0px height or a 0px width. If I delete those images, LibreOffice opens the file correctly; see the other file, clean.xlsx.
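
A minimal sketch of how the same check could be scripted without WPS Office. It assumes the pictures are stored as xdr:pic elements in the xl/drawings/*.xml parts of the package and that their size is given by a:ext (cx and cy, in EMU); the function names list_pictures and degenerate are made up for this illustration and were not used in the original investigation.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard OOXML namespaces used by spreadsheet drawing parts.
NS = {
    "xdr": "http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}


def list_pictures(xlsx):
    """Return (drawing part, picture name, cx, cy) for every xdr:pic element.

    xlsx may be a path or a binary file-like object.
    cx and cy are the a:ext extents in EMU, or None when a:ext is absent.
    """
    found = []
    with zipfile.ZipFile(xlsx) as archive:
        for part in archive.namelist():
            # Assumption: drawing parts live under xl/drawings/ and end in .xml.
            if not (part.startswith("xl/drawings/") and part.endswith(".xml")):
                continue
            root = ET.fromstring(archive.read(part))
            for pic in root.iter("{%s}pic" % NS["xdr"]):
                props = pic.find("xdr:nvPicPr/xdr:cNvPr", NS)
                ext = pic.find("xdr:spPr/a:xfrm/a:ext", NS)
                name = props.get("name") if props is not None else None
                cx = int(ext.get("cx")) if ext is not None else None
                cy = int(ext.get("cy")) if ext is not None else None
                found.append((part, name, cx, cy))
    return found


def degenerate(pictures):
    """Keep only pictures whose width or height is exactly zero."""
    return [p for p in pictures if p[2] == 0 or p[3] == 0]
```

Calling degenerate(list_pictures("hang.XLSX")) would list the candidates. A picture can also be invisible for other reasons (for example because its anchor collapses), so an empty result does not prove that the file is clean.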


Version: 5.4.0.3
Build ID: 1:5.4.0~rc3-0ubuntu0.16.04.1~lo1
CPU threads: 4; OS: Linux 4.10; UI render: default; VCL: gtk3; 
Locale: fr-FR (fr_FR.UTF-8); Calc: CL

Also reproduced on another Ubuntu installation with LibreOffice 5.3.5.

Steps to Reproduce:
1. Open hang.XLSX
2. Watch the temp folder: a subfolder is created and filled with temp files (> 6 GB for a 210 KB spreadsheet file)

Actual Results:  
LibreOffice hangs and will probably fill your disk with temp files, which could prevent the OS from booting normally. Some temp files are pictures; others contain unknown binary content.
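
To watch the growth without filling the disk, the temp folder can be polled while the file loads. This is only a sketch: it assumes LibreOffice's temp entries match the pattern lu*.tmp, as in the names quoted above, and temp_usage is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def _size(path):
    """Size of one file in bytes; 0 if it vanished while we were scanning."""
    try:
        return path.stat().st_size
    except OSError:
        return 0


def temp_usage(temp_root=None, pattern="lu*.tmp"):
    """Sum the bytes used by entries matching pattern directly under temp_root.

    Matching directories are walked recursively; matching files count as-is.
    temp_root defaults to the system temp directory (/tmp on most Linux setups).
    """
    root = Path(temp_root or tempfile.gettempdir())
    total = 0
    for entry in root.glob(pattern):
        if entry.is_dir():
            total += sum(_size(p) for p in entry.rglob("*") if p.is_file())
        elif entry.is_file():
            total += _size(entry)
    return total
```

Printing temp_usage() every few seconds in a loop, and killing soffice once it passes a chosen limit, would let the reproduction be stopped before the disk is full.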

Expected Results:
Open the file, or at least warn the user that it can't fully open the document.


Reproducible: Always

User Profile Reset: No

Additional Info:


User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0
Comment 1 Damien Chambe 2017-08-07 19:30:20 UTC
Created attachment 135237 [details]
XLSX with pictures hangs libreoffice
Comment 2 Damien Chambe 2017-08-07 19:31:03 UTC
Created attachment 135238 [details]
File cleaned with WPS Office by removing all hidden pictures in sheet 1
Comment 3 Julien Nabet 2017-08-08 06:57:01 UTC
Just to be clear: the second attachment is the xlsx cleaned with WPS Office, but is the first attachment the original xlsx file from MS Office, or the result of saving it in LO 5.3.5.2?
In the second case, could you attach the original xlsx file from your customer?
Comment 4 Damien Chambe 2017-08-08 09:21:15 UTC
Created attachment 135265 [details]
Original edited by MSO 2013
Comment 5 Damien Chambe 2017-08-08 09:25:15 UTC
(In reply to Julien Nabet from comment #3)
> Just to be clear, the second attachment is xlsx cleaned with WPSOffice but
> about the first attachment, is it the original xlsx file from MsOffice or is
> it the result of saving in LO 5.3.5.2?
> In the second case, could you attach the original xlsx file from your
> customer?

Here's the original customer file, untouched by either LibreOffice or WPS Office.
It was edited by the customer with MSO 2013.
Comment 6 Julien Nabet 2017-08-08 09:29:13 UTC
Thank you for your feedback.
I'll take some time to confirm it after my day job.
Of course, someone else is welcome to give it a try before then.
Comment 7 Julien Nabet 2017-08-12 06:01:08 UTC
On a Debian x86-64 PC with master sources updated yesterday, I could reproduce this.

I noticed this repeated trace on console:
warn:oox:5947:1:oox/source/drawingml/shapecontext.cxx:126: ShapeContext::onCreateContext: unhandled element: 3971
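
To quantify how often such a trace repeats, the console output can be tallied. The sketch below assumes each log line has the shape level:area:pid:tid:file:line: message, as in the trace quoted above, and that the source path contains no colon; LOG_LINE and tally are names chosen for this example.

```python
import re
from collections import Counter

# Assumed layout: level:area:pid:tid:source-file:line: message
LOG_LINE = re.compile(
    r"^(?P<level>warn|info):(?P<area>[^:]+):(?P<pid>\d+):(?P<tid>\d+):"
    r"(?P<file>[^:]+):(?P<line>\d+): (?P<message>.*)$"
)


def tally(lines):
    """Count log lines per (area, source file, message).

    The source line number is left out of the key on purpose, because it
    shifts between builds. Lines that do not match the layout are ignored.
    """
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match:
            counts[(match["area"], match["file"], match["message"])] += 1
    return counts
```

With the console output captured to a file, tally(open("console.log")).most_common(5) would show the most repeated messages first.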
Comment 8 Damien Chambe 2017-08-22 15:59:49 UTC
Created attachment 135736 [details]
strace on loading original XLSX

strace from master source  cc2cb0123ac599bf25c5e17b97b5d7bf93d3e487 
Tue Aug 22 13:38:03 2017 +0200
Comment 9 Telesto 2017-08-22 18:31:35 UTC
Repro with:
Version: 4.4.6.3 
Build ID: e8938fd3328e95dcf59dd64e7facd2c7d67c704d
Locale: nl_NL

Setting to: All -> Windows also affected
Comment 10 XTR 2017-12-06 14:08:22 UTC
Repro with:
Version: 6.0.0.0.beta1 (x64)
Build ID: 97471ab4eb4db4c487195658631696bb3238656c
CPU threads: 4; OS: Windows 6.1; UI render: default; 
Locale: ru-RU (ru_RU); Calc: CL

and with
Apache OpenOffice 4.1.4
Comment 11 paulystefan 2018-06-21 23:09:44 UTC
I tested with Gnumeric 1.12.17 on Windows.

It reads the MSO file quickly.

The file saved in Gnumeric's XML format is about 189 MB. Wow!

Saving as ODS 1.2 Extended creates a 182 MB file.

That is about a factor of 800 relative to the source.

A test with a current Gnumeric on Linux might show more.

Opening this Gnumeric export in LibreOffice 5.4.7 and 6.0.5.1 was not successful.

Gnumeric also has some problems with this file, but it manages to open it.
Comment 12 paulystefan 2018-08-25 14:11:56 UTC
same in 6.1.0.3 x64 in win10-64
Comment 13 QA Administrators 2019-09-02 09:25:43 UTC Comment hidden (obsolete)
Comment 14 Damien Chambe 2019-09-02 20:28:25 UTC
Reproduced with :
Version: 6.3.1.2
Build ID: 1:6.3.1~rc2-0ubuntu0.18.04.1~lo1
CPU threads: 4; OS: Linux 5.0; UI render: default; VCL: gtk3; 
Locale: fr-FR (fr_FR.UTF-8); UI: fr-FR
Calc: CL
Comment 15 Damien Chambe 2019-09-02 20:36:25 UTC
NOT reproduced with version 3.3
Comment 16 Julien Nabet 2019-09-16 20:38:06 UTC
On a Debian x86-64 PC with master sources updated today, I noticed these logs:
warn:legacy.osl:3832:3832:oox/source/helper/graphichelper.cxx:120: GraphicHelper::GraphicHelper - cannot get target frame
warn:oox:3832:3832:oox/source/drawingml/shapecontext.cxx:130: ShapeContext::onCreateContext: unhandled element: 3973
warn:vcl.gdi:3832:3832:vcl/source/graphic/Manager.cxx:141: Calculated size mismatch. Variable size is '28365646' but calculated size is '19039954'

(for the third one, I used:
diff --git a/vcl/source/graphic/Manager.cxx b/vcl/source/graphic/Manager.cxx
index ec2bdca9be0b..941fb45bd9b8 100644
--- a/vcl/source/graphic/Manager.cxx
+++ b/vcl/source/graphic/Manager.cxx
@@ -138,7 +138,7 @@ void Manager::registerGraphic(const std::shared_ptr<ImpGraphic>& pImpGraphic,
 
     if (calculatedSize != mnUsedSize)
     {
-        SAL_INFO_IF(calculatedSize != mnUsedSize, "vcl.gdi",
+        SAL_WARN_IF(calculatedSize != mnUsedSize, "vcl.gdi",
                     "Calculated size mismatch. Variable size is '"
                         << mnUsedSize << "' but calculated size is '" << calculatedSize << "'");
         mnUsedSize = calculatedSize;
)
Comment 17 Julien Nabet 2019-09-16 20:46:19 UTC
Tomaz: following the warning quoted in my previous comment, I took a look at the git history of the file and thought you might be interested in this one.
(Of course, I may be wrong, so don't hesitate to uncc yourself in that case.)
Comment 18 Commit Notification 2020-04-24 08:04:51 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/7e82d7bf88f63b1dcd9da939904330054fc426f6

Related tdf#111461: ignore picLocks attribute

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 19 Julien Nabet 2020-04-24 08:12:06 UTC
Sorry, my patch only concerns a warning; it doesn't fix anything.
=> remove target:7.0.0
Comment 20 Commit Notification 2020-04-29 10:03:46 UTC
Julien Nabet committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/768882d4e9b955ee1b79ad39cb4232789d0d5ee9

Related tdf#111461: add "variant", "lpstr" and "i4" in docprophandler (oox)

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 21 Julien Nabet 2020-04-29 10:08:10 UTC
Again, another patch related to this bug report, but it doesn't fix it.
=> remove target
Comment 22 Buovjaga 2020-06-09 18:00:34 UTC
Already in 3.5.0 (tested on Win), so it cannot be bibisected
Comment 23 Julien Nabet 2020-09-17 19:00:54 UTC
Created attachment 165639 [details]
Flamegraph
Comment 24 Xisco Faulí 2020-11-20 14:29:10 UTC
Still reproducible in

Version: 7.1.0.0.alpha1+
Build ID: 2f7b5634487ac3d27777ab12a57089e71ea5216d
CPU threads: 4; OS: Linux 5.7; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Calc: threaded
Comment 25 Xisco Faulí 2020-11-20 14:30:31 UTC
(In reply to Julien Nabet from comment #23)
> Created attachment 165639 [details]
> Flamegraph

@Noel, I thought you might be interested in this issue
Comment 26 Roman Kuznetsov 2021-05-05 10:42:24 UTC
Still repro in

Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 931e264590100c555580c413556e229a0f03316a
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: threaded
Comment 27 Justin L 2021-08-17 08:47:31 UTC
Created attachment 174340 [details]
111461_bt.log: partial debug log showing many identical imports, with BackTrace.

repro 7.3+. Lots of 8MB temp files, and seemingly infinite loop. The good thing is that memory consumption seemed stable at 600MB for the 15 minutes I tested using comment 4's original_Excel.xlsx.

My guess is that there is a problem in sax/source/fastparser because it seems to generate events that never end. The same graphic object seems to be loaded in a loop of
     while (!rEntity.maPendingEvents.empty())

But I have no idea how to debug a generic something that just emits events into a queue, and all the time is spent evaluating the queue events.
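
One way to check from the outside whether the same graphic really is written over and over is to group the temp files by content hash. This is a sketch only; it says nothing about where the loop originates, and group_by_digest is a hypothetical helper name.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_digest(directory, chunk_size=1 << 20):
    """Map SHA-256 digest -> list of files under directory with that content.

    Only digests shared by at least two files are returned, so an empty
    result means no two temp files are byte-identical.
    """
    groups = defaultdict(list)
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in chunks so multi-megabyte temp files are not loaded whole.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
        groups[digest.hexdigest()].append(path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}
```

A few very large groups would support the impression of one graphic being imported repeatedly, while many small groups would point to several distinct images each being duplicated.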
Comment 28 Noel Grandin 2021-08-17 08:53:26 UTC
This file is essentially broken, so I'm not interested in fixing it (however, other people are welcome to try).

LibreOffice is spending a ton of time applying filtering specified by the XLSX file to those images (which are not actually zero sized).

Refactoring the code to avoid doing that conversion would be quite a chunk of work.
Comment 29 Justin L 2021-08-18 06:01:16 UTC
(In reply to Noel Grandin from comment #28)
> This file is essentially broken,
I was tempted to reduce the importance from critical because of this, but I expect that the same could be true in a legitimate document, so I left it for now.
Comment 30 QA Administrators 2023-08-19 03:19:41 UTC
Dear Damien Chambe,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug