Bug 93392 - Possible memory leak in Calc while linking to external html data file with auto update
Summary: Possible memory leak in Calc while linking to external html data file with au...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
5.1.0.0.alpha0+ Master
Hardware: x86-64 (AMD64) Linux (All)
: medium major
Assignee: Dennis Francis
URL:
Whiteboard: target:5.1.0
Keywords:
Depends on:
Blocks: Calc-External-Datalink
  Show dependency treegraph
 
Reported: 2015-08-12 16:16 UTC by Dennis Francis
Modified: 2017-08-28 19:15 UTC (History)
4 users (show)

See Also:
Crash report or crash signature:
Regression By:


Attachments
This data file is a html file containing one table with 5 columns and 10,000 rows filled with random data. (1.05 MB, text/html)
2015-08-12 16:16 UTC, Dennis Francis
Details
Valgrind log (409.56 KB, application/gzip)
2015-08-13 09:56 UTC, Dennis Francis
Details
Complete valgrind log with call trace before fix is made (512.85 KB, application/gzip)
2015-08-14 05:54 UTC, Dennis Francis
Details
Valgrind log after the fix was applied. (494.76 KB, application/gzip)
2015-08-14 06:14 UTC, Dennis Francis
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Dennis Francis 2015-08-12 16:16:59 UTC
Created attachment 117865 [details]
This data file is a html file containing one table with 5 columns and 10,000 rows filled with random data.

How to reproduce ?

1. Open Calc after compiling from latest git master (5.1.0.0.alpha+)
2. Go to Sheet > Link to External Data
3. Open the attached file using browse button of the External Data dialog
4. Choose Automatic for "Select Language to use for import"
5. Select "HTML_1" in Available Tables/Ranges
6. Check the checkbox "Update every" and set the update interval to 5 seconds.
7. Click OK.
8. Now observe the memory usage of soffice.bin using your favorite tool.

What I observed :

The memory usage increases about 90 to 100 MB every time the auto update of data occurs.

Here is what I observe in top command for soffice.bin process after letting Calc run for more than 20 minutes.
  PID USER    PR NI   VIRT   RES    SHR  S  %CPU %MEM     TIME+ COMMAND
21136 dennis  20 0 13.866g 0.013t  20228 S  59.3 41.6  19:39.23 soffice.bin


PS: The attached data file is basically a html file containing one table with 5 columns and 10,000 rows filled with random data.
Comment 1 Dennis Francis 2015-08-13 09:56:56 UTC
Created attachment 117881 [details]
Valgrind log

I have attached the valgrind log that shows memory leak while one auto update takes place in Calc.
Comment 2 Mikhail Zemlyanukha 2015-08-13 13:23:58 UTC
FYI I've just submitted another bug which may have the same root cause: https://bugs.documentfoundation.org/show_bug.cgi?id=93409
Comment 3 Dennis Francis 2015-08-14 05:54:22 UTC
Created attachment 117910 [details]
Complete valgrind log with call trace before fix is made

I have attached a new valgrind log showing the leak without the "???" in the call trace. I had to disable dlclose at http://opengrok.libreoffice.org/xref/core/sal/osl/unx/module.cxx#221 to remove the "???" entries in the call trace as per http://valgrind.org/docs/manual/faq.html#faq.unhelpful
Comment 4 Dennis Francis 2015-08-14 06:11:45 UTC
Here is a patch that fixes the main leak that shows up in the valgrind log. But there are more smaller leaks that need to be fixed.


diff --git a/sc/source/filter/html/htmlpars.cxx b/sc/source/filter/html/htmlpars.cxx
index 178796b..57a6859 100644
--- a/sc/source/filter/html/htmlpars.cxx
+++ b/sc/source/filter/html/htmlpars.cxx
@@ -1966,6 +1966,17 @@ ScHTMLTable::ScHTMLTable(
 
 ScHTMLTable::~ScHTMLTable()
 {
+    // iterate through every table cell
+    ScHTMLEntryMap::iterator aMapIterEnd = maEntryMap.end();
+    for( ScHTMLEntryMap::iterator aMapIter = maEntryMap.begin(); aMapIter != aMapIterEnd; ++aMapIter )
+    {
+        ScHTMLEntryList& rEntryList = aMapIter->second;
+        while (!rEntryList.empty())
+        {
+            delete rEntryList.back();
+            rEntryList.pop_back();
+        }
+    }
 }
Comment 5 Dennis Francis 2015-08-14 06:14:11 UTC
Created attachment 117911 [details]
Valgrind log after the fix was applied.

This is the valgrind log after fixing the main leak. This shows some smaller leaks which I do not find a way to fix. Will contact the dev team.
Comment 6 Caolán McNamara 2015-08-14 10:11:45 UTC
Turned the patch into a gerrit submission as https://gerrit.libreoffice.org/#/c/17757/
Comment 7 Commit Notification 2015-08-17 09:22:40 UTC
Dennis P. Francis committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6a03b8ba75c2746d0391a3eb7354b36539c0053f

tdf#93392 leak when linking to external html data file with auto update

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 8 Commit Notification 2015-08-25 15:12:16 UTC
Dennis Francis committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=20618ddc2793064c7c55d9c765a8113abb45b153

fixes a memory leak that appeared in tdf#93392 valgrind trace

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.