Bug 93392

Summary: Possible memory leak in Calc while linking to external html data file with auto update
Product: LibreOffice Reporter: Dennis Francis <dennisfrancis.in>
Component: CalcAssignee: Dennis Francis <dennisfrancis.in>
Status: RESOLVED FIXED    
Severity: major CC: dennisfrancis.in, dtardon, gmixaz, h3734236
Priority: medium    
Version: 5.1.0.0.alpha0+ Master   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard: target:5.1.0
Crash report or crash signature: Regression By:
Bug Depends on:    
Bug Blocks: 112071    
Attachments: This data file is a html file containing one table with 5 columns and 10,000 rows filled with random data.
Valgrind log
Complete valgrind log with call trace before fix is made
Valgrind log after the fix was applied.

Description Dennis Francis 2015-08-12 16:16:59 UTC
Created attachment 117865 [details]
This data file is a html file containing one table with 5 columns and 10,000 rows filled with random data.

How to reproduce ?

1. Open Calc after compiling from latest git master (5.1.0.0.alpha+)
2. Go to Sheet > Link to External Data
3. Open the attached file using browse button of the External Data dialog
4. Choose Automatic for "Select Language to use for import"
5. Select "HTML_1" in Available Tables/Ranges
6. Check the checkbox "Update every" and set the update interval to 5 seconds.
7. Click OK.
8. Now observe the memory usage of soffice.bin using your favorite tool.

What I observed :

The memory usage increases about 90 to 100 MB every time the auto update of data occurs.

Here is what I observe in top command for soffice.bin process after letting Calc run for more than 20 minutes.
  PID USER    PR NI   VIRT   RES    SHR  S  %CPU %MEM     TIME+ COMMAND
21136 dennis  20 0 13.866g 0.013t  20228 S  59.3 41.6  19:39.23 soffice.bin


PS: The attached data file is basically a html file containing one table with 5 columns and 10,000 rows filled with random data.
Comment 1 Dennis Francis 2015-08-13 09:56:56 UTC
Created attachment 117881 [details]
Valgrind log

I have attached the valgrind log that shows memory leak while one auto update takes place in Calc.
Comment 2 Mikhail Zemlyanukha 2015-08-13 13:23:58 UTC
FYI I've just submitted another bug which may have the same root cause: https://bugs.documentfoundation.org/show_bug.cgi?id=93409
Comment 3 Dennis Francis 2015-08-14 05:54:22 UTC
Created attachment 117910 [details]
Complete valgrind log with call trace before fix is made

I have attached a new valgrind log showing the leak without the "???" in the call trace. I had to disable dlclose at http://opengrok.libreoffice.org/xref/core/sal/osl/unx/module.cxx#221 to remove the "???" entries in the call trace as per http://valgrind.org/docs/manual/faq.html#faq.unhelpful
Comment 4 Dennis Francis 2015-08-14 06:11:45 UTC
Here is a patch that fixes the main leak that shows up in the valgrind log. But there are more smaller leaks that need to be fixed.


diff --git a/sc/source/filter/html/htmlpars.cxx b/sc/source/filter/html/htmlpars.cxx
index 178796b..57a6859 100644
--- a/sc/source/filter/html/htmlpars.cxx
+++ b/sc/source/filter/html/htmlpars.cxx
@@ -1966,6 +1966,17 @@ ScHTMLTable::ScHTMLTable(
 
 ScHTMLTable::~ScHTMLTable()
 {
+    // iterate through every table cell
+    ScHTMLEntryMap::iterator aMapIterEnd = maEntryMap.end();
+    for( ScHTMLEntryMap::iterator aMapIter = maEntryMap.begin(); aMapIter != aMapIterEnd; ++aMapIter )
+    {
+        ScHTMLEntryList& rEntryList = aMapIter->second;
+        while (!rEntryList.empty())
+        {
+            delete rEntryList.back();
+            rEntryList.pop_back();
+        }
+    }
 }
Comment 5 Dennis Francis 2015-08-14 06:14:11 UTC
Created attachment 117911 [details]
Valgrind log after the fix was applied.

This is the valgrind log after fixing the main leak. This shows some smaller leaks which I do not find a way to fix. Will contact the dev team.
Comment 6 Caolán McNamara 2015-08-14 10:11:45 UTC
Turned the patch into a gerrit submission as https://gerrit.libreoffice.org/#/c/17757/
Comment 7 Commit Notification 2015-08-17 09:22:40 UTC
Dennis P. Francis committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6a03b8ba75c2746d0391a3eb7354b36539c0053f

tdf#93392 leak when linking to external html data file with auto update

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 8 Commit Notification 2015-08-25 15:12:16 UTC
Dennis Francis committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=20618ddc2793064c7c55d9c765a8113abb45b153

fixes a memory leak that appeared in tdf#93392 valgrind trace

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.