Bug 73294 - Segmentation fault in libsvllo.so
Summary: Segmentation fault in libsvllo.so
Status: RESOLVED WORKSFORME
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 4.1.4.2 release (earliest affected)
Hardware: Other Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: BSA
Keywords: haveBacktrace
Depends on:
Blocks:
 
Reported: 2014-01-05 03:09 UTC by Jim Avera
Modified: 2016-09-19 16:48 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
Test data - must be installed as /tmp/tickers.csv (2.71 KB, text/csv)
2014-01-05 03:09 UTC, Jim Avera
j.ods - spreadsheet demonstrating the crash (333.89 KB, application/vnd.oasis.opendocument.spreadsheet)
2014-01-05 03:09 UTC, Jim Avera
bogus (please ignore) (113.93 KB, text/plain)
2014-01-06 19:25 UTC, Jim Avera
Stack traceback, etc. (from apport crash report). (163.85 KB, text/plain)
2014-01-06 19:30 UTC, Jim Avera
j2.ods -- updated demo spreadsheet (instructions in comment #9) (144.85 KB, application/x-vnd.oasis.opendocument.spreadsheet)
2014-08-19 00:38 UTC, Jim Avera
gdb traceback using j2.ods and LO 4.3.2.0.0+ build 2014-08-17_22:48:01 (39.76 KB, text/plain)
2014-08-19 00:40 UTC, Jim Avera
j2.ods -- update#3 of demo spreadsheet (instructions in comment #9) (81.45 KB, application/x-vnd.oasis.opendocument.spreadsheet)
2014-08-19 06:46 UTC, Jim Avera
valgrind log showing conditional jump depending on uninit data (353 bytes, text/plain)
2014-12-13 08:22 UTC, Jim Avera
valgrind log showing conditional jump depending on uninit data (10.82 KB, text/plain)
2014-12-13 08:23 UTC, Jim Avera
Valgrind trace with master sources (8.60 KB, application/x-bzip)
2014-12-13 15:54 UTC, Julien Nabet

Description Jim Avera 2014-01-05 03:09:06 UTC
Created attachment 91507 [details]
Test data - must be installed as /tmp/tickers.csv

Problem description: 

The attached spreadsheet has some complicated Basic macros. They worked in LO 3.6.x but crash the process in LO 4.2.0.1 (silent abort, followed by an apport run).

I traced it down to a call to the Sort() function, which
calls a UNO function to sort some rows.  Here is the relevant
Basic code (which seems to abort only in context):

   
   Dim args() As New com.sun.star.beans.PropertyValue
   Dim sCellrangeName As String   ' contains "$A$8:AC47"

   oCellRange = oSheet.getCellRangeByName(sCellrangeName)

   oCellRange.Sort(args())  'never returns, process aborts

Steps to reproduce (Unix only, sorry):
1. Download the attached "tickers.csv" into /tmp/tickers.csv
    (this path is hard-coded in the demo program)
2. Open the attached j.ods spreadsheet.  Enable macros.
3. (Click the Design Mode ON/OFF icon if necessary to exit
    design mode -- working around bug 73293)
4. Click the yellow "Reload tickers.csv" button
5. Click through a few MsgBox displays.

If it does not crash, repeat - it always crashes by the 2nd try.

The code is located in Basic macro module j.ods -> Standard -> SortUtils (search for "DEBUG BEFORE sort")

Current behavior: Fatal process abort

Expected behavior: Should work or give a useful error message

              
Operating System: Ubuntu
Version: 4.1.4.2 release
Comment 1 Jim Avera 2014-01-05 03:09:49 UTC
Created attachment 91508 [details]
j.ods - spreadsheet demonstrating the crash
Comment 2 Jim Avera 2014-01-06 19:11:47 UTC
apport collected the following about the crash:
Comment 3 Jim Avera 2014-01-06 19:25:37 UTC
Created attachment 91553 [details]
bogus (please ignore)
Comment 4 Jim Avera 2014-01-06 19:30:17 UTC
Created attachment 91554 [details]
Stack traceback, etc. (from apport crash report).
Comment 5 Joel Madero 2014-01-31 15:28:08 UTC
I don't get a full-blown crash, but I do get a lock-up after the first run. That being said, I'm requesting expert advice on this one, as I don't know whether LibreOffice or the macro is causing the issue.

Noel - should we push this one to NEW?
Comment 6 raal 2014-08-04 19:18:03 UTC
Tested with Version: 4.3.1.0.0+
Build ID: ca88a0ea6ed7277e8522d83458e3cfb975fcfb7d
TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:libreoffice-4-3, Time: 2014-07-25_10:06:25

Error message:
Inadmissible value or data type.
Division by zero.

Calc didn't crash.
Comment 7 Jim Avera 2014-08-04 21:12:16 UTC
Any indication of the location of the divide by zero?   

I see a crash (spontaneous process abort) inside the built-in Sort function, not in user code.

Version: 4.3.1.0.0+
Build ID: 0ad283adb51b3a1bb777e6341e61541d4bffaa44
TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:libreoffice-4-3, Time: 2014-07-21_07:30:30
Comment 8 raal 2014-08-07 19:50:43 UTC
(In reply to comment #7)
> Any indication of the location of the divide by zero?   

 if 100*(Stock-Strike)/Stock > MaxPctITM then

Stock = 0
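
For illustration, the comparison quoted above could be guarded along these lines. This is a minimal sketch in Python rather than Basic; the function names and the choice to treat a zero price as "not in the money" are assumptions, not code taken from j.ods.

```python
from typing import Optional


def pct_in_the_money(stock: float, strike: float) -> Optional[float]:
    """Return 100*(stock-strike)/stock, or None when stock is zero.

    Hypothetical helper: it mirrors the expression quoted above and adds a
    guard for the Stock = 0 case that triggers "Division by zero".
    """
    if stock == 0:
        return None  # no usable price, so the percentage is undefined
    return 100 * (stock - strike) / stock


def exceeds_max_pct_itm(stock: float, strike: float, max_pct_itm: float) -> bool:
    """Guarded form of: if 100*(Stock-Strike)/Stock > MaxPctITM then ..."""
    pct = pct_in_the_money(stock, strike)
    return pct is not None and pct > max_pct_itm
```

With Stock = 0 the guarded check simply returns False instead of raising an error; whether that is the right policy for the spreadsheet is a separate question.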
Comment 9 Jim Avera 2014-08-19 00:37:18 UTC
I have no clue why Raal gets a macro error (divide by zero) and not the crash I see.   Please confirm that the "Steps to reproduce" were performed exactly. There are indeed Basic macro bugs which can be found, but they are irrelevant to the LO crash.

I'm going to attach a new demo spreadsheet (j2.ods) and a new backtrace.

This newer spreadsheet has cleaned-up macro code and runs reliably on OO4.1.
With LO 4.3.2 (build date 8/18/2014) it gets a SIGSEGV in libsvllo.so (see the backtrace).

Using latest 4.3.2 build from 8/18/2014.

Please try this sequence:

1.  Download tickers.csv and move to /tmp (required by the macro code)
2.  Download j2.ods and run
      /path/to/libreofficedev4.3.2.0.0 --backtrace j2.ods
3.  Click the yellow button and wait for screen refresh
    Click the yellow button again
    Click the yellow button again (crashes on the 3rd time)

Thanks!
Comment 10 Jim Avera 2014-08-19 00:38:24 UTC
Created attachment 104848 [details]
j2.ods -- updated demo spreadsheet (instructions in comment #9)
Comment 11 Jim Avera 2014-08-19 00:40:07 UTC
Created attachment 104849 [details]
gdb traceback using j2.ods and LO 4.3.2.0.0+ build 2014-08-17_22:48:01
Comment 12 raal 2014-08-19 06:33:43 UTC
Hello Jim,
now I can reproduce the crash with Version: 4.3.2.0.0+
Build ID: 25459cb0c9afdf46c3d90ae8ba0b6ffb375f67da
TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:libreoffice-4-3, Time: 2014-08-17_22:48:01

I can also reproduce the crash with Version: 4.4.0.0.alpha0+
Build ID: e379401618268ed7f7f5885a36b90e1f4f6cd4af
TinderBox: Linux-rpm_deb-x86_64@46-TDF, Branch:master, Time: 2014-08-18_05:51:03

Please repair j2.ods: the CSV file path doesn't contain /tmp/tickers.csv  (Proto_csvpath does not exist (/home/jima/Jts/tickers.csv;/home/jima/IBJts/tickers.csv;\\Vboxsvr\root\home\jima\Jts\tickers.csv;\\Vboxsvr\root\home\jima\IBJts\tickers.csv)).

Setting as new.
Comment 13 Jim Avera 2014-08-19 06:46:23 UTC
Created attachment 104861 [details]
j2.ods -- update#3 of demo spreadsheet (instructions in comment #9)

Replacing j2.ods with a version which looks in /tmp for the tickers.csv file
(the previous version looked in places used in production)
Comment 14 Julien Nabet 2014-12-12 23:41:48 UTC
On a Debian x86-64 PC with master sources updated today, I got a crash at the first click. Here's the main part of the bt:
#3  0x00002aaaaf5fd639 in __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<SvtListener**, std::__cxx1998::vector<SvtListener*, std::allocator<SvtListener*> > >, std::__debug::vector<SvtListener*, std::allocator<SvtListener*> > >::_Safe_iterator (this=0x7fffffff0180, 
    __i=<error reading variable: Cannot access memory at address 0x9999999999999999>, __seq=0x75405a0) at /usr/include/c++/4.9/debug/safe_iterator.h:149
#4  0x00002aaaaf5fbfdc in std::__debug::vector<SvtListener*, std::allocator<SvtListener*> >::end (this=0x75405a0) at /usr/include/c++/4.9/debug/vector:236
#5  0x00002aaaaf5fab18 in SvtBroadcaster::Normalize (this=0x7540598) at /home/julien/compile-libreoffice/libreoffice/svl/source/notify/broadcast.cxx:28
#6  0x00002aaaaf5fb9ca in SvtBroadcaster::Broadcast (this=0x7540598, rHint=...) at /home/julien/compile-libreoffice/libreoffice/svl/source/notify/broadcast.cxx:124
#7  0x00002aaace1881a2 in ScBroadcastAreaSlotMachine::BulkBroadcastGroupAreas (this=0x2c4ead0)
    at /home/julien/compile-libreoffice/libreoffice/sc/source/core/data/bcaslot.cxx:1236
#8  0x00002aaace187cbc in ScBroadcastAreaSlotMachine::LeaveBulkBroadcast (this=0x2c4ead0) at /home/julien/compile-libreoffice/libreoffice/sc/source/core/data/bcaslot.cxx:1196
#9  0x00002aaace3a18df in ScBulkBroadcast::~ScBulkBroadcast (this=0x7fffffff0570, __in_chrg=<optimized out>)
    at /home/julien/compile-libreoffice/libreoffice/sc/source/core/inc/bcaslot.hxx:372
#10 0x00002aaace5f8352 in ScTable::DeleteSelection (this=0x2aaadc872010, nDelFlag=..., rMark=..., bBroadcast=true)
    at /home/julien/compile-libreoffice/libreoffice/sc/source/core/data/table2.cxx:455
#11 0x00002aaace41650d in ScDocument::DeleteSelection (this=0x2c45f68, nDelFlag=..., rMark=..., bBroadcast=true)
    at /home/julien/compile-libreoffice/libreoffice/sc/source/core/data/document.cxx:5569
#12 0x00002aaaceb34f76 in ScDocFunc::DeleteContents (this=0x2b0b5c0, rMark=..., nFlags=..., bRecord=true, bApi=false)
    at /home/julien/compile-libreoffice/libreoffice/sc/source/ui/docshell/docfunc.cxx:635
#13 0x00002aaacf01d5f7 in ScViewFunc::DeleteContents (this=0x30a2378, nFlags=..., bRecord=true)
    at /home/julien/compile-libreoffice/libreoffice/sc/source/ui/view/viewfunc.cxx:1788
Comment 15 Julien Nabet 2014-12-12 23:47:16 UTC
Eike: I noticed the broadcast part in the bt, but it is in svl. Any idea?
Comment 16 Jim Avera 2014-12-13 07:53:07 UTC
Running with valgrind discloses a couple of problems, both of which I think indicate real bugs:

1. Valgrind reports some 8-byte reads which extend 4 bytes beyond the end of malloc'd space.  This would cause unpredictable garbage to be returned in the last 4 bytes, and might possibly explain some Basic macro bugs I've been chasing which vanish when print statements are put in.   The 8 vs. 4 byte lengths might indicate a bug in some low-level casts related to platform word-size.  

This was previously reported but rejected as "NOTABUG" because the offending code is somewhere in Python bindings.  I don't see how it can not be a bug.

Please see https://bugs.freedesktop.org/show_bug.cgi?id=78513

2. Today I re-ran the test for the present bug under Valgrind using the option --free-fill=DE, which causes all free'd heap blocks to be filled with 0xDE values.  Doing this made the demo behave very differently -- I got several Basic errors I never saw before.   The implication is that LO is referencing memory *after* it has been freed back to the heap. 


In summary, I strongly recommend running under valgrind and tracking down and removing all references to undefined (i.e., not currently allocated) memory.
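
To illustrate why --free-fill=DE can change program behavior, here is a toy model in Python. It is only a sketch under stated assumptions: `ToyHeap` and `stale_read` are hypothetical names, and the model says nothing about how glibc, valgrind, or LO actually manage memory. It only shows that a read through a stale pointer returns the old value when freed bytes are left alone, and 0xDEDEDEDE when they are overwritten.

```python
import struct

FILL = 0xDE  # the byte given to valgrind via --free-fill=DE


class ToyHeap:
    """Toy heap backed by a bytearray; not a model of any real allocator."""

    def __init__(self, size: int, free_fill: bool) -> None:
        self.mem = bytearray(size)
        self.free_fill = free_fill

    def store_u32(self, offset: int, value: int) -> None:
        struct.pack_into("<I", self.mem, offset, value)

    def load_u32(self, offset: int) -> int:
        return struct.unpack_from("<I", self.mem, offset)[0]

    def free(self, offset: int, length: int) -> None:
        # Without free-fill the old bytes stay in place, so a stale
        # pointer keeps "working" by accident.
        if self.free_fill:
            self.mem[offset:offset + length] = bytes([FILL]) * length


def stale_read(free_fill: bool) -> int:
    heap = ToyHeap(16, free_fill)
    heap.store_u32(0, 42)    # the object is alive and holds 42
    heap.free(0, 4)          # the object is freed...
    return heap.load_u32(0)  # ...and then read through a stale pointer
```

In this model `stale_read(False)` still yields the old value, which is how a use-after-free can go unnoticed, while `stale_read(True)` yields a recognizable garbage pattern that makes downstream code fail visibly.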
Comment 17 Jim Avera 2014-12-13 08:21:15 UTC
I've got a cleaner valgrind run, and got the attached trace.  The main thing of interest is:

Conditional jump or move depends on uninitialised value(s)
==12738==    at 0x158DEF48: ??? (in /usr/lib/libreoffice/program/libvclplug_gtklo.so)


To reproduce: 
 1. Copy <instdir>/program/soffice to soffice_patched in the same directory.
 2. At line 107 or thereabouts of soffice_patched, add --track-origins=yes
    --free-fill=DE to the valgrind command line, so it looks like this:

    VALGRINDCHECK="valgrind --tool=$VALGRIND --trace-children=yes $valgrind_skip --num-callers=50 --error-limit=no --track-origins=yes --free-fill=DE"

  3. Download the tickers.csv and j2.ods demo files from this bug
  4. <instdir>/program/soffice_patched --valgrind j2.ods >log 2>&1
  5. (patience...) Click the yellow button.   Repeat when everything stops.
     Hopefully it will segfault, but if not a valgrind error will be reported.

     If a pop-up says a copy/paste did not work, click 'Retry' (this occurs when some Basic macro code thinks an operation it just performed had no effect -- which might or might not be an LO bug, but in any case this does not usually happen when not running under valgrind)

After a long while, look in the log file for valgrind errors
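
Since the log can get large, a small script can help pick out the error headlines. The following is a minimal Python sketch; the function name `summarize_valgrind_log` is made up for illustration, and the pattern covers only a few common Memcheck messages, not every error valgrind can report.

```python
import re
from collections import Counter
from typing import Iterable

# Valgrind prefixes each line with ==PID==. These are a few Memcheck error
# headlines; the list is deliberately not exhaustive.
ERROR_RE = re.compile(
    r"^==\d+==\s+("
    r"Conditional jump or move depends on uninitialised value\(s\)"
    r"|Use of uninitialised value of size \d+"
    r"|Invalid read of size \d+"
    r"|Invalid write of size \d+"
    r"|Invalid free\(\).*"
    r")"
)


def summarize_valgrind_log(lines: Iterable[str]) -> Counter:
    """Count how often each recognized error headline appears in the log."""
    counts: Counter = Counter()
    for line in lines:
        match = ERROR_RE.match(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts
```

For example, `summarize_valgrind_log(open("log", errors="replace")).most_common()` would list the recognized messages in the `log` file from step 4, most frequent first.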
Comment 18 Jim Avera 2014-12-13 08:22:39 UTC
Created attachment 110805 [details]
valgrind log showing conditional jump depending on uninit data
Comment 19 Jim Avera 2014-12-13 08:23:46 UTC
Created attachment 110806 [details]
valgrind log showing conditional jump depending on uninit data
Comment 20 Julien Nabet 2014-12-13 15:54:58 UTC
Created attachment 110823 [details]
Valgrind trace with master sources

On a Debian x86-64 PC with master sources updated today, I retrieved a Valgrind trace with symbols.
Comment 21 Robinson Tryon (qubit) 2015-12-18 10:49:50 UTC Comment hidden (obsolete)
Comment 22 Julien Nabet 2016-06-12 14:36:08 UTC
On a Debian x86-64 PC with LO Debian package 5.1.4.1 (RC1), I can no longer reproduce this.

Jim: could you give it a new try with a recent LO version (the latest stable one is 5.1.3)?
Comment 23 Jim Avera 2016-06-14 00:23:28 UTC
The 2nd demo (from comment #9) does not crash with 5.1.3.2

However there are some strange warnings written to the terminal about "Icons too large" and "Unknown event notification".   I'll try on Master and file a separate bug report if they persist.

Thanks for following up.  This bug report can be closed.
Comment 24 Xisco Faulí 2016-09-19 16:48:04 UTC Comment hidden (obsolete)