Bug 122643 - Crash in: libc-2.27.so after setting Named Ranges to e.g. F:F (non-absolute columns)
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 6.2.0.1 rc
Hardware: All All
Importance: highest critical
Assignee: Luboš Luňák
URL:
Whiteboard: target:6.3.0 target:6.2.2
Keywords: bibisected, bisected, regression
Depends on:
Blocks:
 
Reported: 2019-01-10 18:54 UTC by Jim Avera
Modified: 2019-03-14 11:12 UTC

See Also:
Crash report or crash signature: ["libc-2.27.so"]


Attachments
Spreadsheet which causes the crash (OT.ods) (62.44 KB, application/vnd.oasis.opendocument.spreadsheet)
2019-01-10 18:54 UTC, Jim Avera
valgrind output (14.17 KB, text/plain)
2019-01-16 10:25 UTC, Xavier Van Wijmeersch
bt with debug symbols (3.25 KB, text/plain)
2019-01-17 09:20 UTC, Julien Nabet
Valgrind trace (4.39 KB, application/x-bzip)
2019-01-17 09:42 UTC, Julien Nabet
Valgrind trace with VALGRIND_OPTS=--fair-sched=yes (3.82 KB, application/gzip)
2019-01-17 19:26 UTC, Julien Nabet

Description Jim Avera 2019-01-10 18:54:39 UTC
Created attachment 148226
Spreadsheet which causes the crash (OT.ods)

This bug was filed from the crash reporting server and is br-9afd5496-22fd-4080-b6a2-c57c7afe989c.
=========================================

Crashes immediately when LO is started on the attached spreadsheet.

Previously, this spreadsheet had many Named Ranges of the form 
   Name => $F:$F
that is, defining Name to be column F in this example.  

However, if a preceding column was deleted (for example, column C), the named ranges broke because they were not adjusted (in this example, Name was not adjusted to $E:$E). I tried to fix this by removing the "$" prefix from all the named range definitions, so that they were of the form
   Name => F:F
and then saved the spreadsheet.

Afterwards, any attempt to open the spreadsheet caused an immediate crash. 
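
The difference between $F:$F and F:F described above can be sketched as follows. This is only a minimal illustration, assuming one common spreadsheet model in which a relative reference is stored as an offset from the cell that defines or uses it. The names ColumnRef and resolve_column are hypothetical and are not LibreOffice's data structures.

```cpp
#include <cassert>

// Hypothetical model of a column reference; not LibreOffice's data structure.
struct ColumnRef {
    bool absolute;  // true for "$F", false for "F"
    int value;      // absolute: column index (A = 0)
                    // relative: offset from the column of the cell using it
};

// Resolve the reference for a formula located in column `use_col`.
int resolve_column(const ColumnRef& ref, int use_col) {
    assert(use_col >= 0);  // column indices are zero-based
    return ref.absolute ? ref.value : use_col + ref.value;
}
```

Under this assumed model, an absolute reference to F (index 5) resolves to F from any cell, while a relative reference defined from column A with offset 5 resolves to F only when used in column A, and to H when used in column C.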

STEPS TO REPRODUCE:
1. Open attached spreadsheet

RESULTS: Immediate crash, with a message saying LO was saving files, but no files were listed.
Comment 1 Xavier Van Wijmeersch 2019-01-10 20:05:01 UTC
I have this gdbtrace, don't know if it will be helpful 

[Detaching after fork from child process 9583]
[New Thread 0x7fffe434d700 (LWP 9586)]
[New Thread 0x7fffe3b4c700 (LWP 9587)]
[Thread 0x7fffe434d700 (LWP 9586) exited]
[New Thread 0x7fffe434d700 (LWP 9598)]
[Thread 0x7fffe434d700 (LWP 9598) exited]
[New Thread 0x7fffe434d700 (LWP 9600)]
[New Thread 0x7fffe1dee700 (LWP 9601)]
[New Thread 0x7fffdf826700 (LWP 9602)]
[New Thread 0x7fffdeec1700 (LWP 9603)]
[Thread 0x7fffdeec1700 (LWP 9603) exited]
[New Thread 0x7fffdeec1700 (LWP 9604)]
[Thread 0x7fffdeec1700 (LWP 9604) exited]
[New Thread 0x7fffdeec1700 (LWP 9605)]
[Thread 0x7fffdeec1700 (LWP 9605) exited]
[New Thread 0x7fffdeec1700 (LWP 9606)]
[Thread 0x7fffdeec1700 (LWP 9606) exited]
[New Thread 0x7fffdeec1700 (LWP 9607)]
[Thread 0x7fffdeec1700 (LWP 9607) exited]
[New Thread 0x7fffdeec1700 (LWP 9608)]
[New Thread 0x7fffde661700 (LWP 9609)]
[Thread 0x7fffdeec1700 (LWP 9608) exited]
[Thread 0x7fffde661700 (LWP 9609) exited]
[Thread 0x7fffdf826700 (LWP 9602) exited]
[New Thread 0x7fffdf826700 (LWP 9610)]
[New Thread 0x7fffde661700 (LWP 9611)]
[Thread 0x7fffde661700 (LWP 9611) exited]
[Thread 0x7fffdf826700 (LWP 9610) exited]
[New Thread 0x7fffdf826700 (LWP 9612)]

Thread 1 "soffice.bin" received signal SIGABRT, Aborted.
0x00007ffff7ccf75b in raise () from /lib64/libc.so.6
#0  0x00007ffff7ccf75b in raise () at /lib64/libc.so.6
#1  0x00007ffff7cb1524 in abort () at /lib64/libc.so.6
#2  0x00007ffff426556e in SalUserEventList::DispatchUserEvents(bool) (this=0x14e4f00, bHandleAllCurrentEvents=<optimized out>) at /home/libreoffice/vcl/source/app/salusereventlist.cxx:125
#3  0x00007ffff7ea199d in KDEXLib::processYield(bool, bool) (this=this@entry=0x537970, bWait=bWait@entry=true, bHandleAllCurrentEvents=bHandleAllCurrentEvents@entry=false) at /home/libreoffice/vcl/unx/kde4/KDESalDisplay.hxx:42
#4  0x00007ffff7ea36cf in KDEXLib::Yield(bool, bool) (this=0x537970, bWait=<optimized out>, bHandleAllCurrentEvents=<optimized out>) at /home/libreoffice/vcl/unx/kde4/KDEXLib.cxx:289
#5  0x00007ffff45c4cf2 in ImplYield(bool, bool) (i_bWait=i_bWait@entry=true, i_bAllEvents=i_bAllEvents@entry=false) at /home/libreoffice/vcl/source/app/svapp.cxx:439
#6  0x00007ffff45c534c in Application::Yield() () at /home/libreoffice/vcl/source/app/svapp.cxx:503
#7  0x00007ffff45c6a85 in Application::Execute() () at /home/libreoffice/vcl/source/app/svapp.cxx:420
#8  0x00007ffff7ee1de3 in desktop::Desktop::Main() (this=0x7fffffffdc60) at /home/libreoffice/desktop/source/app/app.cxx:1637
#9  0x00007ffff45ccf46 in ImplSVMain() () at /home/libreoffice/vcl/source/app/svmain.cxx:199
#10 0x00007ffff7f08cf1 in soffice_main() () at /home/libreoffice/desktop/source/app/sofficemain.cxx:169
#11 0x000000000040107b in sal_main () at /home/libreoffice/desktop/source/app/main.c:48
#12 0x000000000040107b in main (argc=<optimized out>, argv=<optimized out>) at /home/libreoffice/desktop/source/app/main.c:47

Thread 17 (Thread 0x7fffdf826700 (LWP 9612)):
#0  0x00007ffff7c7c567 in pthread_cond_timedwait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1  0x00007ffff7fa2718 in osl_waitCondition(oslCondition, TimeValue const*) (Condition=0x232d540, pTimeout=pTimeout@entry=0x7fffdf825a80) at /home/libreoffice/sal/osl/unx/conditn.cxx:200
#2  0x00007fffe470fe4d in osl::Condition::wait(TimeValue const*) (this=<optimized out>, pTimeout=0x7fffdf825a80) at /home/libreoffice/include/osl/conditn.hxx:123
#3  0x00007fffe470fe4d in osl::Condition::wait(TimeValue const&) (timeout=..., this=<optimized out>) at /home/libreoffice/include/osl/conditn.hxx:123
#4  0x00007fffe470fe4d in configmgr::Components::WriteThread::execute() (this=0x232ad00) at /home/libreoffice/configmgr/source/components.cxx:183
#5  0x00007ffff6cebaa6 in salhelper::Thread::run() (this=0x232ad00) at /home/libreoffice/salhelper/source/thread.cxx:40
#6  0x00007ffff6cebc3a in osl::threadFunc(void*) (param=0x232ad10) at /home/libreoffice/include/osl/thread.hxx:185
#7  0x00007ffff7fae438 in osl_thread_start_Impl(void*) (pData=0x232af00) at /home/libreoffice/sal/osl/unx/thread.cxx:235
#8  0x00007ffff7c7617e in start_thread () at /lib64/libpthread.so.0
#9  0x00007ffff7da585f in clone () at /lib64/libc.so.6

Thread 6 (Thread 0x7fffe1dee700 (LWP 9601)):
#0  0x00007ffff7d99aa9 in poll () at /lib64/libc.so.6
#1  0x00007fffeb2bdd53 in x11::SelectionManager::dispatchEvent(int) (this=0x194bb20, millisec=-1) at /home/libreoffice/vcl/unx/generic/dtrans/X11_selection.cxx:3608
#2  0x00007fffeb2bdf9c in x11::SelectionManager::run(void*) (pThis=0x194bb20) at /home/libreoffice/vcl/unx/generic/dtrans/X11_selection.cxx:3645
#3  0x00007ffff7fae438 in osl_thread_start_Impl(void*) (pData=0x195e950) at /home/libreoffice/sal/osl/unx/thread.cxx:235
#4  0x00007ffff7c7617e in start_thread () at /lib64/libpthread.so.0
#5  0x00007ffff7da585f in clone () at /lib64/libc.so.6

Thread 5 (Thread 0x7fffe434d700 (LWP 9600)):
#0  0x00007ffff7d99aa9 in poll () at /lib64/libc.so.6
#1  0x00007fffeb2a1429 in ICEConnectionWorker(void*) (data=0x17ee820) at /home/libreoffice/vcl/unx/generic/app/sm.cxx:728
#2  0x00007ffff7fae438 in osl_thread_start_Impl(void*) (pData=0x180f7c0) at /home/libreoffice/sal/osl/unx/thread.cxx:235
#3  0x00007ffff7c7617e in start_thread () at /lib64/libpthread.so.0
#4  0x00007ffff7da585f in clone () at /lib64/libc.so.6

Thread 3 (Thread 0x7fffe3b4c700 (LWP 9587)):
#0  0x00007ffff7da67c7 in accept () at /lib64/libc.so.6
#1  0x00007ffff7fa8e40 in osl_acceptPipe(oslPipe) (pPipe=0x14da460) at /home/libreoffice/sal/osl/unx/pipe.cxx:416
#2  0x00007ffff7f04232 in osl::Pipe::accept(osl::StreamPipe&) (Connection=..., this=0x14d9a10) at /home/libreoffice/include/osl/pipe.hxx:151
#3  0x00007ffff7f04232 in desktop::PipeIpcThread::execute() (this=0x14d99e0) at /home/libreoffice/desktop/source/app/officeipcthread.cxx:1147
#4  0x00007ffff6cebaa6 in salhelper::Thread::run() (this=0x14d99e0) at /home/libreoffice/salhelper/source/thread.cxx:40
#5  0x00007ffff6cebc3a in osl::threadFunc(void*) (param=0x14d99f0) at /home/libreoffice/include/osl/thread.hxx:185
#6  0x00007ffff7fae438 in osl_thread_start_Impl(void*) (pData=0x14d98a0) at /home/libreoffice/sal/osl/unx/thread.cxx:235
#7  0x00007ffff7c7617e in start_thread () at /lib64/libpthread.so.0
#8  0x00007ffff7da585f in clone () at /lib64/libc.so.6

Thread 1 (Thread 0x7fffed85db40 (LWP 9573)):
#0  0x00007ffff7ccf75b in raise () at /lib64/libc.so.6
#1  0x00007ffff7cb1524 in abort () at /lib64/libc.so.6
#2  0x00007ffff426556e in SalUserEventList::DispatchUserEvents(bool) (this=0x14e4f00, bHandleAllCurrentEvents=<optimized out>) at /home/libreoffice/vcl/source/app/salusereventlist.cxx:125
#3  0x00007ffff7ea199d in KDEXLib::processYield(bool, bool) (this=this@entry=0x537970, bWait=bWait@entry=true, bHandleAllCurrentEvents=bHandleAllCurrentEvents@entry=false) at /home/libreoffice/vcl/unx/kde4/KDESalDisplay.hxx:42
#4  0x00007ffff7ea36cf in KDEXLib::Yield(bool, bool) (this=0x537970, bWait=<optimized out>, bHandleAllCurrentEvents=<optimized out>) at /home/libreoffice/vcl/unx/kde4/KDEXLib.cxx:289
#5  0x00007ffff45c4cf2 in ImplYield(bool, bool) (i_bWait=i_bWait@entry=true, i_bAllEvents=i_bAllEvents@entry=false) at /home/libreoffice/vcl/source/app/svapp.cxx:439
#6  0x00007ffff45c534c in Application::Yield() () at /home/libreoffice/vcl/source/app/svapp.cxx:503
#7  0x00007ffff45c6a85 in Application::Execute() () at /home/libreoffice/vcl/source/app/svapp.cxx:420
#8  0x00007ffff7ee1de3 in desktop::Desktop::Main() (this=0x7fffffffdc60) at /home/libreoffice/desktop/source/app/app.cxx:1637
#9  0x00007ffff45ccf46 in ImplSVMain() () at /home/libreoffice/vcl/source/app/svmain.cxx:199
#10 0x00007ffff7f08cf1 in soffice_main() () at /home/libreoffice/desktop/source/app/sofficemain.cxx:169
#11 0x000000000040107b in sal_main () at /home/libreoffice/desktop/source/app/main.c:48
#12 0x000000000040107b in main (argc=<optimized out>, argv=<optimized out>) at /home/libreoffice/desktop/source/app/main.c:47
A debugging session is active.

	Inferior 1 [process 9573] will be killed.
Comment 2 raal 2019-01-10 20:19:22 UTC
This seems to have begun at the below commit.
Adding Cc: to Luboš Luňák. Could you possibly take a look at this one?
Thanks

a79e88101e6ecc177dd9de08f7c5fa0fd4a9843b is the first bad commit
commit a79e88101e6ecc177dd9de08f7c5fa0fd4a9843b
Author: Jenkins Build User <tdf@pollux.tdf>
Date:   Wed Oct 10 13:22:44 2018 +0200

    source sha:79449d73900d7a9bf061244d76f5f8eecc441198

author	Luboš Luňák <l.lunak@collabora.com>	2018-10-01 14:26:57 +0200
committer	Luboš Luňák <l.lunak@collabora.com>	2018-10-10 13:01:59 +0200
commit	79449d73900d7a9bf061244d76f5f8eecc441198 (patch)
tree	e85f9bc29941cbf5e5ccb858ee4703ae67d00810
parent	b1721b04d8a921a69230927cd7995d8c5d8f5fe2 (diff)
make VLOOKUP in Calc thread-safe
Comment 3 Xisco Faulí 2019-01-11 13:06:56 UTC
I get this error:

multi_type_vector::position#1570: block position not found! ( logical pos= 1048580, block size=4, logical size=1048576)

in

Version: 6.2.0.1
Build ID: 0412ee99e862f384c1106d0841a950c4cfaa9df1
CPU threads: 1; OS: Windows 6.1; UI render: default; VCL: win; 
Locale: es-ES (es_ES); UI-Language: es-ES
Calc: threaded
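
The numbers in the error message can be read directly: the requested logical position (1048580) lies 4 past the container's logical size (1048576, which is 2^20). As a minimal sketch, assuming a hypothetical ColumnStore type that is not the mdds API, the following shows how such a lookup throws std::out_of_range and how a caller-side range check avoids the exception. It is not the patch that was committed for this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a block-based column store; NOT the mdds API.
struct ColumnStore {
    std::size_t logical_size;

    // Mimics a lookup that throws when the position is past the end.
    std::size_t position(std::size_t pos) const {
        if (pos >= logical_size) {
            throw std::out_of_range(
                "block position not found! (logical pos=" + std::to_string(pos) +
                ", logical size=" + std::to_string(logical_size) + ")");
        }
        return pos;
    }
};

// Caller-side guard: validate the position first so no exception can escape.
// Returns false (leaving `out` untouched) when `pos` is out of range.
bool try_position(const ColumnStore& store, std::size_t pos, std::size_t& out) {
    if (pos >= store.logical_size) {
        return false;
    }
    out = store.position(pos);
    assert(out < store.logical_size);  // holds because of the check above
    return true;
}
```

With a store of logical size 1048576, try_position returns false for position 1048580 instead of throwing, while a direct call to position with the same argument throws std::out_of_range.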
Comment 4 Michael Meeks 2019-01-15 09:40:20 UTC
I wonder if we're doing name expansion properly for pre-computing dependencies. Then again - if we're not - surely we'd get a helpful assertion during calculation in a dbgutil build (?). Possibly specific to relative named ranges (which are quite a cute feature ;-). Anyhow - hopefully some useful random speculation.

I imagine a valgrind trace of a dbgutil / symbols build would get closer to the memory corruption going on here rather than gdb.

Thanks !
Comment 5 Xavier Van Wijmeersch 2019-01-16 10:25:05 UTC
Created attachment 148356
valgrind output

I did a valgrind test, maybe it will help
Comment 6 Michael Meeks 2019-01-16 17:07:58 UTC
Hi Xavier - thanks for that - looks like you were tracing bash =) (?) Can you try:

soffice --valgrind

which should do this right. Thanks !
Comment 7 Julien Nabet 2019-01-17 09:20:04 UTC
Created attachment 148390
bt with debug symbols

On pc Debian x86-64 with master sources updated yesterday, I reproduced this.
Indeed bt doesn't help but there are some console logs too.

I'll attach Valgrind trace soon.
Comment 8 Julien Nabet 2019-01-17 09:42:40 UTC
Created attachment 148392
Valgrind trace

Here's the Valgrind trace but I didn't see anything interesting since there are only errors concerning dlopen function.
(unless I missed it?)
Comment 9 Michael Meeks 2019-01-17 10:23:02 UTC
Thanks Julien; as you say - all false positives from glibc internals assuming knowledge of their own allocator (which is fine).

The DispatchUserEvents thing really looks like memory corruption though:

warn:vcl:11210:11210:vcl/source/app/salusereventlist.cxx:120: Uncaught St12out_of_range multi_type_vector::position#1570: block position not found! (logical pos=1048580, block size=4, logical size=1048576)

I would assume that in fact under valgrind we don't get a crash either ;-) which is annoying - presumably it perturbs threading so the issue doesn't happen.

What might help is getting another valgrind trace when run with:

--fair-sched=yes

It is also possible that we need to substantially reduce this eg:

coregrind/m_scheduler/scheduler.c:#define SCHEDULING_QUANTUM   1200 // 100000

To get a better approximation of fast context switching to catch the race. It is somewhat odd that this isn't a cmd-line option for valgrind particularly since there is no performance benefit of it not being configurable; possibly an easy-hack for valgrind lurks there =)
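
For readers wondering about the "St12out_of_range" token in the warning quoted above: it is consistent with the mangled type name that typeid reports for std::out_of_range on GCC and Clang. The sketch below is an illustration only. The functions describe_uncaught, handle_event and dispatch_once are hypothetical and do not reproduce the code in salusereventlist.cxx; they just show how a dispatcher that catches std::exception could produce a log line of that shape.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Builds a log line similar in shape to the warning quoted above:
// "Uncaught <type name> <what()>".
std::string describe_uncaught(const std::exception& e) {
    return std::string("Uncaught ") + typeid(e).name() + " " + e.what();
}

// Hypothetical event handler standing in for the failing lookup.
void handle_event() {
    throw std::out_of_range("block position not found!");
}

// Runs one handler and returns the log line, or an empty string if the
// handler did not throw.
std::string dispatch_once(void (*handler)()) {
    assert(handler != nullptr);
    try {
        handler();
    } catch (const std::exception& e) {
        return describe_uncaught(e);
    }
    return std::string();
}
```

The exact string returned by typeid(e).name() is implementation-defined, so other compilers may print a different spelling of the type. In the backtrace from Comment 1, abort() is called from SalUserEventList::DispatchUserEvents, which would explain why the process ends with SIGABRT rather than a segmentation fault.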
Comment 10 Julien Nabet 2019-01-17 19:26:20 UTC
Created attachment 148407
Valgrind trace with VALGRIND_OPTS=--fair-sched=yes

Here is another Valgrind trace.
I typed this:
export VALGRIND_OPTS=--fair-sched=yes
then
./soffice --norestore --nologo --valgrind /tmp/OT.ods  >& /tmp/valgrind.log

I must admit I didn't understand this part:
coregrind/m_scheduler/scheduler.c:#define SCHEDULING_QUANTUM   1200 // 100000
Comment 11 Michael Meeks 2019-01-17 21:15:39 UTC
Interesting; I guess then that it is not a heap corruption - perhaps a stack one: thanks for the trace ! =)
Comment 12 Luboš Luňák 2019-02-25 14:39:30 UTC
https://gerrit.libreoffice.org/#/c/68349/
Comment 13 Commit Notification 2019-02-28 15:53:44 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/beba45a5639bc32ca6893885ca3b1f07e3175c08%5E%21

avoid std::out_of_range thrown by mdds (tdf#122643)

It will be available in 6.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 14 Michael Meeks 2019-02-28 15:54:37 UTC
Thanks Lubos ! =)
Comment 15 Commit Notification 2019-03-01 11:32:39 UTC
Luboš Luňák committed a patch related to this issue.
It has been pushed to "libreoffice-6-2":

https://git.libreoffice.org/core/+/b21d76201e8355207721cc442fc4f204f3061a76%5E%21

avoid std::out_of_range thrown by mdds (tdf#122643)

It will be available in 6.2.2.

Comment 16 Jim Avera 2019-03-01 17:52:30 UTC
Confirming the crash no longer happens with

Version: 6.3.0.0.alpha0+
Build ID: f23738139429358c11fa62708fbdf5bb0c43d199
CPU threads: 12; OS: Linux 4.18; UI render: default; VCL: gtk3; 
TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2019-02-28_20:14:57
Locale: en-US (en_US.UTF-8); UI-Language: en-US
Calc: threaded

Thanks!