Created attachment 148226 [details]
Spreadsheet which causes the crash (OT.ods)

This bug was filed from the crash reporting server and is br-9afd5496-22fd-4080-b6a2-c57c7afe989c.
=========================================

LO crashes immediately when it is started on the attached spreadsheet.

Previously, this spreadsheet had many Named Ranges of the form
  Name => $F:$F
that is, defining Name to be column F in this example. However, if a preceding column was deleted (for example column C), the Named Ranges were broken because they were not adjusted (in this example, Name was not adjusted to be $E:$E).

I tried to fix this problem by removing the "$" prefix from all the named range definitions, so that they were of the form
  Name => F:F
and then saved the spreadsheet. Afterwards, any attempt to open the spreadsheet caused an immediate crash.

STEPS TO REPRODUCE:
1. Open the attached spreadsheet

RESULTS:
Immediate crash, with a message saying LO was saving files, but no files were listed.
I have this gdbtrace, don't know if it will be helpful

[Detaching after fork from child process 9583]
[New Thread 0x7fffe434d700 (LWP 9586)]
[New Thread 0x7fffe3b4c700 (LWP 9587)]
[Thread 0x7fffe434d700 (LWP 9586) exited]
[New Thread 0x7fffe434d700 (LWP 9598)]
[Thread 0x7fffe434d700 (LWP 9598) exited]
[New Thread 0x7fffe434d700 (LWP 9600)]
[New Thread 0x7fffe1dee700 (LWP 9601)]
[New Thread 0x7fffdf826700 (LWP 9602)]
[New Thread 0x7fffdeec1700 (LWP 9603)]
[Thread 0x7fffdeec1700 (LWP 9603) exited]
[New Thread 0x7fffdeec1700 (LWP 9604)]
[Thread 0x7fffdeec1700 (LWP 9604) exited]
[New Thread 0x7fffdeec1700 (LWP 9605)]
[Thread 0x7fffdeec1700 (LWP 9605) exited]
[New Thread 0x7fffdeec1700 (LWP 9606)]
[Thread 0x7fffdeec1700 (LWP 9606) exited]
[New Thread 0x7fffdeec1700 (LWP 9607)]
[Thread 0x7fffdeec1700 (LWP 9607) exited]
[New Thread 0x7fffdeec1700 (LWP 9608)]
[New Thread 0x7fffde661700 (LWP 9609)]
[Thread 0x7fffdeec1700 (LWP 9608) exited]
[Thread 0x7fffde661700 (LWP 9609) exited]
[Thread 0x7fffdf826700 (LWP 9602) exited]
[New Thread 0x7fffdf826700 (LWP 9610)]
[New Thread 0x7fffde661700 (LWP 9611)]
[Thread 0x7fffde661700 (LWP 9611) exited]
[Thread 0x7fffdf826700 (LWP 9610) exited]
[New Thread 0x7fffdf826700 (LWP 9612)]

Thread 1 "soffice.bin" received signal SIGABRT, Aborted.
0x00007ffff7ccf75b in raise () from /lib64/libc.so.6
#0 0x00007ffff7ccf75b in raise () at /lib64/libc.so.6
#1 0x00007ffff7cb1524 in abort () at /lib64/libc.so.6
#2 0x00007ffff426556e in SalUserEventList::DispatchUserEvents(bool) (this=0x14e4f00, bHandleAllCurrentEvents=<optimized out>) at /home/libreoffice/vcl/source/app/salusereventlist.cxx:125
#3 0x00007ffff7ea199d in KDEXLib::processYield(bool, bool) (this=this@entry=0x537970, bWait=bWait@entry=true, bHandleAllCurrentEvents=bHandleAllCurrentEvents@entry=false) at /home/libreoffice/vcl/unx/kde4/KDESalDisplay.hxx:42
#4 0x00007ffff7ea36cf in KDEXLib::Yield(bool, bool) (this=0x537970, bWait=<optimized out>, bHandleAllCurrentEvents=<optimized out>) at /home/libreoffice/vcl/unx/kde4/KDEXLib.cxx:289
#5 0x00007ffff45c4cf2 in ImplYield(bool, bool) (i_bWait=i_bWait@entry=true, i_bAllEvents=i_bAllEvents@entry=false) at /home/libreoffice/vcl/source/app/svapp.cxx:439
#6 0x00007ffff45c534c in Application::Yield() () at /home/libreoffice/vcl/source/app/svapp.cxx:503
#7 0x00007ffff45c6a85 in Application::Execute() () at /home/libreoffice/vcl/source/app/svapp.cxx:420
#8 0x00007ffff7ee1de3 in desktop::Desktop::Main() (this=0x7fffffffdc60) at /home/libreoffice/desktop/source/app/app.cxx:1637
#9 0x00007ffff45ccf46 in ImplSVMain() () at /home/libreoffice/vcl/source/app/svmain.cxx:199
#10 0x00007ffff7f08cf1 in soffice_main() () at /home/libreoffice/desktop/source/app/sofficemain.cxx:169
#11 0x000000000040107b in sal_main () at /home/libreoffice/desktop/source/app/main.c:48
#12 0x000000000040107b in main (argc=<optimized out>, argv=<optimized out>) at /home/libreoffice/desktop/source/app/main.c:47

Thread 17 (Thread 0x7fffdf826700 (LWP 9612)):
#0 0x00007ffff7c7c567 in pthread_cond_timedwait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x00007ffff7fa2718 in osl_waitCondition(oslCondition, TimeValue const*) (Condition=0x232d540, pTimeout=pTimeout@entry=0x7fffdf825a80) at /home/libreoffice/sal/osl/unx/conditn.cxx:200
#2 0x00007fffe470fe4d in osl::Condition::wait(TimeValue const*) (this=<optimized out>, pTimeout=0x7fffdf825a80) at /home/libreoffice/include/osl/conditn.hxx:123
#3 0x00007fffe470fe4d in osl::Condition::wait(TimeValue const&) (timeout=..., this=<optimized out>) at /home/libreoffice/include/osl/conditn.hxx:123
#4 0x00007fffe470fe4d in configmgr::Components::WriteThread::execute() (this=0x232ad00) at /home/libreoffice/configmgr/source/components.cxx:183
#5 0x00007ffff6cebaa6 in salhelper::Thread::run() (this=0x232ad00) at /home/libreoffice/salhelper/source/thread.cxx:40
#6 0x00007ffff6cebc3a in osl::threadFunc(void*) (param=0x232ad10) at /home/libreoffice/include/osl/thread.hxx:185
#7 0x00007ffff7fae438 in osl_thread_start_Impl(void*) (pData=0x232af00) at /home/libreoffice/sal/osl/unx/thread.cxx:235
#8 0x00007ffff7c7617e in start_thread () at /lib64/libpthread.so.0
#9 0x00007ffff7da585f in clone () at /lib64/libc.so.6

Thread 6 (Thread 0x7fffe1dee700 (LWP 9601)):
#0 0x00007ffff7d99aa9 in poll () at /lib64/libc.so.6
#1 0x00007fffeb2bdd53 in x11::SelectionManager::dispatchEvent(int) (this=0x194bb20, millisec=-1) at /home/libreoffice/vcl/unx/generic/dtrans/X11_selection.cxx:3608
#2 0x00007fffeb2bdf9c in x11::SelectionManager::run(void*) (pThis=0x194bb20) at /home/libreoffice/vcl/unx/generic/dtrans/X11_selection.cxx:3645
#3 0x00007ffff7fae438 in osl_thread_start_Impl(void*) (pData=0x195e950) at /home/libreoffice/sal/osl/unx/thread.cxx:235
#4 0x00007ffff7c7617e in start_thread () at /lib64/libpthread.so.0
#5 0x00007ffff7da585f in clone () at /lib64/libc.so.6

Thread 5 (Thread 0x7fffe434d700 (LWP 9600)):
#0 0x00007ffff7d99aa9 in poll () at /lib64/libc.so.6
#1 0x00007fffeb2a1429 in ICEConnectionWorker(void*) (data=0x17ee820) at /home/libreoffice/vcl/unx/generic/app/sm.cxx:728
#2 0x00007ffff7fae438 in osl_thread_start_Impl(void*) (pData=0x180f7c0) at /home/libreoffice/sal/osl/unx/thread.cxx:235
#3 0x00007ffff7c7617e in start_thread () at /lib64/libpthread.so.0
#4 0x00007ffff7da585f in clone () at /lib64/libc.so.6

Thread 3 (Thread 0x7fffe3b4c700 (LWP 9587)):
#0 0x00007ffff7da67c7 in accept () at /lib64/libc.so.6
#1 0x00007ffff7fa8e40 in osl_acceptPipe(oslPipe) (pPipe=0x14da460) at /home/libreoffice/sal/osl/unx/pipe.cxx:416
#2 0x00007ffff7f04232 in osl::Pipe::accept(osl::StreamPipe&) (Connection=..., this=0x14d9a10) at /home/libreoffice/include/osl/pipe.hxx:151
#3 0x00007ffff7f04232 in desktop::PipeIpcThread::execute() (this=0x14d99e0) at /home/libreoffice/desktop/source/app/officeipcthread.cxx:1147
#4 0x00007ffff6cebaa6 in salhelper::Thread::run() (this=0x14d99e0) at /home/libreoffice/salhelper/source/thread.cxx:40
#5 0x00007ffff6cebc3a in osl::threadFunc(void*) (param=0x14d99f0) at /home/libreoffice/include/osl/thread.hxx:185
#6 0x00007ffff7fae438 in osl_thread_start_Impl(void*) (pData=0x14d98a0) at /home/libreoffice/sal/osl/unx/thread.cxx:235
#7 0x00007ffff7c7617e in start_thread () at /lib64/libpthread.so.0
#8 0x00007ffff7da585f in clone () at /lib64/libc.so.6

Thread 1 (Thread 0x7fffed85db40 (LWP 9573)):
#0 0x00007ffff7ccf75b in raise () at /lib64/libc.so.6
#1 0x00007ffff7cb1524 in abort () at /lib64/libc.so.6
#2 0x00007ffff426556e in SalUserEventList::DispatchUserEvents(bool) (this=0x14e4f00, bHandleAllCurrentEvents=<optimized out>) at /home/libreoffice/vcl/source/app/salusereventlist.cxx:125
#3 0x00007ffff7ea199d in KDEXLib::processYield(bool, bool) (this=this@entry=0x537970, bWait=bWait@entry=true, bHandleAllCurrentEvents=bHandleAllCurrentEvents@entry=false) at /home/libreoffice/vcl/unx/kde4/KDESalDisplay.hxx:42
#4 0x00007ffff7ea36cf in KDEXLib::Yield(bool, bool) (this=0x537970, bWait=<optimized out>, bHandleAllCurrentEvents=<optimized out>) at /home/libreoffice/vcl/unx/kde4/KDEXLib.cxx:289
#5 0x00007ffff45c4cf2 in ImplYield(bool, bool) (i_bWait=i_bWait@entry=true, i_bAllEvents=i_bAllEvents@entry=false) at /home/libreoffice/vcl/source/app/svapp.cxx:439
#6 0x00007ffff45c534c in Application::Yield() () at /home/libreoffice/vcl/source/app/svapp.cxx:503
#7 0x00007ffff45c6a85 in Application::Execute() () at /home/libreoffice/vcl/source/app/svapp.cxx:420
#8 0x00007ffff7ee1de3 in desktop::Desktop::Main() (this=0x7fffffffdc60) at /home/libreoffice/desktop/source/app/app.cxx:1637
#9 0x00007ffff45ccf46 in ImplSVMain() () at /home/libreoffice/vcl/source/app/svmain.cxx:199
#10 0x00007ffff7f08cf1 in soffice_main() () at /home/libreoffice/desktop/source/app/sofficemain.cxx:169
#11 0x000000000040107b in sal_main () at /home/libreoffice/desktop/source/app/main.c:48
#12 0x000000000040107b in main (argc=<optimized out>, argv=<optimized out>) at /home/libreoffice/desktop/source/app/main.c:47

A debugging session is active.
Inferior 1 [process 9573] will be killed.
This seems to have begun at the commit below.
Adding Cc: to Luboš Luňák; could you possibly take a look at this one?
Thanks

a79e88101e6ecc177dd9de08f7c5fa0fd4a9843b is the first bad commit
commit a79e88101e6ecc177dd9de08f7c5fa0fd4a9843b
Author: Jenkins Build User <tdf@pollux.tdf>
Date: Wed Oct 10 13:22:44 2018 +0200

source 79449d73900d7a9bf061244d76f5f8eecc441198

author Luboš Luňák <l.lunak@collabora.com> 2018-10-01 14:26:57 +0200
committer Luboš Luňák <l.lunak@collabora.com> 2018-10-10 13:01:59 +0200
commit 79449d73900d7a9bf061244d76f5f8eecc441198 (patch)
tree e85f9bc29941cbf5e5ccb858ee4703ae67d00810
parent b1721b04d8a921a69230927cd7995d8c5d8f5fe2 (diff)
make VLOOKUP in Calc thread-safe
I get this error:

multi_type_vector::position#1570: block position not found! (logical pos=1048580, block size=4, logical size=1048576)

in

Version: 6.2.0.1
Build ID: 0412ee99e862f384c1106d0841a950c4cfaa9df1
CPU threads: 1; OS: Windows 6.1; UI render: default; VCL: win;
Locale: es-ES (es_ES); UI-Language: es-ES
Calc: threaded
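To make the numbers in that message concrete, here is a minimal Python sketch of a position lookup in a block-structured container. It is not the mdds implementation: the function name, the block layout and the reading of "block size" as the number of blocks are all assumptions for illustration. It only shows that a logical position of 1048580 cannot be found in a container whose logical size is 1048576.

```python
def locate_block(block_sizes, pos):
    """Return (block_index, offset_in_block) for a logical position.

    Hypothetical helper, not an mdds API. Raises IndexError when pos
    lies outside the container, mirroring the reported message.
    """
    logical_size = sum(block_sizes)
    if pos < 0 or pos >= logical_size:
        raise IndexError(
            f"block position not found! (logical pos={pos}, "
            f"block size={len(block_sizes)}, logical size={logical_size})"
        )
    start = 0
    for index, size in enumerate(block_sizes):
        if pos < start + size:
            return index, pos - start
        start += size
    raise AssertionError("unreachable: pos was checked against logical_size")


# Assumed layout: four blocks that together cover 1048576 rows.
BLOCKS = [10, 90, 900, 1048576 - 1000]

if __name__ == "__main__":
    print(locate_block(BLOCKS, 1048575))  # last valid position
    try:
        locate_block(BLOCKS, 1048580)     # position from the error message
    except IndexError as exc:
        print(exc)
```

The last valid position is 1048575, so the requested one is past the end of the column, whatever produced it.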
I wonder if we're doing name expansion properly for pre-computing dependencies. Then again - if we're not - surely we'd get a helpful assertion during calculation in a dbgutil build (?). Possibly specific to relative named ranges (which are quite a cute feature ;-). Anyhow - hopefully some useful random speculation. I imagine a valgrind trace of a dbgutil / symbols build would get closer to the memory corruption going on here rather than gdb. Thanks !
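As a purely illustrative model of the speculation above, the following Python sketch shows how a relative range, stored as offsets from the cell where it was defined, can resolve to rows beyond the end of the sheet when it is used from a different row, and how clamping would keep it in bounds. This is a generic model and not LibreOffice's reference handling; `MAX_ROWS`, both function names and the choice to model rows are assumptions, and nothing in this report establishes that this is the actual mechanism.

```python
MAX_ROWS = 1048576  # assumed sheet height, equal to "logical size" in the error


def resolve_relative_rows(base_row, first_offset, last_offset):
    """Shift a relative row range by the row of the cell that uses the name."""
    return base_row + first_offset, base_row + last_offset


def clamp_rows(first, last, max_rows=MAX_ROWS):
    """Clip a resolved range to the sheet; return None if nothing is left."""
    first = max(first, 0)
    last = min(last, max_rows - 1)
    return (first, last) if first <= last else None


if __name__ == "__main__":
    # A whole-column range defined at row 0 spans offsets 0..MAX_ROWS-1.
    resolved = resolve_relative_rows(5, 0, MAX_ROWS - 1)
    print(resolved)               # end row lies past the last valid row
    print(clamp_rows(*resolved))  # clipped back into the sheet
```

A consumer that iterates over the unclamped range would ask the column storage for positions that do not exist, which is the kind of request a bounds-checked container rejects.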
Created attachment 148356 [details] valgrind output I did a valgrind test, maybe it will help
Hi Xavier - thanks for that - looks like you were tracing bash =) (?) Can you try: soffice --valgrind which should do this right. Thanks !
Created attachment 148390 [details] bt with debug symbols On pc Debian x86-64 with master sources updated yesterday, I reproduced this. Indeed bt doesn't help but there are some console logs too. I'll attach Valgrind trace soon.
Created attachment 148392 [details] Valgrind trace Here's the Valgrind trace but I didn't see anything interesting since there are only errors concerning dlopen function. (unless I missed it?)
Thanks Julien; as you say - all false positives from glibc internals assuming knowledge of their own allocator (which is fine).

The DispatchUserEvents thing really looks like memory corruption though:

warn:vcl:11210:11210:vcl/source/app/salusereventlist.cxx:120: Uncaught St12out_of_range multi_type_vector::position#1570: block position not found! (logical pos=1048580, block size=4, logical size=1048576)

I would assume that in fact under valgrind we don't get a crash either ;-) which is annoying - presumably it perturbs threading so the issue doesn't happen.

What might help is getting another valgrind trace when run with:

--fair-sched=yes

It is also possible that we need to substantially reduce the scheduling quantum in the valgrind source, e.g.:

coregrind/m_scheduler/scheduler.c:#define SCHEDULING_QUANTUM 1200 // 100000

to get a better approximation of fast context switching to catch the race. It is somewhat odd that this isn't a cmd-line option for valgrind, particularly since there is no performance benefit to it not being configurable; possibly an easy-hack for valgrind lurks there =)
Created attachment 148407 [details]
Valgrind trace with VALGRIND_OPTS=--fair-sched=yes

Here is another Valgrind trace.
I typed this:
export VALGRIND_OPTS=--fair-sched=yes
then
./soffice --norestore --nologo --valgrind /tmp/OT.ods >& /tmp/valgrind.log

I must admit I didn't understand this part:
coregrind/m_scheduler/scheduler.c:#define SCHEDULING_QUANTUM 1200 // 100000
Interesting; I guess then that it is not a heap corruption - perhaps a stack one: thanks for the trace ! =)
https://gerrit.libreoffice.org/#/c/68349/
Luboš Luňák committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/+/beba45a5639bc32ca6893885ca3b1f07e3175c08%5E%21 avoid std::out_of_range thrown by mdds (tdf#122643) It will be available in 6.3.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Thanks Lubos ! =)
Luboš Luňák committed a patch related to this issue. It has been pushed to "libreoffice-6-2": https://git.libreoffice.org/core/+/b21d76201e8355207721cc442fc4f204f3061a76%5E%21 avoid std::out_of_range thrown by mdds (tdf#122643) It will be available in 6.2.2. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Confirming the crash no longer happens with Version: 6.3.0.0.alpha0+ Build ID: f23738139429358c11fa62708fbdf5bb0c43d199 CPU threads: 12; OS: Linux 4.18; UI render: default; VCL: gtk3; TinderBox: Linux-rpm_deb-x86_64@86-TDF, Branch:master, Time: 2019-02-28_20:14:57 Locale: en-US (en_US.UTF-8); UI-Language: en-US Calc: threaded Thanks!
Xisco Fauli committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/09192bc178f7f7b21ef63508f516f52790f3307d tdf#122643: sc_subsequent_filters: Add unittest It will be available in 7.2.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.