cppu/source/typelibtypelib.cxx, typelib_typedescription_register contains a data race that might cause crashes. In short: typelib_typedescription_register replaces members of already stored type descriptions without synchronizing with other threads that might currently use those members. Crash scenario is as follows: * thread A is processing an IPC call involving an out parameter with struct type (the type description for this struct's bComplete member is 0 up to now) ** thread A is copying the C++ out parameter to the BinaryAny objects for transferring it via IPC, and calls _copyConstructStruct (copy.hxx) ** _copyConstructStruct copies member per member, using the type description's pMemberOffsets and ppTypeRefs member attributes * at the same time thread B also needs this struct type for its work, and tries to complete the struct type description (complete(...) in typelib.cxx). ** after completing the type description, the completed description is registered via typelib_typedescription_register ** typelib_typedescription_register notices that the new description is complete, but the previously stored was not -> it calls typelib_typedescription_destructExtendedMembers(...) on the old, stored typedescription ** typelib_typedescription_destructExtendedMembers(...) deletes the memory used for the pMemberOffsets and ppTypeRefs members (while they are in use by thread A) ** typelib_typedescription_register replaces the members in the stored struct with those from the new struct Depending on the timing, thread A accesses already freed memory (no crash), or sees already different data at the pMemberOffset/ppTypeRefs pointers. The latter case happened for me, pMemberOffsets[0] was 0 and pMemberOffsets[1] was 0, ie. have already been reused for something else. My struct contained a long and a type member, and the code crashed since it wanted to copy the type member at offset 0, which contained a 1 -> access to memory 0x1 instead of just copying the 1 as long value. The issue seems to be present in libreoffice 6.0.0.3 (at least from looking at the typelib code), I found the issue in libreoffice 4.1.4.2
Would be cool, if you submitted a patch to solve it: https://wiki.documentfoundation.org/Development/gerrit
(In reply to Buovjaga from comment #1) > Would be cool, if you submitted a patch to solve it: > https://wiki.documentfoundation.org/Development/gerrit Dear Reporter, Would you like to submit a patch using the link above ?
A workaround could be to not delete the extended data members, but this would create a memory leak. I have no idea how to fix the issue.
@Stephan, would you mind taking a look at the description provided ?
I've submitted a patch proposal to gerrit.
(In reply to straub from comment #5) > I've submitted a patch proposal to gerrit. ...in which case it can't hurt to give a link to it here
https://gerrit.libreoffice.org/#/c/84846/
(In reply to straub from comment #0) > Crash scenario is as follows: > * thread A is processing an IPC call involving an out parameter with struct > type (the type description for this struct's bComplete member is 0 up to now) > ** thread A is copying the C++ out parameter to the BinaryAny objects for > transferring it via IPC, and calls _copyConstructStruct (copy.hxx) > ** _copyConstructStruct copies member per member, using the type > description's pMemberOffsets and ppTypeRefs member attributes > > * at the same time thread B also needs this struct type for its work, and > tries to complete the struct type description (complete(...) in typelib.cxx). > ** after completing the type description, the completed description is > registered via typelib_typedescription_register > ** typelib_typedescription_register notices that the new description is > complete, but the previously stored was not -> it calls > typelib_typedescription_destructExtendedMembers(...) on the old, stored > typedescription > ** typelib_typedescription_destructExtendedMembers(...) deletes the memory > used for the pMemberOffsets and ppTypeRefs members (while they are in use by > thread A) > ** typelib_typedescription_register replaces the members in the stored > struct with those from the new struct Do you happen to have a self-contained reproducer for this? Trying to reconstruct the scenario, I don't even manage to get into a situation where there is a typelib_TypeDescription instance with eTypeClass=typelib_TypeClass_STRUCT and bComplete=false.
I'll try to create a reproducer, bit I'll have to reverse-engineer our scenario that caused this behavior - so it'll take some time.
Created attachment 159778 [details] Example to reproduce my findings
Hi all, I was finally able to create a reproducer that at runs into TypeDescription::makeComplete() of a uno struct, which calls destruct_extended_members(). To achieve this, we need two contributors: 1) Some code that uses the struct's static type, eg. because of calling ::cppu::UnoType< ::foo::XOut >::get() 2) An IPC completing the type struct type description Scenario description: * process 1:uno.bin hosting an adapted foo.Counter component (odk/examples/cpp/counter copied to outstruct, adapted XCountable.idl having an out parameter of type XOut struct) * process 2: uno.bin executable calling modified remoteclient (odk/examples/cpp/remoteclient) which is calling XCountable::outMethod([out] XOut outParam) Attaching to uno.bin hosting outstruct (process 1), we can see that destruct_extended_members() on XOut will be called with the following stacktrace: #0 typelib_typedescription_destructExtendedMembers (pTD=0x7fffe4001b00) at /workspaces/socbm451/mastraub/libreoffice6_repo/core/cppu/source/typelib/typelib.cxx:1283 #1 0x00007ffff7825b2a in typelib_typedescription_register (ppNewDescription=0x7ffff03cb928) at /workspaces/socbm451/mastraub/libreoffice6_repo/core/cppu/source/typelib/typelib.cxx:1493 #2 0x00007ffff782302d in (anonymous namespace)::complete (ppTypeDescr=0x7ffff03cbc08, initTables=true) at /workspaces/socbm451/mastraub/libreoffice6_repo/core/cppu/source/typelib/typelib.cxx:480 #3 0x00007ffff7828084 in typelib_typedescription_complete (ppTypeDescr=0x7ffff03cbc08) at /workspaces/socbm451/mastraub/libreoffice6_repo/core/cppu/source/typelib/typelib.cxx:2379 #4 0x00007ffff03f80a5 in com::sun::star::uno::TypeDescription::makeComplete (this=0x7ffff03cbc08) at /workspaces/socbm451/mastraub/libreoffice6_repo/core/include/typelib/typedescription.hxx:225 #5 0x00007ffff03f7a68 in binaryurp::Marshal::writeValue (this=0x659fc8, buffer=0x7ffff03cbc20, type=..., value=0x7fffe8000c00) at /workspaces/socbm451/mastraub/libreoffice6_repo/core/binaryurp/source/marshal.cxx:191 #6 0x00007ffff03f7724 in binaryurp::Marshal::writeValue (this=0x659fc8, buffer=0x7ffff03cbc20, type=..., value=...) at /workspaces/socbm451/mastraub/libreoffice6_repo/core/binaryurp/source/marshal.cxx:124 #7 0x00007ffff040ab33 in binaryurp::Writer::sendReply (this=0x659ec0, tid=..., member=..., setter=false, exception=false, returnValue=..., outArguments=std::vector of length 1, capacity 1 = {...}) at /workspaces/socbm451/mastraub/libreoffice6_repo/core/binaryurp/source/writer.cxx:390 #8 0x00007ffff0409dd8 in binaryurp::Writer::execute (this=0x659ec0) at /workspaces/socbm451/mastraub/libreoffice6_repo/core/binaryurp/source/writer.cxx:174 #9 0x00007ffff7053d06 in salhelper::Thread::run() () from /workspaces/socbm451/mastraub/libreoffice6_repo/core/instdir/program/libuno_salhelpergcc3.so.3 #10 0x00007ffff7053e9a in threadFunc () from /workspaces/socbm451/mastraub/libreoffice6_repo/core/instdir/program/libuno_salhelpergcc3.so.3 #11 0x00007ffff72a0028 in osl_thread_start_Impl(void*) () from /workspaces/socbm451/mastraub/libreoffice6_repo/core/instdir/program/libuno_sal.so.3 #12 0x00007ffff661ae25 in start_thread (arg=0x7ffff03cc700) at pthread_create.c:308 #13 0x00007ffff634834d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 The type description for XOut is being completed, the old struct member type descriptions destructed (pCTD->ppTypeRefs[i]). However, some other thread might currently use them. I attached a patch that contains this small scenario code. Up to now I was unable to reproduce the 2nd thread using ppTypeRefs[] of the original, incomplete type description. Stephan Bergmann, could you please take a look? Thanks, Oli
To run my example: * apply the diff * check the Makefile in remoteclient/Makefile for any paths that need adaption * make in examples/cpp/outstruct * make remoteclient.run in examples/cpp/remoteclient
Stephan Bergmann committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/500c4a1442919715a504e90db592dcce0f31e1a5 Related tdf#115399: Reorder code slightly It will be available in 7.0.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Stephan Bergmann committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/679163845bd2f8b6de703f8fe704eb8912bc4ba4 tdf#115399: Don't kill pre-existing typelib_TypeDescription members It will be available in 7.0.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
I'm now applying the fix and will test the scenario where we saw the issue. Looking at your changes, I think this fix should work fine. Thank you, Oli
I'm running my testcase with this changeset endlessly, up to now it looks fine, no crash.
After my endless tests, I'm confident that the issue is fixed. Thank you Stephan Bergmann!