Created attachment 107659 [details] gdb log crashes in threaded .xlsb import
Thread #1, frame #13, FormulaParser::importFormula (this=0x2bf0aa0...
Thread #8, frame #23, FormulaParser::importFormula (this=0x2bf0aa...
Two different threads, with two different SheetDataContexts, are using the same FormulaParser, which has a single FormulaParserImpl, so the two threads are stomping all over the same maTokenIndexes etc.
Ooh - silly; thanks ! taking that ...
So I don't really see how this can work at all: there is a single FormulaParser for the import and multiple threads sharing it. The FormulaParser has state. Its API claims constness, but that's a tissue of lies, seeing as it forwards everything through a non-const pointer to a pImpl which is riddled with state as far as I can see. Presumably locking calls to the shared SheetDataContext mrFormulaParser defeats the purpose, so have a separate formula parser per SheetDataContext? Document is fdo60899-1.xlsb FWIW.
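To make the constness point concrete, here is a minimal sketch, not the actual oox code. ToyParser, ToyParserImpl, ToySheetContext and importSheetsThreaded are hypothetical stand-ins for illustration only; the only name borrowed from the backtrace is maTokenIndexes. It shows a const method mutating state through a pImpl pointer, and the "one parser per context" arrangement suggested above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for FormulaParserImpl: keeps mutable scratch state.
class ToyParserImpl
{
public:
    std::size_t countTokens(const std::string& rFormula)
    {
        maTokenIndexes.clear(); // scratch state, rewritten on every call
        for (std::size_t i = 0; i < rFormula.size(); ++i)
            if (rFormula[i] != ' ')
                maTokenIndexes.push_back(i);
        // Holds as long as no other thread touches this impl concurrently.
        assert(maTokenIndexes.size() <= rFormula.size());
        return maTokenIndexes.size();
    }

private:
    std::vector<std::size_t> maTokenIndexes;
};

// Hypothetical stand-in for FormulaParser: the method is const, but const
// only covers the pointer member, not the object it points to.
class ToyParser
{
public:
    ToyParser() : mxImpl(std::make_unique<ToyParserImpl>()) {}

    std::size_t importFormula(const std::string& rFormula) const
    {
        return mxImpl->countTokens(rFormula); // mutates state behind a const API
    }

private:
    std::unique_ptr<ToyParserImpl> mxImpl;
};

// Hypothetical per-sheet context that owns its parser instead of sharing one.
struct ToySheetContext
{
    ToyParser maParser;

    std::size_t importSheet(const std::vector<std::string>& rFormulas) const
    {
        std::size_t nTotal = 0;
        for (const std::string& rFormula : rFormulas)
            nTotal += maParser.importFormula(rFormula);
        return nTotal;
    }
};

// One thread per sheet; each thread builds its own context and parser,
// so no parser state is shared between threads.
std::vector<std::size_t>
importSheetsThreaded(const std::vector<std::vector<std::string>>& rSheets)
{
    std::vector<std::size_t> aTotals(rSheets.size(), 0);
    std::vector<std::thread> aThreads;
    for (std::size_t i = 0; i < rSheets.size(); ++i)
        aThreads.emplace_back([&rSheets, &aTotals, i] {
            ToySheetContext aContext;
            aTotals[i] = aContext.importSheet(rSheets[i]); // distinct slot per thread
        });
    for (std::thread& rThread : aThreads)
        rThread.join();
    return aTotals;
}
```

If a single ToyParser were instead passed by reference to every thread, the const signature would still compile, but all threads would race on the same maTokenIndexes, which is the failure mode in the backtrace.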
The xlsb in question is from bug#60899 https://bugs.freedesktop.org/attachment.cgi?id=74875
Riight - having separate FormulaParsers seems an attractive option; hmm. I'll look at it over the next days.
Created attachment 107661 [details] /tmp/debug.patch
Here's how I generated the bt: my debugging hack. This works only because there is a single FormulaParserImpl; otherwise aHack would obviously need to be a member of FormulaParserImpl. About 1 or 2 times out of 5 the above clean bt is generated; otherwise it's somewhere else.
http://cgit.freedesktop.org/libreoffice/core/commit/?id=17e68606b7c5001edf2ebede5e7d5ea0e0b9753f
Oh, wow - thanks for that - I was going to look at it on the plane. My hope was that the amount of 'real' parsing we do should be small in most cases - the predictive reverse printing & comparison being adequate for most of the big sheet cases; such that we could (should) have protected all the deeper parsing code with the solar mutex - but - of course, if we have a better, faster solution like this - that's great =) Thanks !
It's a simple solution anyway. No idea if it's the fastest one; it's hard to imagine that micro-locking a shared FormulaParser would be better, but maybe it is for all I know.
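For comparison, the locking alternative discussed above could look roughly like the sketch below. This is an illustration under assumptions, not LibreOffice code: LockedToyParser and importWithSharedParser are hypothetical names, and a plain std::mutex stands in for whatever lock (e.g. the solar mutex) would really be used.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shared parser whose scratch state is guarded by one mutex.
class LockedToyParser
{
public:
    std::size_t importFormula(const std::string& rFormula)
    {
        std::lock_guard<std::mutex> aGuard(maMutex); // serializes all callers
        maScratch.clear();
        for (char c : rFormula)
            if (c != ' ')
                maScratch.push_back(c);
        return maScratch.size();
    }

private:
    std::mutex maMutex;
    std::string maScratch; // shared mutable state, only touched under the lock
};

// Every thread parses the same formulas through ONE shared parser.
std::size_t importWithSharedParser(const std::vector<std::string>& rFormulas,
                                   unsigned nThreads)
{
    LockedToyParser aShared;
    std::vector<std::size_t> aTotals(nThreads, 0);
    std::vector<std::thread> aThreads;
    for (unsigned n = 0; n < nThreads; ++n)
        aThreads.emplace_back([&aShared, &aTotals, &rFormulas, n] {
            for (const std::string& rFormula : rFormulas)
                aTotals[n] += aShared.importFormula(rFormula);
        });
    for (std::thread& rThread : aThreads)
        rThread.join();

    std::size_t nSum = 0;
    for (std::size_t nTotal : aTotals)
        nSum += nTotal;
    return nSum;
}
```

This is correct, but every parse call is serialized on the one lock, so how well it scales depends on how much time threads actually spend inside the parser. A parser per context avoids that contention at the cost of constructing more parser objects.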