Created attachment 64078
minimal test file

In libreoffice-3-6 and master, the gsicheck utility segfaults (tested on Linux 64-bit), so I have to use gsicheck from libreoffice-3-5. It segfaults at the first occurrence of a help XML tag.

$ export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/home/timar/libreoffice-master/solver/unxlngx6.pro/lib
$ gdb /home/timar/libreoffice-master/solver/unxlngx6.pro/bin/gsicheck
(gdb) run -c -wef hu.err hu.sdf > hu.log
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b92162 in rtl_ustr_asciil_reverseEquals_WithLength (pStr1=Cannot access memory at address 0x7fffff7fef48
) at /home/timar/libreoffice-master/sal/rtl/source/ustring.cxx:288
288 {
(gdb) bt
#0 0x00007ffff7b92162 in rtl_ustr_asciil_reverseEquals_WithLength (pStr1=Cannot access memory at address 0x7fffff7fef48
) at /home/timar/libreoffice-master/sal/rtl/source/ustring.cxx:288
#1 0x00007ffff7b91a2c in rtl_ustr_indexOfAscii_WithLength (str=0x7ffff7f5b0b2, len=66, subStr=0x418925 "$[", subLen=2) at /home/timar/libreoffice-master/sal/rtl/source/ustring.cxx:96
#2 0x0000000000412329 in rtl::OUString::indexOfAsciiL (this=0x622388, str=0x418925 "$[", len=2, fromIndex=1) at /home/timar/libreoffice-master/solver/unxlngx6.pro/inc/rtl/ustring.hxx:1188
#3 0x000000000040ef5b in SimpleParser::GetNextTokenString (this=0x622380, rErrorList=..., rTagStartPos=@0x7fffffffcafc) at /home/timar/libreoffice-master/l10ntools/source/tagtest.cxx:763
#4 0x000000000040f08c in SimpleParser::GetNextTokenString (this=0x622380, rErrorList=..., rTagStartPos=@0x7fffffffcafc) at /home/timar/libreoffice-master/l10ntools/source/tagtest.cxx:787
Stephan Bergmann committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=62342abac0e3a38e39a50b7560f09cbdeb62905a fdo#51954: -1 is small while STRING_NOTFOUND was great
Stephan Bergmann committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=63cb8d6bb21ad6bb401efa4eca479f89745c1cfe fdo#51954: More tools->rtl string conversion regressions
Stephan Bergmann committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=de3d6883f2a8fe9c5c04b8e271a36423f96950bb Regression fix correction
After the 3rd patch there is no crash, no hang, and no regression from the handling of "\\<".

Problems:

1. Assertion failed when processing dgo.sdf:

timar@timar-corei7:~/LibO36l10n/aaa> /home/timar/libreoffice-master/solver/unxlngx6.pro/bin/gsicheck -c -wef dgo1.err dgo.sdf > dgo1.log
gsicheck: /home/timar/libreoffice-master/solver/unxlngx6.pro/inc/rtl/string.hxx:1038: rtl::OString rtl::OString::copy(sal_Int32, sal_Int32) const: Assertion `beginIndex >= 0 && beginIndex <= getLength() && count >= 0 && sal::static_int_cast<sal_uInt32>(count) <= sal::static_int_cast<sal_uInt32>(getLength() - beginIndex)' failed.
Aborted

Please find dgo.sdf here: http://dl.dropbox.com/u/8912433/dgo.sdf.bz2

2. The err file format changed (CR/LF line endings instead of LF). Not a real issue.

3. The log file format changed. It now tries to give more information about the errors, for example:

Error: Translation Tag Mismatch, Line 28537, UniqueID helpcontent2/source\text\scalc\01\04060102.xhp/help/par_id6354457///: Property 'href': value different in Translation : \<embedvar href=\"text/scalc/01/func_datedif.xhp#datedif\"/\> at Position 0
"helpcontent2 source\text\scalc\01\04060102.xhp 0 help par_id6354457 0 de \<embedvar href=\"text/scalc/01/func_date.xhp#date\"/\> 2002-02-02 02:02:02"

This is OK. Maybe it would be enough to print only the text field; the other info is redundant or unimportant, and the position of the error would be easier to find.

However, it often does not print the whole sdf line, e.g.:

Error: Translation Tag Mismatch, Line 26029, UniqueID helpcontent2/source\text\scalc\01\01120000.xhp/help/par_id9838862///: Extra Tag in Translation: \<switchinline select=\"sys\"\> at Position 178
"lc\01\01120000.xhp 0 help par_id9838862 0 de Sie können auch die Tasten \<switchinline select=\"sys\"\>\<caseinline select=\"MAC\"\>Befehl\</caseinline\>\<defaultinline\>Strg\</defaultinline\>\</switchinline\>+Bild auf und \<switchinline select=\"sys\"\>\<caseinline select=\"MAC\"\>Befehl\</case"

http://dl.dropbox.com/u/8912433/de.sdf.bz2
Stephan Bergmann committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=05a0aaa25efc53c4cfa8a955dbf96bbb63b8bc98 fdo#51954: More tools->rtl string conversion regressions
(In reply to comment #4)
> 1. assertion failed when processing dgo.sdf
> timar@timar-corei7:~/LibO36l10n/aaa>
> /home/timar/libreoffice-master/solver/unxlngx6.pro/bin/gsicheck -c -wef
> dgo1.err dgo.sdf > dgo1.log
> gsicheck:
> /home/timar/libreoffice-master/solver/unxlngx6.pro/inc/rtl/string.hxx:1038:
> rtl::OString rtl::OString::copy(sal_Int32, sal_Int32) const: Assertion
> `beginIndex >= 0 && beginIndex <= getLength() && count >= 0 &&
> sal::static_int_cast<sal_uInt32>(count) <=
> sal::static_int_cast<sal_uInt32>(getLength() - beginIndex)' failed.
> Aborted

This is fixed now, see comment 5.

> 3. log file format changed - it tries to give more information of the errors
> now, for example:
[...]
> This is OK. Maybe it would be enough to print the text field only, other info
> is redundant or not important, and we will find position of error easier.
> However, many times it does not print the whole sdf line, e.g.:

I would assume this is due to changes like the following one in GSIBlock::PrintList (l10ntools/source/gsicheck.cxx), from

aContext = pLine->Copy( pMsg->GetTagBegin()-150, 300 );

to

aContext = helper::abbreviate( pLine->data_, pMsg->GetTagBegin()-150, 300 );

The code's intent apparently always was to produce just an abbreviated part of the line, 150 characters to the left and right of pMsg->GetTagBegin(). However, in the old, tools String based code, xub_StrLen is sal_uInt16 (i.e., unsigned), so if pMsg->GetTagBegin()-150 < 0, the index wrapped around to some very large value > pLine->Len() and Copy silently produced an empty string. The new, rtl::OUString based code's copy instead resulted in assertions, which have been fixed with the introduction of helper::abbreviate.
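
To illustrate the wrap-around described above, here is a minimal standalone sketch. It does not use the real tools String or helper::abbreviate. legacyCopy, abbreviateSketch, legacyContext and newContext are hypothetical stand-ins written for this comment. They assume only that the old Copy took 16-bit unsigned arguments and returned an empty string for an out-of-range start, and that the new helper clamps the start to 0 and caps the length at the end of the line.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical model of the old tools String::Copy: the index type is
// 16-bit unsigned (like xub_StrLen), and a start position at or beyond
// the end of the string yields an empty result instead of an error.
std::string legacyCopy(const std::string& line, std::uint16_t index, std::uint16_t count)
{
    if (index >= line.size())
        return std::string();
    return line.substr(index, count); // substr caps count at the end of the string
}

// Hypothetical clamping helper in the spirit of helper::abbreviate
// (assumption: negative starts are clamped to 0, length is capped).
std::string abbreviateSketch(const std::string& line, std::int32_t start, std::int32_t length)
{
    assert(length >= 0);
    const std::int32_t size = static_cast<std::int32_t>(line.size());
    const std::int32_t begin = std::min(std::max<std::int32_t>(0, start), size);
    const std::int32_t count = std::min(length, size - begin);
    return line.substr(begin, count);
}

// Old call shape: tagBegin - 150 is narrowed to 16-bit unsigned, so a
// negative value wraps around to a large index (e.g. -150 becomes 65386).
std::string legacyContext(const std::string& line, std::int32_t tagBegin)
{
    return legacyCopy(line, static_cast<std::uint16_t>(tagBegin - 150), 300);
}

// New call shape: the signed value is passed through and clamped.
std::string newContext(const std::string& line, std::int32_t tagBegin)
{
    return abbreviateSketch(line, tagBegin - 150, 300);
}
```

Under these assumptions, a tag at position 0 gives an empty context with legacyContext but the first 300 characters of the line with newContext, while a tag at position 178 gives the same 300-character window (starting at offset 28) in both.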
Stephan Bergmann committed a patch related to this issue. It has been pushed to "libreoffice-3-6": http://cgit.freedesktop.org/libreoffice/core/commit/?id=4bc3473be15e362108b687ee94ce748947e3aad9&g=libreoffice-3-6 fdo#51954: -1 is small while STRING_NOTFOUND was great It will be available in LibreOffice 3.6.
Stephan Bergmann committed a patch related to this issue. It has been pushed to "libreoffice-3-6": http://cgit.freedesktop.org/libreoffice/core/commit/?id=5085fb2b0a8e74e5cdc67487c05c605570e8c71b&g=libreoffice-3-6 fdo#51954: More tools->rtl string conversion regressions It will be available in LibreOffice 3.6.
Stephan Bergmann committed a patch related to this issue. It has been pushed to "libreoffice-3-6": http://cgit.freedesktop.org/libreoffice/core/commit/?id=9865dbc7af9331ccb1a8e55a09dd821fb014a5a4&g=libreoffice-3-6 fdo#51954: More tools->rtl string conversion regressions It will be available in LibreOffice 3.6.