Created attachment 66601 (backtrace)

I just got several aborts in a row on a failed assertion while typing text in bullet lists in Writer (in an ODT file, not a .doc file) whose language is fr_LU. It seems to happen most often like this:

1) type some text in a bullet point
2) do not save
3) press Enter --> crash
4) if it has not crashed yet, type a few characters (especially a mix of letters and numbers: something like 12345XX, then a space) --> crash

I attach the backtrace of one of these crashes; here is some more info:

(gdb) up 17
#17 0x00007fe016e28e4b in rtl::OUString::copy (this=this@entry=0x7fff094ad240, beginIndex=beginIndex@entry=5) at /home/master/src/libreoffice/workdirs/libreoffice-3.6/solver/unxlngx6/inc/rtl/ustring.hxx:1293
1293 assert(beginIndex >= 0 && beginIndex <= getLength());
(gdb) print beginIndex
$1 = 5
(gdb) print this->pData->length
$3 = 4
(gdb) up
#18 0x00007fe016e2b307 in com::sun::star::i18n::OrdinalSuffix::getOrdinalSuffix (this=0x31a43b0, nNumber=42221, aLocale=...) at /home/master/src/libreoffice/workdirs/libreoffice-3.6/i18npool/source/ordinalsuffix/ordinalsuffix.cxx:102
102 retValue[ newLength - 1 ] = sValue.copy( len );
(gdb) print aLocale
$4 = (const com::sun::star::lang::Locale &) @0x20d7a60: { Language = "fr", Country = "LU", Variant = "" }
(gdb) print sValue
$5 = "NaNe"
(gdb) print newLength
$6 = 1
(gdb) print len
$7 = 5
(gdb) print normalized

This sends GDB into an infinite (or very long) memory-consuming loop.

From *another* crash (same backtrace, same failed assertion, same sValue, same len, same newLength):

(gdb) print nCode
$3 = U_ZERO_ERROR
(gdb) print i
$4 = 0
(gdb) print ruleSet

Again an infinite loop.
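To make the failed assertion concrete: sValue is "NaNe" (length 4) while len is 5, so copy(5) violates beginIndex <= getLength(). Below is a minimal sketch of that precondition and of a checked alternative. It uses std::string as a stand-in for rtl::OUString, and both copyFrom and tryCopyFrom are hypothetical names for illustration, not LibreOffice API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for rtl::OUString::copy(beginIndex): same precondition as the
// assertion in the backtrace, but on std::string (assumption for this sketch).
std::string copyFrom(const std::string& s, std::size_t beginIndex)
{
    assert(beginIndex <= s.size());
    return s.substr(beginIndex);
}

// Hypothetical checked variant: reports failure instead of asserting.
bool tryCopyFrom(const std::string& s, std::size_t beginIndex, std::string& out)
{
    if (beginIndex > s.size())
        return false; // e.g. "NaNe" (length 4) with beginIndex 5
    out = copyFrom(s, beginIndex);
    return true;
}
```

With the values from the gdb session, tryCopyFrom("NaNe", 5, out) returns false rather than aborting, whereas a well-formed string such as "42221e" would yield the suffix "e".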
So...

a) This is autocorrect changing e.g. "1st" to 1 + superscript st.

b) We use the number formatter in icu to add the suffix, use the sal one to find the bare number, and subtract the two to get the suffix on its own.

c) We're not returning the values if they are the expected ones, which we should, as that's the whole point of the exercise I thought.

d) There's at least one bug there already: "12345" is converted by icu to "12.345e" for fr for me, while en remains 12345th, and we're not taking the added thousands separator into account when finding the suffix.

e) That said, I can't reproduce the crash on master by setting LANG=fr_LU.UTF8. The NaNe is presumably a "not a number" error, so something in the stack has presumably gotten confused about something like , vs .

Are you using the default built-in icu or a system one? And is the locale just set with LANG=fr_LU.UTF8, letting LibO take its defaults from that?
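The "subtract the bare number from the formatted one" step in b), made tolerant of the grouping separator from d), could look roughly like the sketch below. It is not the code in ordinalsuffix.cxx: extractOrdinalSuffix and isGroupingSeparator are hypothetical helpers, std::string replaces OUString, and only the ASCII separators '.', ',' and space are recognised (an assumption; real locale data may use other characters).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: only these ASCII characters count as grouping separators.
bool isGroupingSeparator(char c)
{
    return c == '.' || c == ',' || c == ' ';
}

// Walks the digits of `bare` through `formatted`, skipping grouping
// separators that appear between digits. On success, stores whatever
// follows the last digit in `suffix` and returns true; returns false if
// `formatted` does not start with the expected number (e.g. "NaNe").
bool extractOrdinalSuffix(const std::string& formatted,
                          const std::string& bare,
                          std::string& suffix)
{
    std::size_t pos = 0;
    for (std::size_t i = 0; i < bare.size(); ++i)
    {
        // Separators are only skipped between digits, never before the
        // first digit or after the last one.
        while (i > 0 && pos < formatted.size()
               && isGroupingSeparator(formatted[pos]))
            ++pos;
        if (pos >= formatted.size() || formatted[pos] != bare[i])
            return false;
        ++pos;
    }
    assert(pos <= formatted.size());
    suffix = formatted.substr(pos);
    return true;
}
```

Because separators are not skipped after the last digit, a suffix that itself starts with a dot is kept intact. The mismatch case returns false, so the caller can decide to return no suffix instead of indexing past the end of the string.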
Or, because the ordinal suffix for French is "e", something reads that as an exponent.
Even though I can't reproduce this, I can see other non-fatal errors and committed http://cgit.freedesktop.org/libreoffice/core/commit/?id=a05357ab69712bec53c2d8d17efbbf25907ff9b8 to fix them. The side effect, I bet, is that we now no longer crash on the (bizarre) "NaNe" output.

I'd still like to know the details asked for in comment #1.
(In reply to comment #1)
> e) that said, I can't reproduce the crash on master by setting LANG=fr_LU.UTF8,
> Are you using the default built-in icu or a system one ?

My understanding is: System ICU. My config.log says:

configure:24098: checking which icu to use
configure:24101: result: external
(...)
configure:24136: checking for icu-config
configure:24154: found /usr/bin/icu-config
configure:24166: result: /usr/bin/icu-config
configure:24175: checking ICU version
configure:24183: result: OK, 4.4.1

and my config_host.mk has:

config_host.mk:export ICU_MAJOR=4
config_host.mk:export ICU_MICRO=1
config_host.mk:export ICU_MINOR=4
config_host.mk:export SYSTEM_ICU=YES

Package versions (Debian amd64):

libicu-dev 4.4.1-8
libicu44 4.4.1-8
libicu48 4.8.1.1-7
libicu4j-java 4.0.1.1-1

> And is locale is just set with LANG=fr_LU.UTF8 and letting LibO just take its defaults from that.

$ locale
LANG=fr_LU.UTF-8
LANGUAGE=
LC_CTYPE="fr_LU.UTF-8"
LC_NUMERIC="fr_LU.UTF-8"
LC_TIME="fr_LU.UTF-8"
LC_COLLATE="fr_LU.UTF-8"
LC_MONETARY="fr_LU.UTF-8"
LC_MESSAGES=en_GB.UTF-8
LC_PAPER="fr_LU.UTF-8"
LC_NAME="fr_LU.UTF-8"
LC_ADDRESS="fr_LU.UTF-8"
LC_TELEPHONE="fr_LU.UTF-8"
LC_MEASUREMENT="fr_LU.UTF-8"
LC_IDENTIFICATION="fr_LU.UTF-8"
LC_ALL=

$ set | egrep '(LANG|LC_)'
LANG=fr_LU.UTF-8
LC_MESSAGES=en_GB.UTF-8

In LibO, menu Tools / Options / Language Settings / Languages has:

Language of
  User interface: English (USA)
  Locale setting: Default - French (Luxembourg)
  Decimal separator key: (checked) Same as locale setting ( . )
  Default currency: Default - EUR
Default languages for documents
  Western: ABC_V Default - French (Luxembourg)
  Asian: (greyed out) Default - Chinese (simplified)
  CTL: (greyed out) Default - Hindi
Enhanced language support
  (unchecked) Enabled for Asian languages
  (unchecked) Enabled for complex text layout (CTL)

Note that the locale's decimal separator key is comma, not dot. So the "same as locale setting" is weird: it says it is a dot. If I uncheck that box, I don't see a way to choose between dot and comma.
The abort happens if I simply open a new Writer document and type "47211 " (without the quotes: 47211 then a space).

Also happens with these numbers followed by a space:
20000
12345
123456
99999999999999999
10000

Does not happen with these numbers followed by a space:
1
11
211
7211
1234
9999

So it seems to be related to the length of the number.

If I select Tools / Language / For all text / English (USA), then: no crash with 12345, but crash with 999999 and 123456. So it seems it crashes there too, but only on longer numbers. OTOH, 1st is not autocorrected to 1\textsuperscript{st}, nor 2nd to 2\textsuperscript{nd}.

If I select Tools / Language / For all text / None (Do not check spelling), then no crash.

If I select Tools / Language / For all text / Welsh, then no crash, but this might be noise, since I don't always have the Welsh choice; if I just repeatedly "open the menu and then click in the document to dismiss the menu", eventually I get a Welsh choice. Also, with the Welsh language, it accepts this text as valid (no spelling error):

dfklsdfklmsdfklm lk sdfklsdfkl sdflksdfkl sdfkl sdfklsdfklsdfklsdfkl sdlkfsdkl

I don't speak Welsh, but the probability that I randomly hit a string of consonants that are valid Welsh words is quite low :) With French or English as the language, these words are underlined in red.

When doing a spell-check (not only in Welsh), my stdout/err fills with the following; not sure if it is related.

warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
warn:legacy.osl:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/sw/source/ui/dialog/SwSpellDialogChildWindow.cxx:452: ApplyChangedSentence in initial call or after resume
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
warn:legacy.osl:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/sw/source/ui/dialog/SwSpellDialogChildWindow.cxx:452: ApplyChangedSentence in initial call or after resume
warn:legacy.osl:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/sw/source/core/edit/edlingu.cxx:1364: TODO: add ignore mark to text node
warn:legacy.osl:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/sw/source/core/edit/edlingu.cxx:1364: TODO: add ignore mark to text node
warn:legacy.osl:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/sw/source/core/edit/edlingu.cxx:1364: TODO: add ignore mark to text node
warn:legacy.osl:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/sw/source/core/edit/edlingu.cxx:1364: TODO: add ignore mark to text node
warn:legacy.osl:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/sw/source/core/edit/edlingu.cxx:1364: TODO: add ignore mark to text node
warn:legacy.osl:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/sw/source/core/edit/edlingu.cxx:1364: TODO: add ignore mark to text node
warn:legacy.osl:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/sw/source/core/edit/edlingu.cxx:1364: TODO: add ignore mark to text node
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:719: !! Grammarchecker failed to provide end of sentence !!
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:719: !! Grammarchecker failed to provide end of sentence !!
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:719: !! Grammarchecker failed to provide end of sentence !!
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:704: nSuggestedEndOfSentencePos calculation failed?
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:719: !! Grammarchecker failed to provide end of sentence !!
warn:legacy.tools:9750:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/linguistic/source/gciterator.cxx:740: end-of-sentence detection failed?
a) "Note that the locale's decimal separator key is comma, not dot. So the "same as locale setting" is weird, it says it is a dot. If I uncheck that box, I don't see a way to choose between dot or comma."

That affects input only, IIRC. Basically I think the history is that in Spain (or someplace or other) the numeric keypad on their keyboard has a . on it, but everyone wants it to produce a , and not a . So it should only affect input using the numeric keypad: it toggles between what's written on the keyboard and what the locale data says is that locale's separator.

b) "with welsh language, it accepts this text as valid (no spelling error:"

You presumably don't have a Welsh spellchecking dictionary installed. Format->character->welsh will have an abc+blue check if it has Welsh spelling support.

c) icu 4.4.1

I reckon that's the root of the problem. There's apparently a bug in 4.4.1 where the icu number formatter generates not-a-number text when formatting various large numbers (or what it thinks are large numbers), and our original code didn't handle any unexpected output along those lines.

defe079d455ccc958fd0128e8a8cf0e4aeb5cd9c + a05357ab69712bec53c2d8d17efbbf25907ff9b8

v. likely fixes this. defe079d455ccc958fd0128e8a8cf0e4aeb5cd9c is already in 3-6 now, I believe. If you could test whether a05357ab69712bec53c2d8d17efbbf25907ff9b8 fixes this for you with icu 4.4.1, and cherry-pick it if it does, that'd be good.
(In reply to comment #6)
> b) "with welsh language, it accepts this text as valid (no spelling error:"
> You presumably don't have a welsh spellchecking dictionary installed.
> Format->character->welsh will have a abc+blue check if it has welsh spelling
> support.

Can't find any language list in Format / character. In "Tools / Language / for all text", none of the listed languages has an "abc+blue check". Given the very limited list, I assumed it listed only the languages for which I have an appropriate spellchecking dictionary.

I assume you meant Tools / Options / Language settings / Default language for documents / Western? I have the mark for fr_BE, fr_CA, fr_LU, fr_FR, fr_monaco, fr_CH and 16 different en_* entries. But my "Tools / Language / for all text" has, at startup, only en_US and fr_LU?

Today, I started LibO and had only fr_LU and en_US. After looking in that menu about 5 times, "Irish" (and no longer "Welsh") appeared. I notice that if I take "more" in "Tools / Language / for all text", Irish is the default for new documents, but "for current document" is checked. Not sure I understand what's happening here. That might be worth its own bug report.
(In reply to comment #6)
> c) icu 4.4.1
> I reckon that's the root of the problem, there's apparently a bug in 4.4.1 (...)
> defe079d455ccc958fd0128e8a8cf0e4aeb5cd9c +
> a05357ab69712bec53c2d8d17efbbf25907ff9b8
>
> v. likely fixes this. defe079d455ccc958fd0128e8a8cf0e4aeb5cd9c is already in
> 3-6 now I believe. if you could test if
> a05357ab69712bec53c2d8d17efbbf25907ff9b8 fixes this for you with icu 4.4.1 and
> cherry-pick it if does that'd be good.

Found a05357ab69712bec53c2d8d17efbbf25907ff9b8 already pushed to libreoffice-3-6; pulled and rebuilt, verified as fixed.

I get on stderr:

warn:i18npool:21859:1:/home/master/src/libreoffice/workdirs/libreoffice-3.6/i18npool/source/ordinalsuffix/ordinalsuffix.cxx:132: ordinal NaNe didn't start with expected 263.283.153 prefix

but my understanding is that this is "normal".
We're straying off topic :-)

a) "Can't find any language list in Format / character".

In Writer, in the format->character dialog, under the font tab, you must have a "Language" list. But it's the same list as Tools / Options / Language settings / Default language for documents, so looking there will suffice. The dictionary extensions that are built are a subset based on the --with-lang option IIRC, though Linux builds may default to also looking at whatever hunspell dictionaries you happen to have installed in /usr/share/hunspell, to use in addition to the dictionary extensions.

b) "Tools / Language / for all text"

This is for changing the language that the text claims to be written in. Its trick is that it lists the language the text is set to, the default document language *and* the language that libexttextcat guesses the text might really be in. So if you, in a French-speaking locale, write some German text, it should (if the text is long enough for it to guess) list German as an option to change the language of that selection to. So that feature doesn't care about what spellchecking dictionaries are installed.

c) The toggle defaults to not changing the default language for all new documents. The UI is sort of poor there, alright.