Description:
In German text, words or phrases containing apostrophes with no adjacent spaces are incorrectly autocorrected. Common examples are "ist's" (short for "ist es"), "Andrea's" (similar to English use), and "D'dorf" (short for "Düsseldorf"). The correct typographic symbol would be the English *right* single quotation mark (U+2019). When single quotes autocorrection is disabled, the typewriter-style apostrophe (U+0027) is inserted (the ASCII character to which the apostrophe key is traditionally mapped). When single quotes autocorrection is enabled, the English *left* single quotation mark (U+2018) is inserted, or whatever the user has configured as the single 'end quote'.

Steps to Reproduce:
1. Open LibreOffice Writer and create a German-language document.
2. Make sure 'autocorrect while typing' is enabled.
3. Type words or phrases containing apostrophes, like ist's, Andrea's, D'dorf.

Actual Results:
If single quotes autocorrection is disabled, the character inserted into the text is the typewriter-style, non-typographic apostrophe symbol (U+0027). If single quotes autocorrection is enabled, the character inserted into the text is the Unicode character 'LEFT SINGLE QUOTATION MARK' (U+2018), or, if there is a user-defined replacement, the character the user has defined as the 'end quote' for single quotes autocorrection. Only three specific phrases ("geht's", "gibt's" and "wird's") end up with the apostrophe (U+0027) replaced by the correct symbol (U+2019), and even these only if single quotes autocorrection is disabled, because these three replacements are explicitly included in the DocumentList.xml file within the acor_de.dat archive.

Expected Results:
The apostrophe inserted is the Unicode character 'RIGHT SINGLE QUOTATION MARK' (U+2019).

Reproducible: Always
User Profile Reset: Yes
OpenGL enabled: Yes

Additional Info:
Three remarks:
1. It is surely sensible not to replace the U+0027 character with a typographic character, even when it is used as an apostrophe, if the user does not want such an autocorrection. But the enabled/disabled state of 'single quotes' autocorrection is not the proper place, semantically, for deciding what is to be done with a genuine apostrophe. For a user, changing something within the 'single quotes' autocorrection settings should not have an effect on how genuine apostrophes are handled. The only real solution to this problem might be to include a third, separate settings option for 'apostrophes' beside 'single quotes' and 'double quotes'.
2. Sometimes an apostrophe is used at a word's end (for marking the genitive of nouns ending in s, ß, z, x: "Delacroix' Gemälde", the painting by Delacroix). If we get a correct typographic character there, it is only because, by pure chance, the currently used 'end quote' for 'single quotes' is the same symbol. Depending on the overall typography someone wants to use, this does not need to be the case.
3. To the best of my knowledge, even the implementation for English text (I tried English-UK) solves the problem only partly. Autocorrection of the U+0027 character used as an apostrophe within a word also depends on whether "single quotes" autocorrection is enabled. Only if it is enabled, and only if the user has not defined a custom "end quote" for "single quotes", does autocorrection correctly insert U+2019 as the typographic apostrophe symbol. Getting the same behaviour for German text would already improve things, even if it wouldn't solve the underlying problem, which is that the behaviour of apostrophes depends on the settings for single quotes.

----------
Version: 6.3.3.2
Build-ID: 1:6.3.3-0ubuntu0.18.04.1~lo1
CPU threads: 4; OS: Linux 5.3; UI render: default; VCL: gtk3;
Locale: de-DE (de_DE.UTF-8); UI language: de-DE
Calc: threaded
Created attachment 155897: Image of correct and incorrect typographic apostrophe
Do you have a suggestion for how to distinguish the situation where an apostrophe (’ U+2019) has to be written from the situation where a single ending quotation mark (‘ U+2018) is needed? This is especially hard if you have started a single opening quotation and now want, e.g., "Hans’ Auto". I know of no way to do it. If you need these signs often, you should learn the code points and use the "Toggle Unicode" feature of LibreOffice, or define a macro and shortcut for it, or adjust the keyboard layout to get them directly.
I'm aware of the difficulty in distinguishing apostrophes at the end of a word from (single-quote) end quotes, which is why I included the case only as a remark, while the subject of this bug report is apostrophes *inside* a word or phrase, without an adjacent space or other delimiter character. An apostrophe character inside a word can and should, by default, always be interpreted as an apostrophe, not an end quote. Which is what the English implementation is already doing correctly - if only as long as default single quote autocorrection is used. (And once we had the notion of an 'apostrophe' vs. a 'quote', it would offer some options for at least recognizing some apostrophes at the end of words. A criterion, for example, could be the absence of any (single-quote) start-quote character before the apostrophe character in the document or paragraph.)
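The in-word rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not LibreOffice's autocorrect code (which is C++); the function and constant names are made up for this example, and it works on finished strings rather than on keystrokes.

```python
import re

TYPOGRAPHIC_APOSTROPHE = "\u2019"  # RIGHT SINGLE QUOTATION MARK

# A U+0027 with a letter directly before and directly after it.
# [^\W\d_] matches any Unicode letter (word character that is
# neither a digit nor an underscore).
IN_WORD_APOSTROPHE = re.compile(r"(?<=[^\W\d_])'(?=[^\W\d_])")


def fix_in_word_apostrophes(text: str) -> str:
    """Replace U+0027 with U+2019 only where it sits inside a word."""
    return IN_WORD_APOSTROPHE.sub(TYPOGRAPHIC_APOSTROPHE, text)
```

Word-final apostrophes such as the one in "Delacroix' Gemälde" are deliberately left alone here, since they cannot be told apart from an end quote by looking at the neighbouring characters only.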
Seems to me current handling of typographic 'single quotes' is correct to function--use of opening and closing quotation. An immediate <Ctrl>+z or <Esc> will revert any unwanted autocorrect while typing. And, as noted, entries in the autocorrect Replacement strings data file for the locale (of the Paragraph being edited) will take precedence over the option corrections of single and double quotes while typing or modifying content.
But it seems a reasonable enhancement to the edit engine would be additional logic testing for entry of a second "'" (0x0027). Maybe as simple as: "more than two words and it gets handled as a quotation, one or two words and it is an apostrophe"--with the apostrophe receiving a locale-preferred typographic glyph, though probably just the QuotationEnd of the localedata/data/[locale].
=-notes-=
Related issue for the fr-CH users in bug 116062, where in i18npool/source/localedata/data/fr_CH.xml we reverted the correct Swiss national "‹" (0x2039) & "›" (0x203a) to "‘" (0x2018) & "’" (0x2019) for QuotationStart & QuotationEnd to restore use of the apostrophe. For bug 1115382, László tweaked apostrophe usage for the -HU locales via the Autocorrect logic. Likewise we have some issues with RTL scripts, e.g. bug 114575, where we reverse the order of Starting and Ending, or bug 114184, which needs support in Hebrew for its Geresh and Mercha.
> Seems to me current handling of typographic 'single quotes' is correct to
> function--use of opening and closing quotation. An immediate <Ctrl>+z or
> <Esc> will revert any unwanted autocorrect while typing
This report is not about 'single quotes' handling being incorrect, it is about 'apostrophes' not being properly handled. Apostrophes are routinely recognised as a single quote (end quote), which is incorrect, and reverting the autocorrection still does the wrong thing - it reverts to a typewriter-style U+0027 character, where a typographic apostrophe symbol would be correct.
(In reply to Rob Schroeder from comment #5)
> > Seems to me current handling of typographic 'single quotes' is correct to
> > function--use of opening and closing quotation. An immediate <Ctrl>+z or
> > <Esc> will revert any unwanted autocorrect while typing
> This report is not about 'single quotes' handling being incorrect, it is
> about 'apostrophes' not being properly handled. Apostrophes are routinely
> recognised as a single quote (end quote), which is incorrect, and reverting
> the autocorrection still does the wrong thing - it reverts to a
> typewriter-style U+0027 character, where a typographic apostrophe symbol
> would be correct.
And exactly which Unicode glyph would be the "typographic apostrophe symbol" that would be "correct"--and for which locale? Only 0x0027 is an APOSTROPHE and has its own Unicode glyph; everything else requires that a Unicode glyph be substituted--a single quote (opening or closing), or other locale-appropriate glyph--depending on locale. We handle it as quotes (single or double), or with Autocorrect disabled as an apostrophe (using 0x0027 as drawn from the font in use for the paragraph).
To be precise, what is required is additional editengine, or autocorrect, logic to handle keyboard input of the Apostrophe keysym (0x0027) with more options than as a single quotation (opening or closing). Likewise for the Quotation Mark keysym (0x0022).
Can't do it now, but it is Not a Bug, so => Enhancement requiring dev effort.
> And exactly which Unicode glyph would be the "typographic apostrophe symbol"
> that would be "correct"--and for which locale?
This has already been answered by the Unicode Consortium, independent of locale. Originally, U+02BC 'MODIFIER LETTER APOSTROPHE' was considered the preferred character for a punctuation apostrophe, but since Unicode v.3.0.0, U+2019 'RIGHT SINGLE QUOTATION MARK' has been the preferred character - see https://en.wikipedia.org/wiki/Modifier_letter_apostrophe.
> Can't do it now
You already do, in en_?? locales. I repeat, the above is what LibreOffice already does in en_?? locales when an apostrophe is typed inside a word, i.e. with no delimiter character following it. (This holds only if the user didn't specify and enable custom single quote characters. That limitation is actually a bug, too, though I guess developers wanted to keep it that way as long as there is no option to specify and enable a custom 'apostrophe' character, which would then be used in place of U+2019.)
Now my report is about de_DE, where basically the same rules apply except for some differences in the default characters for 'single quotes', and here LibreOffice doesn't do it - it uses U+2018 'LEFT SINGLE QUOTATION MARK' when an apostrophe is typed inside a word, which is always wrong. Which is why this is a bug.
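For reference, the code points discussed in this thread can be looked up with Python's standard `unicodedata` module. The helper name `describe` is made up for this example; the character names and general categories come from the Unicode Character Database shipped with Python.

```python
import unicodedata

# The characters discussed in this report.
CANDIDATES = ["\u0027", "\u2018", "\u2019", "\u02BC"]


def describe(ch: str) -> str:
    """Return code point, Unicode name and general category of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(describe(ch))
```

Note that U+02BC is classified as a modifier letter (Lm), while U+2018 and U+2019 are punctuation (Pi and Pf).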
The German typographical apostrophe is U+2019, which looks like a raised comma. There are some ideas now:
A. Detect whether the straight apostrophe U+0027 to be replaced is neither inside a single quotation nor directly after a starting single quotation mark. I think that might work when entering new text at the end of a document, but it will be difficult when editing existing text.
B. If the U+0027 sign to be replaced is inside a word, replace it with a typographical apostrophe. That would not catch all cases for a typographical apostrophe, but would be better than the current situation.
C. Provide a shortcut for entering a typographical apostrophe, independent of keyboard layout and locale. The same goes for other characters needed for other languages; perhaps allow the user to customize it. I know that Linux has means to customize the keyboard layout, but those do not exist on Windows.
I hope Eike can tell which of these is a practical idea.
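Ideas A and B could be combined roughly as follows. This is a hypothetical Python sketch, not LibreOffice's implementation: it assumes the German defaults ‚ (U+201A) and ‘ (U+2018) for single quotes, looks only at the current paragraph, and all names are invented for the example.

```python
START_QUOTE = "\u201A"  # ‚ single start quote (assumed German default)
END_QUOTE = "\u2018"    # ‘ single end quote (assumed German default)
APOSTROPHE = "\u2019"   # ’ typographic apostrophe


def has_open_single_quote(before: str) -> bool:
    """True if the last single start quote in `before` is still unclosed."""
    last_start = before.rfind(START_QUOTE)
    return last_start != -1 and END_QUOTE not in before[last_start + 1:]


def replacement_for_typed_apostrophe(before: str, after: str = "") -> str:
    """Choose the character to insert when U+0027 is typed.

    `before` and `after` are the paragraph text on either side of the
    cursor; `after` is empty when typing at the end of the text.
    """
    prev = before[-1:]
    # At the start of a word, the key can only mean an opening quote.
    if prev == "" or prev.isspace() or prev in "([{\u201E":
        return START_QUOTE
    # Idea B: letters on both sides, so this is an apostrophe.
    if prev.isalpha() and after[:1].isalpha():
        return APOSTROPHE
    # Idea A: only treat it as an end quote if a quotation is open.
    if has_open_single_quote(before):
        return END_QUOTE
    return APOSTROPHE
```

The ambiguous case raised earlier in the thread ("Hans’ Auto" typed inside an open single quotation) still comes out as an end quote in this sketch, because at that point the text alone does not say which meaning is intended.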
> There are some ideas now Thanks, sounds like a good start. Perhaps it might help to find out what the existing code already does for en_??, but doesn't do for de_DE? (I'm doing a git clone right now, but I fear it will take substantially longer for me even to find the relevant code than it will take for the pros here to fix it - the more so since I'm not fluent in C++...)
@Eike, do you have any opinion on this issue?
Likely the best option is to replace the default single quote replacement with default curly apostrophe usage.
[László Németh:]
> Likely the best option to replace the default single quote replacement with
> default curly apostrophe usage.
Just changing glyphs within the existing logic won't solve this, it will just lead to different errors in typography. As I understand the logic behind what goes on in the code, there needs to be code to decide whether the apostrophe someone types on their keyboard is to be interpreted as 'single end quote' or as 'apostrophe'. The code already exists. The implementation of autocorrection for English locales has such code. It is not perfect, but works for the vast majority of cases. It's just that the German locale is missing that piece of code (or possibly even just the call to that piece of code), which is why when someone types an apostrophe following a letter character, it's always the glyph for 'closing single quote' that will be inserted into the text, never the (curly) apostrophe. It would not really help if it was the other way round, though.
*** Bug 132985 has been marked as a duplicate of this bug. ***
László Németh committed a patch related to this issue. It has been pushed to "master": https://git.libreoffice.org/core/commit/a0c90f1bccd9b5a349d3199746facab549f27dba tdf#128860 AutoCorrect: fix apostrophe in Czech, German, It will be available in 7.1.0. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
László Németh committed a patch related to this issue. It has been pushed to "libreoffice-7-0": https://git.libreoffice.org/core/commit/c3ef223ba5f893f8096d205ef09b5f5262ab6baa tdf#128860 AutoCorrect: fix apostrophe in Czech, German, It will be available in 7.0.0.1. The patch should be included in the daily builds available at https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: https://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
(In reply to Rob Schroeder from comment #12) The recent solution – checking preceding quote marks – fixes most of the problems. A possible/remaining improvement could be to handle apostrophe usage within second level quotations, based on the direct continuation of the word after the apostrophe, i.e. fixing „… ‚word‘s as „… ‚word’s... Thanks for your bug report and help!
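The remaining improvement mentioned above (turning „… ‚word‘s into „… ‚word’s once the word continues) could look roughly like this. It is a hypothetical Python sketch of the idea only, not the committed patch; `type_char` and the constants are invented names, and the German default end quote ‘ (U+2018) is assumed.

```python
END_QUOTE = "\u2018"   # ‘ single end quote (assumed German default)
APOSTROPHE = "\u2019"  # ’ typographic apostrophe


def type_char(text: str, ch: str) -> str:
    """Append `ch` to `text`, correcting a premature end quote.

    If a letter is typed directly after an end quote that itself follows
    a letter, the word is continuing, so the end quote is retroactively
    turned into an apostrophe.
    """
    continues_word = (
        ch.isalpha()
        and len(text) >= 2
        and text[-1] == END_QUOTE
        and text[-2].isalpha()
    )
    if continues_word:
        text = text[:-1] + APOSTROPHE
    return text + ch
```

Typing a space or punctuation after the end quote leaves it untouched, so ordinary second-level quotations are not affected.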
Works in libreoffice-7-0_2020-06-08_06.00.18_LibreOfficeDev_7.0.0.0.beta1 (.deb, Linux Mint). Excellent, thanks!
(In reply to Rob Schroeder from comment #17)
> Works in libreoffice-7-0_2020-06-08_06.00.18_LibreOfficeDev_7.0.0.0.beta1
> (.deb, Linux Mint). Excellent, thanks!
Thanks for verification!
*** Bug 131515 has been marked as a duplicate of this bug. ***