Description:
I'm looking for a way to merge two words together using the autocorrect function in Writer. (I use the Polish language in my work.) My aim is to type a word, add a space (so that the word is corrected if necessary, e.g. diacritics are added), and then type a suffix as a separate word. The suffix should be autocorrected (again, diacritics, etc.) and the space preceding it removed, as if by hitting the Backspace key.

The reason for this request is that I'm making a huge list of autocorrect definitions. Polish is chock-full of diacritics, so I'm using autocorrect to be able to type without thinking about them, letting the software auto-add the various dots and squiggles where necessary. To make my task even more complicated, Polish is also an inflected language, which means that every one of these words has multiple variants depending on its inflection. Adding a suffix on top of this (each suffix taking a different form depending on inflection) leads to a huge number of necessary autocorrect definitions for one word in all its forms.

What I'd like to do is:
1. Define for autocorrect purposes the base form of a word (e.g. change mogl to mógł). This is not a problem, as autocorrect already does it.
2. Define a suffix after a space (e.g. bys) that will be autocorrected (e.g. byś), but ALSO have the space before it removed, so that the suffix is joined with the preceding word, resulting in: mógłbyś.

It seems removing that space could be done by regex, which autocorrection currently does not support, as another user advised me here: https://ask.libreoffice.org/t/can-you-auto-delete-the-space-before-an-auto-corrected-word/101757.

Steps to Reproduce:
1. In autocorrect options (I use the Polish language) define: mogl to be autocorrected to: mógł.
2. In autocorrect options define: bys (that's [space]bys) to be autocorrected to: byś (that is [no space]byś).
3. Close the options. In a Writer document type: mogl bys.
Actual Results:
After the above steps the actual result is: mógł byś (diacritics corrected properly, space persists)

Expected Results:
I'd like the result to be: mógłbyś (diacritics in both words corrected properly, space removed, words joined)

Reproducible: Always

User Profile Reset: No

Additional Info:
Version: 7.6.4.1 (X86_64) / LibreOffice Community
Build ID: e19e193f88cd6c0525a17fb7a176ed8e6a3e2aa1
CPU threads: 4; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-GB (en_GB); UI: en-GB
Calc: CL threaded
Interesting, but I'm not sure it is feasible, given that autocorrect entries are a simple list of 2-tuples (the entry and its replacement string). It would require substantial dev effort.
Thank you for looking into it, Stuart.

Actually, the list of such 2-tuples in Polish would be massive, as -byś, -bym, -by, -byście, -byśmy are super common suffixes that you add to verbs in the past tense to create conditional verbs. Then there are suffixes that result in participles: verb + -jąc/-jący/-jącym/-jącego/-jącemu/-jąca/-jącą/-jącej/-jące/-jącym/-jących/-jącymi (all of which are a sort of equivalent of -ing).

You can see how long a list of definitions is required for a single verb, and that's only for two suffixes. It would be so much easier and more efficient if I defined autocorrection for as many verbs as possible and had a finite list of such suffixes that I could attach as I type.

At the moment I've found a way around it: I type the verbs and their suffixes separately, and then, after a day's work, I Ctrl+H them:
find: [space]-byś (etc.) / replace: [w/o space]byś

That kind of does the job, but severely obstructs the flow. As a writer I'd be super grateful for smoothing it out.

I'm positive that the option to autocorrect a word while at the same time deleting the space preceding it, effectively joining the autocorrected word with the one before it, would find many other uses, too.

Thanks again,
Mac
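The end-of-day Ctrl+H pass described above could also be run as one batch step over exported text. The following is only a minimal sketch in Python, not a LibreOffice feature: the function name `join_suffixes` is hypothetical, the suffix list is the conditional set named in this comment, and it assumes each suffix was typed as [space]-suffix.

```python
import re

# Conditional suffixes listed in the comment above; extend as needed.
SUFFIXES = ("byście", "byśmy", "byś", "bym", "by")

# Longest alternatives first, so "byście" is not cut short as "by".
# (?!\w) makes sure the suffix is a whole token, not the start of a longer word.
_PATTERN = re.compile(
    r" -(" + "|".join(sorted(SUFFIXES, key=len, reverse=True)) + r")(?!\w)"
)


def join_suffixes(text: str) -> str:
    """Remove the '[space]-' marker so the suffix attaches to the previous word."""
    return _PATTERN.sub(r"\1", text)


if __name__ == "__main__":
    print(join_suffixes("mógł -byś to zrobić"))
```

A marker followed by anything that is not on the list (for example " -byk") is left untouched, which is why the check for a following word character is there.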
László, do you have an idea?
Such a feature would also help the Marathi language. People often incorrectly write "घरा चा" or "घरा ची" (with a space) instead of "घराचा" / "घराची" (without a space). There should be a parameter to declare such suffixes so that they are joined with the preceding word. Suffixes like "चा" or "ची" are not accepted words in Marathi and are never used separately after a space (just as "ed" is not a word in English but is correct in "worked").
A possible solution is to use a non-space separator, e.g. a comma, and the .* pattern to recognize the suffix at the end of the character sequence:

mogl -> mógł
.*,bys -> byś

After typing the comma, mogl is changed to mógł. After continuing with bys:

mogl,bys -> mógłbyś

The other solution is to use the Hunspell spell checker to add the missing diacritics, accepting its suggestions automatically. It seems the dictionary contains all the Polish diacritics, so it can suggest the right alternatives with diacritics:

== pl_PL.aff ==
MAP 8
MAP aą
MAP cć
MAP eę
MAP lł
MAP nń
MAP oóu
MAP sś
MAP zżź

So with a LibreBasic or pyUNO macro, it's possible to add the missing diacritics automatically (except when the result is ambiguous), e.g. by clicking on a button at the end of the document editing.

As a starting point, see for example the following LibreBasic code snippet from https://forum.openoffice.org/en/forum/viewtopic.php?t=1222, which uses the XSpellChecker service of the LibreOffice UNO API via com.sun.star.linguistic2.LinguServiceManager:

Sub WrongWordsList
    Dim oDocModel as Variant
    Dim oTextCursor as Variant
    Dim oLinguSvcMgr as Variant
    Dim oSpellChk as Variant
    Dim oListDocFrame as Variant
    Dim oListDocModel as Variant
    Dim sListaPalabras as String
    Dim aProp() As New com.sun.star.beans.PropertyValue

    ' Get the active document and make sure it is a text document
    oDocModel = StarDesktop.CurrentFrame.Controller.getModel()
    If IsNull(oDocModel) Then
        MsgBox("There's no active document." + Chr(13))
        Exit Sub
    End If
    If Not HasUnoInterfaces (oDocModel, "com.sun.star.text.XTextDocument") Then
        MsgBox("This document doesn't support the 'XTextDocument' interface." + Chr(13))
        Exit Sub
    End If

    oTextCursor = oDocModel.Text.createTextCursor()
    oTextCursor.gotoStart(False)

    ' Obtain the spell checker from the linguistic service manager
    oLinguSvcMgr = createUnoService("com.sun.star.linguistic2.LinguServiceManager")
    If Not IsNull(oLinguSvcMgr) Then
        oSpellChk = oLinguSvcMgr.getSpellChecker()
    End If
    If IsNull (oSpellChk) Then
        MsgBox("It's not possible to access the spell checker." + Chr(13))
        Exit Sub
    End If

    ' Walk through the document word by word and collect the misspelled ones
    Do
        If oTextCursor.isStartOfWord() Then
            oTextCursor.gotoEndOfWord(True)
            ' Check whether the word is spelled correctly
            If Not isEmpty (oTextCursor.getPropertyValue("CharLocale")) Then
                If Not oSpellChk.isValid(oTextCursor.getString(), oTextCursor.getPropertyValue("CharLocale"), aProp()) Then
                    sListaPalabras = sListaPalabras + oTextCursor.getString() + Chr(13)
                End If
            End If
            oTextCursor.collapseToEnd()
        End If
    Loop While oTextCursor.gotoNextWord(False)

    If Len(sListaPalabras) = 0 Then
        MsgBox("There are no errors in the document.")
        Exit Sub
    End If

    ' Reuse the list document if it is already open, otherwise create it
    oListDocFrame = StarDesktop.findFrame("fListarPalabrasIncorrectas", com.sun.star.frame.FrameSearchFlag.ALL)
    If IsNull(oListDocFrame) Then
        oListDocModel = StarDesktop.loadComponentFromURL("private:factory/swriter", "fListarPalabrasIncorrectas", com.sun.star.frame.FrameSearchFlag.CREATE, aProp())
        oListDocFrame = oListDocModel.CurrentController.getFrame()
    Else
        oListDocModel = oListDocFrame.Controller.getModel()
    End If

    ' Append the collected words to the list document and bring it to the front
    oTextCursor = oListDocModel.Text.createTextCursor()
    oTextCursor.gotoEnd(False)
    oListDocModel.Text.insertString (oTextCursor, sListaPalabras, False)
    oListDocFrame.activate()
End Sub

And here is another code snippet that modifies the wrong words (but also fixes a problem in XSpellChecker usage that has since been solved): https://forum.openoffice.org/en/forum/viewtopic.php?p=425651
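To illustrate the "accept the suggestion unless it is ambiguous" idea, here is a small self-contained Python sketch. It does not call Hunspell or the UNO API: the `LEXICON` set is a tiny hypothetical stand-in for the dictionary, the function names `candidates` and `restore` are made up for this example, and only the character classes are taken from the MAP lines quoted above.

```python
from itertools import product

# Character classes copied from the MAP lines of pl_PL.aff quoted above.
MAP_CLASSES = ["aą", "cć", "eę", "lł", "nń", "oóu", "sś", "zżź"]

# Each character maps to every member of its class (including itself).
ALTERNATIVES = {ch: cls for cls in MAP_CLASSES for ch in cls}

# Hypothetical stand-in for the real dictionary.
LEXICON = {"mógł", "byś", "zależny", "źle", "złe"}


def candidates(word: str) -> set[str]:
    """All spellings reachable by swapping characters within their MAP class."""
    pools = [ALTERNATIVES.get(ch, ch) for ch in word]
    return {"".join(combo) for combo in product(*pools)}


def restore(word: str, lexicon: set[str] = LEXICON) -> str:
    """Return the unique dictionary match, or the word unchanged if none or several."""
    if word in lexicon:
        return word
    hits = candidates(word) & lexicon
    return next(iter(hits)) if len(hits) == 1 else word


if __name__ == "__main__":
    for typed in ("mogl", "zalezny", "zle"):
        print(typed, "->", restore(typed))
```

In this sketch "zle" stays untouched because both "źle" and "złe" are in the lexicon, which is the ambiguous case mentioned above. The candidate set grows exponentially with word length, so a real macro would rather rely on the spell checker's own suggestions than enumerate spellings like this.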
Awesome! László's solution works for me like a charm. Thank you.
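For readers who want to see why the comma workaround produces mógłbyś, the following Python sketch replays the keystrokes. It is a toy model of the behaviour described in the previous comment, not LibreOffice's actual autocorrect code; the names `EXACT`, `SUFFIX`, `correct` and `type_text` are hypothetical, and the assumption that a comma or a space triggers a lookup is taken from the description in this thread.

```python
# Plain entries: the whole word is replaced.
EXACT = {"mogl": "mógł"}

# Stands for the entry ".*,bys -> byś": the key is the literal tail after ".*".
SUFFIX = {",bys": "byś"}


def correct(token: str) -> str:
    """Look up a whole token first, then try the suffix patterns."""
    if token in EXACT:
        return EXACT[token]
    for tail, replacement in SUFFIX.items():
        if token.endswith(tail) and len(token) > len(tail):
            # Only the matched tail (separator included) is replaced.
            return token[: -len(tail)] + replacement
    return token


def type_text(keys: str) -> str:
    """Replay keystrokes; a comma or a space triggers correction of the current token."""
    finished = ""  # text before the current token
    token = ""     # characters typed since the last space
    for ch in keys:
        if ch == ",":
            token = correct(token) + ","
        elif ch == " ":
            finished += correct(token) + " "
            token = ""
        else:
            token += ch
    return finished + token


if __name__ == "__main__":
    print(type_text("mogl,bys "))
```

The comma first turns mogl into mógł; at the following space the token "mógł,bys" ends in ",bys", so the separator and suffix are replaced together and the two parts end up joined.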
(In reply to Mac from comment #6)
> Awesome! László's solution works for me like a charm. Thank you.

Let me add that this solution works just as well with prefixes. In that case the .* pattern needs to be used after the prefix and the non-space separator (in this example a comma), e.g.:

pol,.* -> pół
zalezny -> zależny

After typing the comma, pol is changed to pół. Any word typed after it is then joined to pół. In this case:

pol,zalezny -> półzależny

---
You can even create long words with both prefixes and suffixes, e.g. by defining the following in autocorrect:

pol,.* -> pół (a common prefix, meaning semi-)
zalezny -> zależny
.*,ch -> ch (one of the many inflectional morphemes)

you can type:

pol,zalezny,ch -> półzależnych

Thank you again! You're the best :)
Actually, not exactly: with prefixes, whatever comes after the prefix doesn't get corrected, but it's still better than nothing.

(I'm sorry, there's no option to edit a post. Please feel free to remove whatever parts of what I'm saying you deem irrelevant or muddying.)
This (use of comma) workaround is intended for those who already know the correct spelling. Spell checkers are designed for novice users who may think what they've typed is correct. By the way, I was unaware that this was possible. It has rendered hundreds, if not thousands, of auto-correct entries in the Marathi language pack obsolete. I appreciate your efforts, but what I find frustrating is that it's not adequately documented, especially with all the use cases as mentioned above.
Usually the close-up sign ⁐ (Unicode U+2050) is used by proofreaders to mark a space that should be removed. For example, if I type "I am work ing hard." in Google Docs, it suggests "working". LibreOffice Writer's suggestions are way out of context. If this is unrelated to the current topic, let me open a new feature request.
(In reply to Mac from comment #8) > ...with prefixes whatever is after the prefix doesn't get corrected... (In reply to Shantanu from comment #9) > This (use of comma) workaround is intended for those who already know the > correct spelling.... > ... I find frustrating is that it's not adequately documented... Since the ticket is still flagged as UX relevant I wonder what's missing. Or should we forward the solution to documentation?
(In reply to Heiko Tietze from comment #11)
> (In reply to Mac from comment #8)
> > ...with prefixes whatever is after the prefix doesn't get corrected...
>
> (In reply to Shantanu from comment #9)
> > This (use of comma) workaround is intended for those who already know the
> > correct spelling....
> > ... I find frustrating is that it's not adequately documented...
>
> Since the ticket is still flagged as UX relevant I wonder what's missing. Or
> should we forward the solution to documentation?

Oh, by all means, my initial question/request has been answered beautifully. It's a neat method for such highly inflected languages as Polish (and, I gather, Marathi). So it can go to documentation as a solution for adding suffixes and inflectional morphemes that need to be automatically attached to the root word (and, as a bonus, autocorrected as well if there's a need for that).

---
Since you are asking what is missing: it's the autocorrection of the root word in the middle (the part without the .* pattern and non-space separator).

In other words: prefix[corrected],root[not corrected],suffix[corrected]
(the root may stay uncorrected because the autocorrect function recognises it not as the root alone, but as the chain consisting of [prefix][comma][uncorrected root]).

I'll have to investigate it once again, though, because I'm almost positive the root got corrected in a few words I initially typed as a test.
(In reply to Mac from comment #12)
> ---
> Since you are asking what is missing - it's the autocorrection of the root
> word in the middle (the part without the .* pattern and non-space
> separator).
>
> In other words: prefix[corrected],root[not corrected],suffix[corrected]
> (the root may stay not autocorrected because the autocorrect function
> recognises it not as the root alone, but as the chain consisting of
> [prefix][comma][uncorrected root].

Yes, I've tested it and can confirm now that this is how it works at the moment.

Someone mentioned before that it would be more difficult than suffixes, but how cool would it be if a user could "tell" the program to treat a particular non-space separator [e.g. comma] like a space. Or maybe is there such space separator already? Something other than comma? (But TBH comma is placed in a very convenient spot on the keyboard...)
[An Edit Comment option would be handy] I meant: Or maybe is there such a space-like separator already?
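The wish for a separator that is treated like a space can be sketched as follows. This is purely a hypothetical illustration in Python of the requested behaviour, not something LibreOffice offers: every segment between separators is looked up on its own and the corrected segments are then joined. The table `EXACT` and the function `join_segments` are invented for the example.

```python
# Hypothetical plain autocorrect entries, taken from examples in this thread.
EXACT = {
    "mogl": "mógł",
    "bys": "byś",
    "pol": "pół",
    "zalezny": "zależny",
    "mysl": "myśl",
}


def join_segments(token: str, sep: str = ",") -> str:
    """Correct each separator-delimited segment independently, then join them."""
    core = token.rstrip(sep)      # keep ordinary trailing punctuation as typed
    trailing = token[len(core):]
    parts = core.split(sep)
    # Leave anything that is not purely letters alone, e.g. the decimal "1,5".
    if not all(part.isalpha() for part in parts):
        return token
    return "".join(EXACT.get(part, part) for part in parts) + trailing


if __name__ == "__main__":
    for typed in ("mogl,bys", "pol,zalezny,ch", "za,mysl", "tak,", "1,5"):
        print(typed, "->", join_segments(typed))
```

Because each segment is looked up separately, the root in the middle gets corrected too, and no .* entries are needed. The guards for trailing commas and for digits are there because the comma keeps its normal meaning everywhere else in the text.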
One more thing worth noting when it comes to prefixes: there's a difference between words whose root doesn't need autocorrecting and words whose root does (e.g. words with diacritics).

---
Example:

[prefix]
za

[roots]
baw
myśl

---
In autocorrect:

za,.* -> za
mysl -> myśl
[baw doesn't need to be autocorrected - no diacritics]

---
When I type:

za,baw -> zabaw
za,mysl -> zamysl [joined, but 'mysl' not corrected to 'myśl']
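The observation above can be reproduced with a toy model in Python. It is only a sketch of the behaviour reported in this thread, not LibreOffice's implementation; `EXACT`, `PREFIX` and `correct` are hypothetical names, and the assumption (from comment #12) is that the lookup sees the whole chain [prefix][comma][root] as a single token.

```python
# Plain entry: replaces the word only when it is typed on its own.
EXACT = {"mysl": "myśl"}

# Stands for the entry "za,.* -> za": the key is the literal head before ".*".
PREFIX = {"za,": "za"}


def correct(token: str) -> str:
    """Look up the whole token, then try the prefix patterns."""
    if token in EXACT:
        return EXACT[token]
    for head, replacement in PREFIX.items():
        if token.startswith(head) and len(token) > len(head):
            # The remainder is copied as typed; it is never looked up by itself.
            return replacement + token[len(head):]
    return token


if __name__ == "__main__":
    for typed in ("mysl", "za,baw", "za,mysl"):
        print(typed, "->", correct(typed))
```

In this model "za,mysl" never equals the key "mysl", so only the prefix rule fires and the root keeps its uncorrected spelling, which matches the zamysl result reported above.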