Steps to reproduce:
1. Open a blank Writer document.
2. Switch the keyboard to Thai.
3. Type the text “มิถุ” by pressing “,b5”.
4. The autocomplete function will display the full word.

Expected result: Autocomplete should display the text as “มิถุนายน”.

Actual result: The text is displayed with the second character doubled and overlapped. (See the pictures)

This problem occurs when some text is split into 2 runs. If a non-spacing vowel mark (e.g. ุ) is the first character of the second run, the text is displayed overlapped. However, the problem only occurs for the first (regarding position in the document) autocompleted word in a document.

I've pasted 2 snippets of content.xml to show examples:

A: Western autocomplete case:
<text:p text:style-name="Standard">January</text:p><text:p text:style-name="Standard">february</text:p>

B1: Thai autocomplete case with overlapped text:
<text:p text:style-name="Standard">มิถ<text:span text:style-name="T1"> ุนายน</text:span></text:p><text:p text:style-name="Standard"><text:span text:style-name="T1">มิถุนายน</text:span></text:p>

B2: Thai autocomplete case without problem:
<text:p text:style-name="Standard">กรก<text:span text:style-name="T1"> ฎาคม</text:span></text:p><text:p text:style-name="Standard"><text:span text:style-name="T1">สิงหาคม</text:span></text:p>

You will see that, in cases B1 and B2, the first autocompleted word is split into two runs (in B1, "มิถุนายน" is split right before the vowel mark ุ). But the second autocompleted word (e.g. "มิถุนายน") is placed entirely inside the text:span element. OOo seems unable to display a non-spacing mark at the beginning of a text run, which is why the text is displayed overlapped.

Note that the split always happens if you insert the word before any other autocompleted word. That is, it happens only for the first occurrence of such autocompleted words.
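To make the B1/B2 difference concrete, here is a minimal Python sketch that flags a run boundary where the following run starts with a non-spacing mark (Unicode general category Mn). The function names and the run lists are illustrative only and are not taken from the Writer code; the run contents mirror the B1 and B2 snippets, with the leading space inside the pasted spans left out.

```python
import unicodedata


def starts_with_nonspacing_mark(run: str) -> bool:
    """Return True if the run's first character has Unicode category Mn."""
    return bool(run) and unicodedata.category(run[0]) == "Mn"


def risky_run_boundaries(runs):
    """Return indices of runs (other than the first) that begin with a non-spacing mark."""
    return [
        i for i, run in enumerate(runs)
        if i > 0 and starts_with_nonspacing_mark(run)
    ]


# B1: "มิถ" + "ุนายน" -> the second run begins with U+0E38 (SARA U), a non-spacing mark.
b1_runs = ["\u0e21\u0e34\u0e16", "\u0e38\u0e19\u0e32\u0e22\u0e19"]

# B2: "กรก" + "ฎาคม" -> the second run begins with U+0E0E, a base consonant.
b2_runs = ["\u0e01\u0e23\u0e01", "\u0e0e\u0e32\u0e04\u0e21"]

print(risky_run_boundaries(b1_runs))
print(risky_run_boundaries(b2_runs))
```

Only B1 has a boundary of this kind, which matches the observation that B1 is displayed overlapped while B2 is not.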
Created attachment 43411 [details] Both lines are autocompleted. Only the first occurrence displays the doubled character.
Because the autocorrect seems to only collect words that are typed, and not pasted in, what are the keystrokes to generate the full มิถุนายน string?
I did a related fix around some of this code here:

commit 4635182111d70e7c0dfc3a2fc1de61846c82e8a1
Author: Michael Meeks <michael.meeks@novell.com>
Date: Fri Dec 3 20:26:24 2010 +0000

    autocomplete using the context's case i#22961

 22 2 sw/source/ui/docvw/edtwin.cxx

possibly breaking CTL of course ;-)
Created attachment 43953 [details] The experimental patch :-
I experimented with the following patch and then tried the same autocomplete test case as above. The resulting content.xml is shown below. The bug appears to be solved. I'm wondering why Writer must specify the language when it autocompletes some text. Why is it needed?
Created attachment 43954 [details] The result content.xml :-
A: content.xml from LO before applying the experimental patch (see the next attachment):
<style:style style:name="T1" style:family="text"> <style:text-properties style:language-complex="th" style:country-complex="TH"/> </style:style>
....snip.....
<text:p text:style-name="Standard">มิถ<text:span text:style-name="T1"> ุนายน</text:span></text:p>

B: after applying the experimental patch, there's no auto-style T1:
<text:p text:style-name="Standard">มิถุนายน</text:p>

I'm wondering, in the first case, why the auto-style T1 that switches to locale th_TH is needed at all?
Created attachment 44017 [details] The bugdoc, odt with overlapped autocomplete text
Created attachment 45511 [details] Experimental patch to see what happens if we don't call SetLanguage() when inserting a string from autocorrect. The result is that this bug is gone. We haven't seen any side effects yet. So I am wondering whether calling SetLanguage() before inserting a string from autocorrect is needed at all, and why? The user is typing in the same language as the autocorrect string anyway. Why switch to the same language?
Okey dokey. Windows and Linux IMs work differently. The Windows one provides the language used, while the Linux ones don't. That's why I couldn't reproduce this. With some hackery in place I can see this problem now.
So I think comment #9 is basically correct. I don't have Windows running here at the moment, so I'm going to speculate...

On a Thai system with a Thai IM (Input Method), no explicit language is forced onto text entered through the IM, as the defaults are considered to be OK. When the autocorrect is triggered, it asks for the current IM language, which is Thai, and forces it onto the accepted output. That gives two runs: a "standard" run, which is determined to be Thai because it's got CTL chars in it and the doc default CTL language is Thai, and another run which is explicitly Thai. This triggers some other bugs down the line.

So, would the following patch work? It follows the same pattern as #9, but moves the chunk of code that determines "should the language be forced to be the same as the current Windows IM" when normally typing text into a shared location, and reuses that logic for setting the language of the autocomplete text.
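The reasoning above can be sketched as a toy model in Python. This is not the Writer code: `Run`, `should_force_language` and `insert_text` are hypothetical names, and the rule "force only when the IM language differs from the document default" is an assumption made for illustration, since the thread does not spell out the exact condition used by the shared logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    text: str
    lang: Optional[str]  # None means "inherit the document default language"


def should_force_language(im_lang: Optional[str], doc_default_lang: str) -> bool:
    """Assumed shared rule: force a language only if the IM language differs from the default."""
    return im_lang is not None and im_lang != doc_default_lang


def insert_text(runs: List[Run], text: str, im_lang: Optional[str],
                doc_default_lang: str, always_force: bool = False) -> List[Run]:
    """Append text, starting a new run only when its language attribute differs."""
    if im_lang is not None and (always_force or should_force_language(im_lang, doc_default_lang)):
        lang = im_lang  # explicit language attribute on the inserted text
    else:
        lang = None     # no attribute, inherit the default
    if runs and runs[-1].lang == lang:
        runs[-1].text += text
    else:
        runs.append(Run(text, lang))
    return runs


# Old behaviour (assumed): autocomplete always forces the IM language -> two runs.
old = insert_text([Run("abc", None)], "def", "th", "th", always_force=True)

# Shared rule: IM language equals the default, nothing is forced -> one run.
new = insert_text([Run("abc", None)], "def", "th", "th")

print(len(old), len(new))
```

In this model, forcing a language that is already the effective default is what creates the extra run boundary, so reusing the typing-time decision for autocomplete keeps the completed word in a single run.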
Created attachment 47481 [details] does this work for you ?
Yes. It's working fine here. We'll do more testing anyway.
Let's assume this is the correct approach, and open a new bug (against me) if there turn out to be more side effects.
I've seen the patch in master and tested it. It works fine. However, the Thai users still face the problem because the patch isn't in 3.4.x. Can someone pick the patch for the next releases?
Can you send a mail to the dev mailing list if you want to propose this for 3-4? [REVIEW] is the usual header. Mention the bug id and that you want to propose it for backporting/cherry-picking to the 3-4 series.