33092 – Autocomplete display double character for this word [CTL / Thai]

Status:	RESOLVED FIXED

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	Writer
Version: (earliest affected)	3.3.2 release
Hardware:	Other Windows (All)

Importance:	medium normal
Assignee:	Caolán McNamara

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	AutoCorrect-Complete 33891

Reported:	2011-01-13 23:59 UTC by Tantai
Modified:	2016-10-19 23:31 UTC
CC List:	3 users

See Also:
Crash report or crash signature:

Attachments
Both lines are autocompleted. Only the first occurrence displays the doubled character. (12.03 KB, image/png) 2011-02-15 21:05 UTC, Samphan Raruenrom
The experimental patch (530 bytes, patch) 2011-03-01 01:21 UTC, Tantai
The resulting content.xml (2.82 KB, text/xml) 2011-03-01 01:22 UTC, Tantai
The bugdoc, an odt with overlapped autocomplete text (8.78 KB, application/vnd.oasis.opendocument.text) 2011-03-02 04:39 UTC, Samphan Raruenrom
Experimental patch to see what happens if we don't SetLanguage() when inserting a string from autocorrect (1.21 KB, patch) 2011-04-12 04:14 UTC, Samphan Raruenrom
does this work for you ? (11.22 KB, patch) 2011-06-03 08:49 UTC, Caolán McNamara

Description Tantai 2011-01-13 23:59:06 UTC

Steps to reproduce:

1. Open a blank Writer document.
2. Switch the keyboard to Thai.
3. Type the text “มิถุ” by pressing “,b5”.
4. The autocomplete function will display the full word.

Expected result:
The function should display the text as “มิถุนายน”.

Actual result:
It displays the text with the second character doubled and overlapped.
(See the attached picture.)

This problem occurs when the text is split into 2 runs. If a non-spacing
vowel mark (e.g. ุ) is the first character of the second run, the text
is displayed overlapped. The problem occurs only in the first (by position
in the document) autocompleted word in a document. I've pasted some
snippets of content.xml to show examples:

A: Western autocomplete case:
<text:p text:style-name="Standard">January</text:p>
<text:p text:style-name="Standard">february</text:p>

B1: Thai autocomplete case with overlapped text:
<text:p text:style-name="Standard">มิถ<text:span text:style-name="T1">ุนายน</text:span></text:p>
<text:p text:style-name="Standard"><text:span text:style-name="T1">มิถุนายน</text:span></text:p>

B2: Thai autocomplete case without problem:
<text:p text:style-name="Standard">กรก<text:span text:style-name="T1">ฎาคม</text:span></text:p>
<text:p text:style-name="Standard"><text:span text:style-name="T1">สิงหาคม</text:span></text:p>

You can see that, in cases B1 and B2, the first autocompleted word is split
into two runs (in B1, "มิถุนายน" is split right before the vowel mark ุ). The
second autocompleted word, however, is placed entirely inside the text:span
element. OOo seems unable to display a non-spacing mark at the beginning of
a text run, which is why the text is displayed overlapped.

Note that the autocompleted word is always split if you insert it before any
other autocompleted word. That is, the split happens only for the first
occurrence of such autocompleted words.
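
For illustration, the condition described above (a run whose first character is a non-spacing mark) can be checked with a few lines of Python. This is only a sketch: the helper name starts_with_nonspacing_mark is made up for this example, and the run strings are the ones from snippets B1 and B2, written as escapes so the code points are unambiguous.

```python
import unicodedata

def starts_with_nonspacing_mark(run: str) -> bool:
    """True if the run's first character has Unicode category Mn (non-spacing mark)."""
    return bool(run) and unicodedata.category(run[0]) == "Mn"

# Runs taken from the content.xml snippets in this report.
b1_runs = ["\u0e21\u0e34\u0e16", "\u0e38\u0e19\u0e32\u0e22\u0e19"]  # split before SARA U (U+0E38)
b2_runs = ["\u0e01\u0e23\u0e01", "\u0e0e\u0e32\u0e04\u0e21"]        # split before a consonant

print([starts_with_nonspacing_mark(r) for r in b1_runs])  # [False, True]
print([starts_with_nonspacing_mark(r) for r in b2_runs])  # [False, False]
```

Only the second run of B1 starts with a non-spacing mark, which matches the case that is displayed overlapped.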

Comment 1 Samphan Raruenrom 2011-02-15 21:05:17 UTC

Created attachment 43411 [details]
Both lines are autocompleted. Only the first occurrence displays the doubled character.

Comment 2 Caolán McNamara 2011-02-16 06:55:09 UTC

Because the autocorrect seems to collect only words that are typed, not pasted in, what are the keystrokes to generate the full มิถุนายน string?

Comment 3 Michael Meeks 2011-02-17 03:27:57 UTC

I did a related fix around some of this code here:

commit 4635182111d70e7c0dfc3a2fc1de61846c82e8a1
Author: Michael Meeks <michael.meeks@novell.com>
Date:   Fri Dec 3 20:26:24 2010 +0000

    autocomplete using the context's case i#22961

22      2       sw/source/ui/docvw/edtwin.cxx

possibly breaking CTL of course ;-)

Comment 4 Tantai 2011-03-01 01:21:20 UTC

Created attachment 43953 [details]
The experimental patch :-

Comment 5 Tantai 2011-03-01 01:21:40 UTC

I experimented with the attached patch and then tried the same autocomplete test case as above. The resulting content.xml is attached below. The bug appears to be solved. I'm wondering why Writer must specify the language when it autocompletes some text. Why is that needed?

Comment 6 Tantai 2011-03-01 01:22:42 UTC

Created attachment 43954 [details]
The result content.xml :-

Comment 7 Samphan Raruenrom 2011-03-02 04:37:37 UTC

A: content.xml from LO before applying the experimental patch (see the next attachment)
<style:style style:name="T1" style:family="text">
   <style:text-properties style:language-complex="th" style:country-complex="TH"/>
</style:style>
....snip.....
<text:p text:style-name="Standard">มิถ<text:span text:style-name="T1">
ุนายน</text:span></text:p>

B: after applying the experimental patch, there is no auto-style T1
<text:p text:style-name="Standard">มิถุนายน</text:p>

I'm wondering why, in the first case, the auto-style T1 that switches to locale th_TH is needed at all.
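
As a rough way to compare the two results, the runs of a paragraph can be listed with Python's standard xml.etree.ElementTree. This is a sketch only: paragraph_runs is a made-up helper, the fragments are reduced to a single text:p with the namespace declared inline, and nested spans are not handled.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
HEAD = "\u0e21\u0e34\u0e16"              # first three characters of the word
TAIL = "\u0e38\u0e19\u0e32\u0e22\u0e19"  # remainder, starting with the vowel mark U+0E38

def paragraph_runs(xml_fragment: str):
    """Return (style-name or None, text) for each run directly inside one text:p."""
    p = ET.fromstring(xml_fragment)
    runs = []
    if p.text:
        runs.append((None, p.text))
    for child in p:
        if child.text:
            runs.append((child.get(f"{{{TEXT_NS}}}style-name"), child.text))
        if child.tail:
            runs.append((None, child.tail))
    return runs

# Case A: before the experimental patch (span with auto-style T1).
before = (f'<text:p xmlns:text="{TEXT_NS}" text:style-name="Standard">{HEAD}'
          f'<text:span text:style-name="T1">{TAIL}</text:span></text:p>')
# Case B: after the experimental patch (no span).
after = f'<text:p xmlns:text="{TEXT_NS}" text:style-name="Standard">{HEAD}{TAIL}</text:p>'

print(len(paragraph_runs(before)), len(paragraph_runs(after)))  # 2 1
```

The character content is identical in both cases; only the number of runs differs.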

Comment 8 Samphan Raruenrom 2011-03-02 04:39:49 UTC

Created attachment 44017 [details]
The bugdoc, odt with overlapped autocomplete text

Comment 9 Samphan Raruenrom 2011-04-12 04:14:43 UTC

Created attachment 45511 [details]
Experimental patch to see what happens if we don't SetLanguage() when inserting a string from autocorrect

The result is that this bug is gone. We haven't seen any side effects yet. So I am wondering whether calling SetLanguage() before inserting a string from autocorrect is needed at all, and if so, why. The user is typing in the same language as the autocorrect string anyway. Why switch to the same language?

Comment 10 Caolán McNamara 2011-06-03 06:14:34 UTC

Because the autocorrect seems to collect only words that are typed, not
pasted in, what are the keystrokes to generate the full มิถุนายน string?

Comment 11 Caolán McNamara 2011-06-03 07:02:04 UTC

Okey dokey. Windows and Linux IMs work differently. The Windows one provides the language used, while the Linux ones don't. That's why I couldn't reproduce this.

With some hackery in place I can see this problem now.

Comment 12 Caolán McNamara 2011-06-03 08:48:38 UTC

So I think comment #9 is basically correct. I don't have Windows running here at the moment, so I'm going to speculate...

On a Thai system with a Thai IM (Input Method), no explicit language is forced onto text entered through the IM, because the defaults are considered to be OK. When the autocorrect is triggered, it asks for the current IM language, which is Thai, and forces it onto the accepted output. That gives two runs: a "standard" run, which is determined to be Thai because it has CTL chars in it and the doc default CTL language is Thai, and another run which is explicitly Thai. This triggers some other bugs down the line.

So, would the following patch work? It follows the same pattern as #9, but moves the chunk of code that decides "should the language be forced to be the same as the current Windows IM" during normal typing into a shared location, and reuses that logic for setting the language of the autocomplete text.
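
To make the two-run effect concrete, here is a toy model in Python. It is not the patch and not Writer code: Run, language_to_force and append_text are hypothetical names, and the rule "set no explicit language when the IM language equals the document default" is an assumption made only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional

HEAD = "\u0e21\u0e34\u0e16"              # typed part of the word
TAIL = "\u0e38\u0e19\u0e32\u0e22\u0e19"  # autocompleted remainder

@dataclass
class Run:
    text: str
    language: Optional[str]  # None means "inherit the document default"

def language_to_force(im_language: Optional[str], doc_default: str) -> Optional[str]:
    # Assumed rule: no explicit language if the IM reports none or the default.
    if im_language is None or im_language == doc_default:
        return None
    return im_language

def append_text(runs: List[Run], text: str, language: Optional[str]) -> List[Run]:
    # Text joins the last run only when the explicit language attribute matches.
    if runs and runs[-1].language == language:
        runs[-1].text += text
    else:
        runs.append(Run(text, language))
    return runs

# Behaviour described above: typing inherits the default, autocomplete forces "th".
old = append_text(append_text([], HEAD, None), TAIL, "th")

# Same decision shared by typing and autocomplete.
shared = language_to_force("th", "th")
new = append_text(append_text([], HEAD, shared), TAIL, shared)

print(len(old), len(new))  # 2 1
```

With one shared decision, the typed text and the autocompleted text carry the same attribute, so the vowel mark no longer ends up at the start of a separate run.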

Comment 13 Caolán McNamara 2011-06-03 08:49:16 UTC

Created attachment 47481 [details]
does this work for you ?

Comment 14 Samphan Raruenrom 2011-06-04 22:39:18 UTC

Yes. It's working fine here. We'll do more testing anyway.

Comment 15 Caolán McNamara 2011-06-07 05:18:07 UTC

Let's assume this is the correct approach, and open a new bug (against me) if there turn out to be more side effects.

Comment 16 Samphan Raruenrom 2011-09-30 02:09:21 UTC

I've seen the patch in master and tested it. It works fine. However, the Thai users still face the problem because the patch isn't in 3.4.x. Can someone pick the patch for the next releases?

Comment 17 Caolán McNamara 2011-09-30 12:08:33 UTC

Can you send a mail to the dev mailing list if you want to propose this for 3-4? [REVIEW] is the usual header. Mention the bug id and that you want to propose it for backporting/cherry-picking to the 3-4 series.