Bug 96197 - Korean text to Wrap in the middle of a word
Summary: Korean text to Wrap in the middle of a word
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: Inherited From OOo (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Mark Hung
URL:
Whiteboard: target:6.0.0
Keywords:
Depends on:
Blocks: CJK
 
Reported: 2015-12-02 06:54 UTC by mailhyoon
Modified: 2017-12-09 01:22 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
MS Office 2010 Word "text to wrap in the middle of a word" option (34.60 KB, image/png)
2015-12-02 06:54 UTC, mailhyoon

Description mailhyoon 2015-12-02 06:54:25 UTC
Created attachment 120946
MS Office 2010 Word "text to wrap in the middle of a word" option

While typing in Korean, I found strange behavior that must be an error.
When I type a long word near the end of a line, the word just wraps in the middle.

I have attached an MS Office 2010 Word example as a picture attachment.
On the "Asian Typography" tab under Format Paragraph,
checking "Allow Latin text to wrap in the middle of a word" (red arrow on the left)
allows a word to break in the middle (red arrows on the right). The more natural, expected result is shown in the lower picture, where the word starts on a new line and leaves space at the end of the previous line.

If this were English, the line above would look like:
----------------
Allows a word to break in the middle (Red arrows on Right), but obv
iously more natural expected result would
---------------
This is NOT natural. I have spent several hours looking through formatting, options, autocorrect, customization, etc., plus the obvious web searches, but failed to find a solution.

Possible solutions:
1) Do not let a word break in the middle when it is near the end of a line.
2) Better yet: let the user choose between the two formats, as in MS Word or Hancom Word.

Thank you very much,
Comment 1 mailhyoon 2015-12-02 15:19:01 UTC
I attempted a little further investigation.
In a new blank document I pasted the following from the web:
애국가의 가사는 1900년대초에 쓰였으며, 작시자는 공식적으로는 미상이라고 적혀있다. 작사자에 대한 설은 크게 윤치호라는 설과 안창호라는 설 두 가지가 있다. 작사자 윤치호 설은 윤치호가 애국가의 가사를 1907년에 써서 후에 그 자신의 이름으로 출판했다는 것이다. 한편 안창호가 썼다는 주장은 안창호가 애국가를 보급하는 데에 앞장섰다는 데에 중점을 두고 있다. 1908년에 출판된 가사집 《찬미가》에 수록된 것을 비롯한 많은 일제 강점기의 애국가 출판물은 윤치호를 작사자로 돌리고 있는 등 윤치호 설에는 증거가 많은 반면[3] 안창호 설에는 실증적인 자료가 부족하다.

Under certain conditions, Writer split the lines correctly (without breaking words) when I checked
Format / Paragraph / Asian Typography [x] Apply spacing between Asian, Latin and complex text.
But in many circumstances it did not. I am not sure what makes this work at certain times. I cleared all formatting from the text and changed the margins to see whether the behavior follows a pattern, but mostly it does not work correctly.
Comment 2 Buovjaga 2015-12-04 19:53:10 UTC
Can you get someone from the Korean community to confirm this? https://wiki.documentfoundation.org/Local_Mailing_Lists#Korean

Thanks.
Comment 3 Jung-Kyu Park 2015-12-05 20:02:26 UTC
Hello,
I am affected by the same issue,
so I can confirm it.
Thanks.
Comment 4 Gwangyeon 2015-12-12 01:53:43 UTC
I tried to get this to work but failed, so I can confirm this as an issue.
The problem occurred whenever I tried to prevent words from being cut.
Comment 5 Mark Hung 2016-10-21 11:10:29 UTC
Hi,

I do not understand how to reproduce this issue and the expected behavior. Is this a request for feature enhancement, or a statement about unexpected behavior?

Please provide steps to reproduce the issue, with a sample document attached if possible, plus the expected result and the result actually seen. Note that most people here don't read Korean and don't know the justification rules in your culture, so please try to point out the difference between the expected and actual results so that others can help.
Comment 6 Hiunn-hué 2017-03-16 20:06:05 UTC
I am not a Korean user, but if I understand correctly, in MS Office Word a Korean word/phrase (delimited by spaces) is NOT divided into two parts at a line break by default. If users want lines to break in the middle of a word, they can enable that in the paragraph settings dialog (see the attached screenshot from mailhyoon).

Further description: https://www.w3.org/TR/klreq/#line-break  (4.4.2 Line Breaking Rules)


It seems that LibreOffice doesn't provide this feature yet. When a long Korean word reaches the end of a line, it is always split. Users should be allowed to choose the line-breaking mode.


► Steps to test:

1. Copy and paste the following Korean text to Writer:
> 애국가의 가사는 1900년대초에 쓰였으며, 작시자는 공식적으로는 미상이라고 적혀있다.

2. Change the font size so that one word/phrase falls at the end of a line.

3. Let's take "미상이라고" as an example.


► Expected Behavior:

"미상이라고" should not split up, and therefore look like ...

> 애국가의 가사는 1900년대초에 쓰였으며, 작시자는 공식적으로는
> 미상이라고 적혀있다.


► Current Behavior:

"미상이라고" splits up, and therefore looks like ...

> 애국가의 가사는 1900년대초에 쓰였으며, 작시자는 공식적으로는 미상이
> 라고 적혀있다.
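
The two behaviors above can be sketched in a few lines of Python. This is only a toy model, not LibreOffice code: it assumes every character is one cell wide, that a line holds 39 characters, and that words are delimited by single spaces. The names wrap_anywhere, wrap_keep_words, SAMPLE and WIDTH are made up for this illustration.

```python
SAMPLE = "애국가의 가사는 1900년대초에 쓰였으며, 작시자는 공식적으로는 미상이라고 적혀있다."
WIDTH = 39  # assumed number of characters that fit on one line


def wrap_anywhere(text: str, width: int) -> list[str]:
    """Break after every `width` characters, even inside a word."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def wrap_keep_words(text: str, width: int) -> list[str]:
    """Break only at spaces, so a space-delimited word stays in one piece.

    A single word longer than `width` is left on its own overlong line.
    """
    lines: list[str] = []
    current = ""
    for word in text.split(" "):
        candidate = word if not current else current + " " + word
        if len(candidate) <= width or not current:
            current = candidate      # the word still fits on this line
        else:
            lines.append(current)    # close the line before the word
            current = word           # and start the next line with it
    if current:
        lines.append(current)
    return lines


for line in wrap_anywhere(SAMPLE, WIDTH):
    print("anywhere:  ", line)
for line in wrap_keep_words(SAMPLE, WIDTH):
    print("keep words:", line)
```

With these values, wrap_anywhere ends the first line with "미상이" (the current behavior), while wrap_keep_words moves "미상이라고" whole to the second line (the expected behavior).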
Comment 7 Buovjaga 2017-03-16 20:14:30 UTC
Mark: see above.
Comment 8 Xisco Faulí 2017-10-01 23:10:08 UTC
Commit in master: https://gerrit.libreoffice.org/#/c/42987/
Comment 9 Commit Notification 2017-10-04 10:55:01 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=441fded7f7fc8a2564075406933226a6eea73dd1

tdf#96197 do not break Korean words in the middle.

It will be available in 6.0.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
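
For readers who want a feel for the rule named in the commit title, here is a minimal Python sketch of a check that refuses break positions between two Hangul characters. It is not the code from the patch: the names HANGUL_RANGES, is_hangul and splits_hangul_word are hypothetical, and the list of Unicode blocks is assumed to be enough for illustration (the extended Jamo blocks are left out).

```python
# Unicode blocks holding Hangul; not exhaustive, chosen for this sketch.
HANGUL_RANGES = (
    (0x1100, 0x11FF),  # Hangul Jamo
    (0x3130, 0x318F),  # Hangul Compatibility Jamo
    (0xAC00, 0xD7A3),  # Hangul Syllables
)


def is_hangul(ch: str) -> bool:
    """Return True if the single character `ch` lies in one of the blocks."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in HANGUL_RANGES)


def splits_hangul_word(text: str, pos: int) -> bool:
    """Return True if a break just before text[pos] separates two Hangul characters."""
    if pos <= 0 or pos >= len(text):
        return False  # start or end of the string: nothing to split
    return is_hangul(text[pos - 1]) and is_hangul(text[pos])


text = "공식적으로는 미상이라고"
print(splits_hangul_word(text, 10))  # position between "이" and "라"
print(splits_hangul_word(text, 7))   # position right after the space
```

A line breaker built on this idea would skip every position where splits_hangul_word returns True and fall back to the nearest earlier position, typically the one after a space.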
Comment 10 Commit Notification 2017-10-04 11:09:14 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=9a64a82a50c66c720fca79a2aaafff7d2cb6db58

Change define to inline and donate some spaces, tdf#96197 follow-up

It will be available in 6.0.0.
Comment 11 Commit Notification 2017-10-16 11:30:50 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=0c44f702a04db0fffd6884dcb014b28cdff5b21c

tdf#96197 i18npool: don't read beyond end of string

It will be available in 6.0.0.
Comment 12 Xisco Faulí 2017-11-16 09:32:38 UTC
A polite ping to Mark Hung: is this bug fixed? If so, could you
please close it as RESOLVED FIXED? Thanks.
Comment 15 mailhyoon 2017-11-16 12:57:58 UTC
Version: 5.4.3.2
Build ID: 1:5.4.3~rc2-0ubuntu0.16.04.1~lo1
CPU threads: 4; OS: Linux 4.4; UI render: default; VCL: gtk2; 
Locale: en-US (en_US.UTF-8); Calc: group

Does not appear to be fixed.
Comment 16 Xisco Faulí 2017-11-16 12:58:24 UTC
You need to test it with a daily build from http://dev-builds.libreoffice.org/daily/master/