Bug 71329 - No linebreak between Latin text and Ideographic punctuation
Summary: No linebreak between Latin text and Ideographic punctuation
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:24.8.0
Keywords:
Duplicates: 71331
Depends on:
Blocks: CJK Word-Line-Break
Reported: 2013-11-07 02:16 UTC by shenxiaomao
Modified: 2024-07-23 11:21 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Chinese and English (4.43 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-11-07 02:16 UTC, shenxiaomao
The result in old version of Libreoffice Writer (43.40 KB, image/png)
2015-06-22 11:47 UTC, shenxiaomao

Description shenxiaomao 2013-11-07 02:16:14 UTC
Created attachment 88802 [details]
Chinese and English

As the attachment shows, the layout is poor because Writer automatically begins a new line in the middle of the first line, leaving a long blank at the end of the first line. In MS Office this problem is solved with the correct settings.
I tried almost every version of LibreOffice and found the same result. The result is the same on Windows and Linux.
Comment 1 shenxiaomao 2013-11-07 02:23:38 UTC
Comment on attachment 88802 [details]
Chinese and English

The full content of the attachment is: "由于该厂区新建,部分厂区还没有建好,目前应用的信息化系统还较少,只有PDM、DNC,CAPP、ERP、MES系统暂时没有。从发展趋势来看,该厂会在近几年实施这些信息化系统,因为该厂的资金情况较好,且领导对信息化比较重视。 "
Comment 2 Urmas 2014-04-26 07:16:06 UTC
*** Bug 71331 has been marked as a duplicate of this bug. ***
Comment 3 Urmas 2014-04-26 07:16:41 UTC
Confirmed.
Comment 4 Kevin Suo 2014-04-26 08:36:05 UTC
Set to NEW as there is one duplicate.
Comment 5 QA Administrators 2015-06-08 14:41:21 UTC Comment hidden (obsolete)
Comment 6 shenxiaomao 2015-06-21 13:30:33 UTC
I tried LibreOffice 4.4.3.2 and found the same result, so the bug is not fixed there; it is still present in a currently supported version of LibreOffice.
I am going to test whether this bug is a REGRESSION and provide all the attachments.
Comment 7 shenxiaomao 2015-06-22 11:47:08 UTC
Created attachment 116726 [details]
The result in old version of Libreoffice Writer

The result in an old version of LibreOffice Writer. The result is the same in version 4.4.3.2. The test operating system is Windows 7 32-bit.
Comment 8 shenxiaomao 2015-07-26 11:18:21 UTC
There is no such bug in AbiWord. I hope the developers can refer to AbiWord's source code.
Comment 9 QA Administrators 2016-09-20 10:17:45 UTC Comment hidden (obsolete)
Comment 10 Mark Hung 2017-07-10 16:22:04 UTC
This is still an issue in 5.3 - Phrases separated by full-width comma or full-width dot are treated as one single word, hence it is put to the next line.
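
To illustrate the symptom, here is a minimal sketch of greedy line filling. It is not LibreOffice code. The `width` and `fill_lines` helpers, the column widths (2 for wide or full-width characters, 1 otherwise), and the 26-column line are all assumptions made for the example. The only point is that an unbreakable unit that does not fit is moved whole to the next line.

```python
import unicodedata

def width(s):
    # Assumption: wide/full-width characters take 2 columns, others take 1.
    return sum(2 if unicodedata.east_asian_width(c) in ("W", "F") else 1 for c in s)

def fill_lines(units, max_cols):
    # Greedy filling: a unit is never split. If it does not fit on the
    # current line, it starts a new one.
    lines, cur = [], ""
    for unit in units:
        if cur and width(cur) + width(unit) > max_cols:
            lines.append(cur)
            cur = unit
        else:
            cur += unit
    if cur:
        lines.append(cur)
    return lines

# \u3001 is the ideographic comma, \uff0c is the full-width comma.
# Current behaviour: the Latin abbreviations form one unbreakable unit.
one_word = ["只", "有", "PDM\u3001DNC\uff0cCAPP\u3001ERP\u3001MES", "系", "统"]
# Expected behaviour: a break is possible after each punctuation mark.
split = ["只", "有", "PDM\u3001", "DNC\uff0c", "CAPP\u3001", "ERP\u3001", "MES", "系", "统"]

for name, units in (("one word", one_word), ("split", split)):
    for line in fill_lines(units, 26):
        print(name, width(line), line)
```

With these assumed widths, the first line holds only 4 of 26 columns when the run is one unit, and 25 of 26 columns when breaks are allowed after the punctuation.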
Comment 11 Mark Hung 2017-12-09 06:09:07 UTC
(In reply to Mark Hung from comment #10)
> This is still an issue in 5.3 - Phrases separated by full-width comma or
> full-width dot are treated as one single word, hence it is put to the next
> line.

Correction:

In [1] we detect the language of the last portion to determine the locale for the break iterator. The document under test has "en_US" there and the Unicode break iterator found the incorrect word boundary.

There are a few issues:
1. The heuristic rule is wrong in this case.
2. Unicode break iterator didn't break before ideographic punctuation.
3. The word breaking algorithm in UAX29[2] should work for us. Why do we need break iterators for three scripts?

[1] https://cgit.freedesktop.org/libreoffice/core/tree/sw/source/core/text/guess.cxx#n355
[2] http://unicode.org/reports/tr29/#WB5
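
A rough sketch of the missing break opportunity, for discussion only. This is neither the code in guess.cxx nor a full UAX #14 or UAX #29 implementation. The names `break_allowed` and `units`, the small punctuation set, and the use of East Asian width to recognise ideographs are all assumptions made for the example. With `break_after_punct=False` the Latin run joined by ideographic punctuation stays in one piece, as seen in the test document; with `True` a break is allowed after each punctuation mark.

```python
import unicodedata

# Assumption: only these three marks are considered here.
IDEOGRAPHIC_PUNCT = {"\u3001", "\u3002", "\uff0c"}  # ideographic comma, ideographic full stop, full-width comma

def is_ideograph(ch):
    # Assumption: any wide/full-width character that is not in the
    # punctuation set above is treated as an ideograph.
    return (unicodedata.east_asian_width(ch) in ("W", "F")
            and ch not in IDEOGRAPHIC_PUNCT)

def break_allowed(prev, cur, break_after_punct):
    if cur in IDEOGRAPHIC_PUNCT:
        return False  # closing punctuation must not start a line
    if prev.isspace() and not cur.isspace():
        return True   # ordinary break after a space
    if is_ideograph(prev) or is_ideograph(cur):
        return True   # break next to an ideograph
    # The case from this bug: punctuation followed by a Latin letter.
    return break_after_punct and prev in IDEOGRAPHIC_PUNCT

def units(text, break_after_punct):
    # Split text into pieces that a line breaker may not split further.
    if not text:
        return []
    out = [text[0]]
    for prev, cur in zip(text, text[1:]):
        if break_allowed(prev, cur, break_after_punct):
            out.append(cur)
        else:
            out[-1] += cur
    return out

sample = "只有PDM\u3001DNC\uff0cCAPP\u3001ERP\u3001MES系统"
print(units(sample, break_after_punct=False))
print(units(sample, break_after_punct=True))
```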
Comment 12 QA Administrators 2018-12-10 03:40:24 UTC Comment hidden (obsolete)
Comment 13 QA Administrators 2020-12-10 03:47:50 UTC Comment hidden (obsolete)
Comment 14 Ming Hua 2020-12-10 04:13:11 UTC
Still reproducible in 7.1.0 Beta1:
Version: 7.1.0.0.beta1 (x64)
Build ID: 828a45a14a0b954e0e539f5a9a10ca31c81d8f53
CPU threads: 2; OS: Windows 10.0 Build 18363; UI render: default; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Calc: threaded
Comment 15 QA Administrators 2022-12-11 03:20:59 UTC Comment hidden (obsolete)
Comment 16 shenxiaomao 2022-12-22 13:14:04 UTC
I tried LibreOffice 7.3.7.2 and found the same result, so the bug is not fixed in LibreOffice 7.3.7.2.
Comment 17 Jonathan Clark 2024-07-23 11:14:41 UTC
This bug was fixed in the following commit:

https://git.libreoffice.org/core/commit/44699b3de37f07090ac6fee1cd97aa76036e9700

tdf#49885 BreakIterator rule upgrades
Comment 18 Xisco Faulí 2024-07-23 11:21:39 UTC
(In reply to Jonathan Clark from comment #17)
> This bug was fixed in the following commit:
> 
> https://git.libreoffice.org/core/commit/44699b3de37f07090ac6fee1cd97aa76036e9700
> 
> tdf#49885 BreakIterator rule upgrades

Hi Jonathan,
Do you think it needs a unit test, or is it already covered?