Bug 45271 - The way Writer counting Chinese words is not correct
Summary: The way Writer counting Chinese words is not correct
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.5.0 RC2 (earliest affected)
Hardware: Other All
Importance: highest major
Assignee: Caolán McNamara
Whiteboard: BSA target:3.6.0
Depends on:
Reported: 2012-01-26 07:12 UTC by TaiJan Huang
Modified: 2012-04-05 06:07 UTC
CC List: 4 users

See Also:
Crash report or crash signature:

helpful demo of what's apparently expected behaviour (27.48 KB, application/vnd.sun.xml.writer)
2012-03-28 14:00 UTC, Caolán McNamara

Description TaiJan Huang 2012-01-26 07:12:03 UTC
Problem description: 

The way the word count function in LibreOffice counts Chinese words is not suitable (and is even useless) for Chinese speakers.

In Chinese, we count "characters" instead of "words." For example, 蘋果 (i.e. apple) is a "word" with two characters. In fact, what Chinese speakers mean by "word" when counting corresponds to what English calls a "character."

Therefore, the word count for the following text should be three:
" 蘋果  apple"

But the current word count shows two words, because it counts 蘋果 as one word when it should be counted as two in Chinese.
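
To make the expected behaviour concrete, here is a minimal Python sketch of the counting rule described above. It is only an illustration, not the code Writer uses: the function name count_words and the choice of Unicode ranges (the main CJK Unified Ideographs block, Extension A, and the Compatibility Ideographs block) are assumptions, and punctuation is handled naively as part of non-CJK runs.

```python
import re

# Assumption: these three ranges are treated as "Chinese characters".
# A real implementation would likely cover more blocks and scripts.
_CJK = r"\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff"

# A token is either one CJK character, or a run of characters that are
# neither whitespace nor CJK (for example a Latin word).
_TOKEN = re.compile(rf"[{_CJK}]|[^\s{_CJK}]+")


def count_words(text: str) -> int:
    """Count each CJK character as one word and each other run as one word."""
    return len(_TOKEN.findall(text))


if __name__ == "__main__":
    # The example from this report: two Chinese characters plus "apple".
    print(count_words(" 蘋果  apple"))
```

With this rule, the example text is counted as three words rather than two.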

This is a longstanding bug for Chinese speakers.


Browser: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0
Comment 1 sasha.libreoffice 2012-03-15 07:37:19 UTC
@ TaiJan Huang
Thanks for the bug report. It is reproducible in 3.5.1.

@ Andras
Please take a look at this bug when you have time. The word count here is wrong for Chinese. What do you think about this?
Comment 2 Caolán McNamara 2012-03-28 14:00:49 UTC
Created attachment 59176
helpful demo of what's apparently expected behaviour
Comment 3 Not Assigned 2012-04-05 06:04:35 UTC
Caolan McNamara committed a patch related to this issue.
It has been pushed to "master":


Resolves: fdo#45271, i#17964 count CJK words the way that's expected by users