Bug 91662 - Incorrect word count for common symbols without spacing
Summary: Incorrect word count for common symbols without spacing
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.2.7.2 release (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Word-Count
Reported: 2015-05-27 10:00 UTC by mosteo
Modified: 2022-01-26 13:50 UTC (History)

See Also:
Crash report or crash signature:


Attachments
ten words that are counted as seven (9.61 KB, application/vnd.oasis.opendocument.text)
2022-01-26 13:50 UTC, Stéphane Guillou (stragu)

Description mosteo 2015-05-27 10:00:04 UTC
If someone writes without a space after punctuation, the word count is wrong for examples like these:

bla?bla?

Hello.Bye.

Each line is counted as one word, which is not how a person would count it. I found this problem when auditing texts for a short-story contest, where a wrong word count can lead to disqualification.

Other editors (Google Docs, AbiWord, gedit) report the count a human would expect. Second-hand reports tell me that Microsoft Word behaves like LibreOffice, but I don't have a copy to check.
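
To make the difference concrete, here is a minimal Python sketch of the two counting strategies. It is not LibreOffice's actual algorithm; the function names are hypothetical, and the whitespace-only rule is just an assumed model that happens to reproduce the counts reported above.

```python
import re


def count_whitespace_delimited(text: str) -> int:
    """Count words by splitting on whitespace only.

    Assumed model of the behaviour reported above: punctuation with
    no following space does not end a word.
    """
    return len(text.split())


def count_punctuation_aware(text: str) -> int:
    """Count runs of letters, digits, or underscores.

    Sentence punctuation separates words here even without a space.
    """
    return len(re.findall(r"\w+", text))


if __name__ == "__main__":
    for sample in ("bla?bla?", "Hello.Bye."):
        print(
            sample,
            count_whitespace_delimited(sample),
            count_punctuation_aware(sample),
        )
```

Under these assumptions both samples count as 1 word with the first function and 2 with the second. The second rule is deliberately naive: it would also split "don't" into two words, so a real fix would need to treat apostrophes, decimal points, and similar cases separately.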
Comment 1 Buovjaga 2015-06-10 14:27:20 UTC
(In reply to mosteo from comment #0)
> Other editors (google docs, abiword, gedit) report the (human) expectation.
> Second hand reports tell me that Microsoft Word behaves as Libre Office, I
> don't have one to check.

I guess this is a sound argument. Let's set to NEW and hope for the best. Changed to enhancement as well.
Comment 2 Stéphane Guillou (stragu) 2022-01-26 13:50:27 UTC
Created attachment 177807
ten words that are counted as seven

Still the case in:

Version: 7.3.0.2 / LibreOffice Community
Build ID: f1c9017ac60ecca268da7b1cf147b10e244b9b21
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded