Bug 126629 - Writer counts dashes (soft hyphen, hyphen, and others) as words when en-dash and em-dash are ignored
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: x86-64 (AMD64) All
Importance: medium trivial
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Formatting-Mark Word-Count
Reported: 2019-07-30 17:42 UTC by steve.sottong
Modified: 2025-06-26 03:11 UTC

See Also:
Crash report or crash signature:


Attachments
Shows example of a dash that is not counted as a word and one that is. (8.11 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2019-07-30 17:43 UTC, steve.sottong

Description steve.sottong 2019-07-30 17:42:01 UTC
Description:
While checking the word count of a long document, I found that Writer's count was always 10 words higher. I eventually traced the difference to Writer counting some dashes as words. Neither MS Word nor SoftMaker TextMaker counts these as words. I can provide a document that demonstrates the difference, but the issue does not reproduce in an online form.

Steps to Reproduce:
1. Not sure how the dashes that are counted were created.

Actual Results:
Some dashes are counted as words.

Expected Results:
The count should have ignored the dashes.


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 steve.sottong 2019-07-30 17:43:41 UTC
Created attachment 153059
Shows example of a dash that is not counted as a word and one that is.
Comment 2 V Stuart Foote 2019-07-30 20:50:53 UTC
In OOXML the run is "<w:t xml:space="preserve">Earth </w:t><w:softHyphen/><w:t>– not</w:t></w:r>" 

On filter import, Writer turns this into the text run U+0020 U+00AD U+2013 U+0020.

So it seems that the filter-assigned U+00AD (SOFT HYPHEN), combined with the U+2013 (EN DASH) and bounded by spaces, is treated as a word by the edit engine, which increases the word count.
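
The run described in this comment can be inspected with a short Python sketch. This only illustrates the code points involved; it is not Writer's word-break logic, and the helper name describe is made up for this example. The whitespace split at the end is a naive stand-in that shows why the soft hyphen plus en dash can look like a one-token "word".

```python
import unicodedata

# Text run reported in this comment: space, soft hyphen, en dash, space.
run = "\u0020\u00ad\u2013\u0020"

def describe(text):
    """Return (code point, Unicode name, general category) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch))
        for ch in text
    ]

for row in describe(run):
    print(*row)

# Naive whitespace splitting (an assumption, NOT Writer's algorithm):
# the soft hyphen is not whitespace, so it stays glued to the en dash
# and the pair survives as a single token between the two spaces.
tokens = run.split()
print(len(tokens))  # 1
```

The soft hyphen has general category Cf (format) and the en dash Pd (dash punctuation), so a counter that skips a lone en dash may still see this two-character token as something else.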
Comment 3 QA Administrators 2021-08-07 03:40:06 UTC Comment hidden (obsolete)
Comment 4 Diana Vides 2023-05-25 01:55:11 UTC
I first reproduced this bug in version 6.4.7.2. A short dash is counted as a word, but a long (autocorrected) dash is not.
Steps to Reproduce:
1. Type a dash, add a space, type a word, and press Enter.
2. Type a word, add a space, type a dash, type a word, and add a space.


Actual Results:
The short dash in Step 1 is counted as a word, and the long (autocorrected) dash in Step 2 is not.

Expected Results:
Both the short dash and the long dash should be either counted or ignored, depending on the specification. The user guide is ambiguous.
https://help.libreoffice.org/7.2/en-US/text/swriter/guide/words_count.html?&DbPAR=WRITER&System=WIN


Version: 6.4.7.2 (x64)
Build ID: 639b8ac485750d5696d7590a72ef1b496725cfb5
CPU threads: 6; OS: Windows 10.0 Build 19045; UI render: default; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: CL

I also reproduced it in version 7.5.2.2; it is still present.

Version: 7.5.2.2 (X86_64) / LibreOffice Community
Build ID: 53bb9681a964705cf672590721dbc85eb4d0c3a2
CPU threads: 6; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded

I also reproduced it in the master version 7.6.0.0; it is still present.

Version: 7.6.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: f4c24da1e7f11664e0d2f688d2531f068e4a3bc0
CPU threads: 6; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: CL threaded
Comment 5 Stéphane Guillou (stragu) 2023-06-26 16:57:54 UTC
I checked in OOo 3.3: this was already the case for a simple hyphen and a soft hyphen surrounded by spaces (although the en-dash was also counted back then).

A related issue about the documentation is bug 62799.

Testing in 24.2 alpha0+:

Not counted

En – dash: not counted (U+2013)
Em — dash: not counted (U+2014)

Counted

Horizontal ― bar: counted (U+2015)
Figure ‒ dash: counted (U+2012)
Hyphen - minus: counted (U+002D)
Minus − sign: counted (U+2212)
Hyphen ‐ hyphen: counted (U+2010)
Soft ­ hyphen: counted (U+00AD)

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 9fc0b2b9b96d87eb642a3b29e9dcb5d6273265eb
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded
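
The test list in this comment can be cross-checked against Unicode general categories with a small Python sketch. It is only an illustration: the REPORTED table restates the results above, and is_dash_only and count_words are hypothetical helpers showing one possible "ignore all standalone dashes" policy, not Writer's actual word-break rules.

```python
import unicodedata

# Characters from the test list above; True means the lone character
# was counted as a word in the 24.2 alpha0+ test reported here.
REPORTED = {
    "\u2013": False,  # EN DASH
    "\u2014": False,  # EM DASH
    "\u2015": True,   # HORIZONTAL BAR
    "\u2012": True,   # FIGURE DASH
    "\u002d": True,   # HYPHEN-MINUS
    "\u2212": True,   # MINUS SIGN
    "\u2010": True,   # HYPHEN
    "\u00ad": True,   # SOFT HYPHEN
}

def is_dash_only(token):
    """True if every character is dash punctuation (Pd), MINUS SIGN, or SOFT HYPHEN."""
    return all(
        unicodedata.category(ch) == "Pd" or ch in "\u2212\u00ad"
        for ch in token
    )

def count_words(text):
    """Count whitespace-separated tokens, skipping dash-only tokens."""
    return sum(1 for tok in text.split() if not is_dash_only(tok))

for ch, counted in REPORTED.items():
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} counted={counted}")

print(count_words("Hyphen - minus"))  # 2
```

The category column shows that the en dash and em dash share category Pd with several characters that are counted, so general category alone does not explain the split seen in the test. Under the sketched policy, every "word dash word" line in the list yields 2, while a hyphenated word such as "well-known" still counts once.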
Comment 6 QA Administrators 2025-06-26 03:11:29 UTC
Dear steve.sottong,

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install the oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from https://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://web.libera.chat/?settings=#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug