164006 – Hyphenator service: createPossibleHyphens may create extra zero elements in getHyphenationPositions

Status:	RESOLVED FIXED

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	sdk
Version: (earliest affected)	unspecified
Hardware:	All All

Importance:	medium normal
Assignee:	Mike Kaganski

URL:
Whiteboard:	target:25.2.0 target:24.8.4
Keywords:

Depends on:
Blocks:

Reported:	2024-11-23 04:52 UTC by Mike Kaganski
Modified:	2024-11-25 11:36 UTC
CC List:	0 users

See Also:
Crash report or crash signature:

Description Mike Kaganski 2024-11-23 04:52:26 UTC

Consider this Basic code:

sub testHyphenator
 ' Requires the bundled Russian hyphenation dictionary
 hyphenator = createUnoService("com.sun.star.linguistic2.Hyphenator")
 locale = new com.sun.star.lang.Locale
 locale.Language = "ru"
 locale.Country = "RU"
 if not hyphenator.hasLocale(locale) then
  MsgBox("Russian hyphenator must be installed")
  exit sub
 end if
 result = hyphenator.createPossibleHyphens("Переносимое", locale, array())
 ' Expected "3, 5, 7"; affected builds show "3, 5, 7, 0"
 MsgBox Join (result.getHyphenationPositions(), ", ")
end sub

Make sure that the bundled Russian dictionaries (including hyphenator) are installed. Run the code.

It shows "3, 5, 7, 0"; the fourth element is unexpected. The result must be "3, 5, 7", in accordance with the documentation [1], which states:

> sequence<short> getHyphenationPositions( )
> Returns
>    an ascending sequence of numbers where each number is an offset within the
>    original word which denotes a hyphenation position ...

[1] https://api.libreoffice.org/docs/idl/ref/interfacecom_1_1sun_1_1star_1_1linguistic2_1_1XPossibleHyphens.html#a8f81658ad635eb427dbd6182b792866a
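For scripts that must run on builds without the fix, one possible workaround is to sanitize the returned sequence on the caller's side. The following is a minimal sketch in plain Python, not part of the LibreOffice API: the helper name `usable_hyphen_positions` is hypothetical, and it assumes only what the documentation quoted above states, namely that valid positions are ascending offsets within the original word.

```python
from typing import Iterable, List


def usable_hyphen_positions(positions: Iterable[int], word_length: int) -> List[int]:
    """Keep the strictly ascending prefix of offsets that lie inside the word.

    Hypothetical helper: it does not call the Hyphenator service, it only
    post-processes a sequence such as the one returned by
    getHyphenationPositions().
    """
    cleaned: List[int] = []
    previous = -1
    for pos in positions:
        # Stop at the first offset that breaks the documented ascending
        # order or points outside the word; a trailing 0 ends the loop here.
        if pos <= previous or pos >= word_length:
            break
        cleaned.append(pos)
        previous = pos
    return cleaned


if __name__ == "__main__":
    word = "Переносимое"
    # Sequence reported in this bug for affected builds.
    print(usable_hyphen_positions([3, 5, 7, 0], len(word)))  # prints [3, 5, 7]
```

On a fixed build the helper leaves the sequence unchanged, so it should be safe to keep in place across versions.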

Comment 1 Mike Kaganski 2024-11-23 06:17:56 UTC

https://gerrit.libreoffice.org/c/core/+/177077

Comment 2 Commit Notification 2024-11-23 09:03:57 UTC

Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/9c14ec81b6c25c7932964382f306dadfefeda518

tdf#164006: Only use original word's positions, ignore extra encoded length
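The commit title refers to an "encoded length" that is larger than the original word. The snippet below is only an illustration of how those two lengths can differ for the test word, assuming a UTF-8 encoding (the encoding actually used by the dictionary is an assumption here); it is not the patch itself.

```python
word = "Переносимое"

# Number of characters in the original word.
char_count = len(word)

# Number of bytes once the word is encoded as UTF-8 (assumed encoding);
# each Cyrillic letter in this word takes two bytes.
utf8_byte_count = len(word.encode("utf-8"))

print(char_count, utf8_byte_count)  # prints 11 22
```

Any loop bounded by the encoded length rather than by the word's own length would therefore visit more slots than the word has positions.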

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.

Comment 3 Commit Notification 2024-11-25 11:36:27 UTC

Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-24-8":

https://git.libreoffice.org/core/commit/d2905f1a72cc77c96bbcde26022fe39ba96b7a6c

tdf#164006: Only use original word's positions, ignore extra encoded length

It will be available in 24.8.4.
