Bug 155839 - When prompted about missing hyphenation data - I should be offered a download/package install link
Status: RESOLVED DUPLICATE of bug 136084
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.6.0.0 alpha1+
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Languages Hyphenation
Reported: 2023-06-14 19:12 UTC by Eyal Rozenberg
Modified: 2023-11-13 17:42 UTC (History)

See Also:
Crash report or crash signature:


Description Eyal Rozenberg 2023-06-14 19:12:04 UTC
I just reopened my LibreOffice nightly build after... well, actually, I'm not sure exactly after what, but something triggered a yellow notification bar when I opened a document (He+En+Fa content); and the bar said that Hyphenation patterns for HE-IL were missing.

Ok, I might have expected to be offered this earlier, but - fine, I'm game, let's download those. So, I click the button - but it's not a download link; nor is it a button to a download page. It's a link to a TDF Wiki page - a wiki page for _all_ languages at once, and with a large number of links, some in a table, some not in a table; plus a humongous navigation bar on top of the page.

This is already a bug.

Now, it's doubly a bug, because no hyphenation patterns are even _offered_ for Hebrew, so LO is just leading the user on a wild goose chase for now.
Comment 1 Mike Kaganski 2023-06-14 20:14:56 UTC
No.

LibreOffice only tells you the fact.
It has no way to know that *any* language's hyphenation data is available from *any* source (third party extension? maybe on OOo/AOO extension site?); or maybe in another product (MS Word that can generate ODT?).

It knows that there is a text marked with this language (it may be even not known to LibreOffice, and its name is obtained from the library used internally, given the language tag that some user entered manually), and this text also has the "auto-hyphenation" property. And it can't do that. And tells. Nothing wrong at all.

And - well, unless you require that all language modules are installed unconditionally (which I personally would welcome, but many would hate), there's no way to even point to downloads, because it may be not downloadable, but installable - from the MSI on Windows; from packages on Debian...

NOTABUG.
Comment 2 Eyal Rozenberg 2023-06-14 21:43:24 UTC
(In reply to Mike Kaganski from comment #1)
> LibreOffice only tells you the fact.

Notification bars are not about telling me facts; they're about drawing my attention to a problematic situation which I may/should want to address.

> It has no way to know that *any* language's hyphenation data is available
> from *any* source (third party extension? maybe on OOo/AOO extension site?);
> or maybe in another product (MS Word that can generate ODT?).

Sure it has. If LO can know when an update is available, it can know whether hyphenation patterns are available.

But even if you were to argue it shouldn't actively go look for the hyphenation patterns for the locale(s) you use - it should still do so if you ask it to, i.e. it will tell you you're missing patterns, and when you click, it will either get them for you, or tell you they're missing. Not as great, but better than what we have now.

> And it can't do that. And tells. Nothing wrong at all.

If this were an _error_ message, that would be a different matter. But it isn't. It leads the user to believe that they can get those hyphenation patterns.

> And - well, unless you require that all language modules are installed
> unconditionally (which I personally would welcome, but many would hate),
> there's no way to even point to downloads, because it may be not
> downloadable, but installable - from the MSI on Windows; from packages on
> Debian...

I'll again make the analogy regarding LO updates. Yes, this may be difficult/impossible/distribution-specific. In those cases, we should offer an easy customization point for distributors, plus, we should make some effort to help the user get what they need. That is, depending on how LO was installed and/or what we know about the OS/DE, we should point the user to the closest thing possible to a download link. If it's one of our packages - then it would be to a corresponding package of the same kind for hyphenation patterns; if not - the packagers will change this to something which suits them.

Anyway, let's see what the UX design, umm, ad-hoc per-meeting committee thinks about this.
Comment 3 Mike Kaganski 2023-06-15 06:32:53 UTC
(In reply to Eyal Rozenberg from comment #2)
> (In reply to Mike Kaganski from comment #1)
> > LibreOffice only tells you the fact.
> 
> Notification bars are not about telling me facts; they're about drawing my
> attention to a problematic situation which I may/should want to address.

Let me skip the obvious that the two things are not mutually exclusive; and that telling facts is necessary part of drawing attention to anything.

But here you definitely have *something* that your attention needs to be drawn to. Again: the *document* demands that *a piece of it* be auto-hyphenated using rules of a specific language. LibreOffice may happen to know these rules, or if not, it may draw your attention to the fact that it simply can't layout the text as the document specifies (so you likely see the document not the way its author intended) - and this is a problem that LibreOffice can't handle itself; but user *might* know a way. Or not. One of the things users can do is asking the author. Another is googling for the hyphenator. Or volunteering and creating one. Lots of options.

> 
> > It has no way to know that *any* language's hyphenation data is available
> > from *any* source (third party extension? maybe on OOo/AOO extension site?);
> > or maybe in another product (MS Word that can generate ODT?).
> 
> Sure it has. If LO can know when an update is available, it can know whether
> hyphenation patterns are available.

No. This is where one person is creating a new Hunspell dictionary (for Ossetian): https://forumooo.ru/index.php?topic=9244.0, https://forumooo.ru/index.php?topic=9771.0. It is not a hyphenation dictionary, but let us look forward, to the moment they start implementing a hyphenator, too. They have it locally, in some WIP form. They (and anyone) can create documents specifying this language for (portions of) text. How could LibreOffice on your system know where to obtain that custom dictionary? There might be some *narrow* set of cases where LibreOffice could know that *its own* distribution includes these - but again: it needs to know how to install these: on Windows, it would need admin to initiate modification of existing installation; on Linux, each distro would have own command and own package name for those... But *in general*, the problem of knowing how to obtain a random fo-BAR language that user tagged their hyphenated text with is unsolvable.

> But even if you were to argue it shouldn't actively go look for the
> hyphenation patterns for the locale(s) you use - it should still do so if
> you ask it to, i.e. it will tell you you're missing patterns, and when you
> click, it will either get them for you, or tell you they're missing. Not as
> great, but better than what we have now.

It should do *what exactly*? Go googling? Please wait and think about what I try to explain.

> > And it can't do that. And tells. Nothing wrong at all.
> 
> If this were an _error_ message, that would be a different matter. But it
> isn't. It leads the user to believe that they can get those hyphenation
> patterns.

It is a warning that the document looks differently. It is not an error (like "the file is corrupt"), but still something needing to "draw user's attention to".

But improvements at least in "built-in" cases would indeed be welcome. However, in case of Hebrew, which you put into the description (comment 0), where you tell LibreOffice has nothing in itself, the point stands, that it can do nothing.
Comment 4 Mike Kaganski 2023-06-15 06:38:19 UTC
And indeed, wording improvements, and - in case of built-in subset - clear instructions how to add missing packages - are really welcome.
Comment 5 Mike Kaganski 2023-06-15 07:18:47 UTC
cloph, rene: do you know if it's possible to tweak our packaging process such as to allow the main package to know which language packages (and their parts - like, if they contain hyphenation) are built together? Or maybe we can allow to set up a URL to learn it?
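
To make the question concrete, here is one way such build-time knowledge could be pictured: a small manifest shipped with the main package. This is only a sketch under loud assumptions. The manifest layout, the package names and the function `lookup_hyphenation` are all hypothetical and do not describe LibreOffice's actual packaging.

```python
from typing import Dict, NamedTuple


class LangPack(NamedTuple):
    package: str           # hypothetical package name, for illustration only
    has_hyphenation: bool  # whether the pack ships hyphenation data


# Hypothetical manifest that a build could generate alongside the main package.
MANIFEST: Dict[str, LangPack] = {
    "ru-RU": LangPack("dict-ru", True),
    "he-IL": LangPack("dict-he", False),
}


def lookup_hyphenation(locale: str, manifest: Dict[str, LangPack] = MANIFEST) -> str:
    """Classify a locale as 'bundled', 'no-data' or 'unknown'."""
    pack = manifest.get(locale)
    if pack is None:
        # An arbitrary tag such as fo-BAR: the distribution knows nothing about it.
        return "unknown"
    return "bundled" if pack.has_hyphenation else "no-data"
```

With something like this, the notification could offer installation instructions only in the "bundled" case and fall back to a plain warning otherwise.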
Comment 6 Heiko Tietze 2023-06-16 08:51:28 UTC
Rather than forwarding to a hopefully up-to-date wiki we could show the list of extensions directly (same as tools > options > language > writing aids: Get more dictionaries online...).

It's a different bug that Hebrew is not listed here but available at https://extensions.libreoffice.org/en/extensions/show/hebrew-he-spell-check-dictionary
Comment 7 Heiko Tietze 2023-06-16 08:56:15 UTC
Btw, work on this infobar was done for bug 128191 and bug 131233, with a similar idea in c16.
Comment 8 Mike Kaganski 2023-06-16 09:42:45 UTC
(In reply to Heiko Tietze from comment #6)
> same as tools > options > language > writing aids: Get more dictionaries online...

(1) (unrelated) somehow, this (and same link in EditModules) does not work for me in Version: 7.5.4.2 (X86_64) / LibreOffice Community
Build ID: 36ccfdc35048b057fd9854c757a8b67ec53977b6
CPU threads: 12; OS: Windows 10.0 Build 19045; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: en-US
Calc: CL threaded

(2) I generally dislike, that the only thing in *UI* points to the *extensions* site ( https://extensions.libreoffice.org/dictionaries/ ), where there are lots of *unmaintained* third-party extensions, while for *most* cases, the distribution itself contains a (not installed) package for the needed language, and it should be the first thing to suggest.

(3) Dictionaries do not necessarily include hyphenation data, and not all of them clearly communicate that. So even if this link is provided, a notice would be nice, telling "look for a respective dictionary extension *containing hyphenation data*".
Comment 9 Eyal Rozenberg 2023-06-16 21:58:44 UTC
(In reply to Mike Kaganski from comment #3)
> There might be some *narrow* set of cases where LibreOffice
> could know that *its own* distribution includes these

This is what I would expect. The thing is, that if an app is telling you "I am missing FOO for locale LOC", your assumption as a user is that FOO is a part of the app, but distributed separately for different locales; and that you have likely neglected to also install FOO for LOC along with LO itself. Third-party extension and such would not make the app present a notification bar.

> - but again: it needs
> to know how to install these: on Windows, it would need admin to initiate
> modification of existing installation; on Linux, each distro would have own
> command and own package name for those... 

Ok, so it would need admin privileges, that's doable. Or it would need per-distribution customization - also doable, with the default suggesting the user seek a package for their distribution. And the version we let people download from our servers will be appropriately customized to download packages from our servers.

> But *in general*, the problem of
> knowing how to obtain a random fo-BAR language that user tagged their
> hyphenated text with is unsolvable.

So, not the general problem, just the problem of obtaining a very specific FOO for LOC. If there are other options for it - that's "not our problem" for the purposes of this notification bar. Or - a secondary button, "Learn More", could mention other options/sources/ideas. i.e. what we offer now is the other possibility for users to pursue, but the default one should just be to download the patterns.

> It is a warning that the document looks differently. It is not an error
> (like "the file is corrupt"), but still something needing to "draw user's
> attention to".

If it said that, it would be one thing. But LO is telling you it's missing something, and hinting that it is missing a piece of itself.

(In reply to Heiko Tietze from comment #6)
> Rather than forwarding to a hopefully up-to-date wiki we could show the list
> of extension directly (same as tools > options > language > writing aids:
> Get more dictionaries online...).

I agree with Mike - we should not refer users to extensions, certainly not in this situation; and if LO knows about it beforehand, then it's not an extension.

Also, and even more importantly - if you can find the list, then - don't show any list, rather, on button press, let's just download and install the relevant file. I just don't understand the insistence on not doing that or the claim that we can't.
Comment 10 Mike Kaganski 2023-06-17 08:27:48 UTC
(In reply to Eyal Rozenberg from comment #9)
> So, not the general problem, ...

This "let us only focus on built-in set of hyphenation data" wasn't communicated clearly enough from the start; if it were, then you wouldn't feel like there is

> the insistence on not doing that or the claim that we can't

especially in the presence of comment 4 and comment 5, etc. ;)
Comment 11 Eyal Rozenberg 2023-06-17 08:54:29 UTC
(In reply to Mike Kaganski from comment #10)
> especially in the presence of comment 4 and comment 5, etc. ;)

Fair enough, let's see whether cloph / rene can shed light on what's possible.
Comment 12 Heiko Tietze 2023-06-19 08:54:05 UTC
(In reply to Mike Kaganski from comment #8)
> (1) (unrelated) somehow, this (and same link in EditModules) does not work
> for me in Version: 7.5.4.2 
This should be fixed!

> (2) ...there are lots of *unmaintained* third-party extensions, while for
> *most* cases, the distribution itself contains a (not installed) package for
> the needed language, and it should be the first thing to suggest.
The idea of the extension site is to have the community maintain 3rd party content. Meaning to report packages that are not up to date.

> (3) Dictionaries do not necessarily include hyphenation data...
That's indeed unfortunate but I see no way around. Currently we just add the wiki step in between the procedure.
Comment 13 Heiko Tietze 2023-07-14 10:50:18 UTC
(In reply to Eyal Rozenberg from comment #11)
> Fair enough, let's see whether cloph / rene can shed light on what's
> possible.

=> NEEDINFO
Comment 14 Tex2002ans 2023-11-06 21:09:39 UTC
(In reply to Eyal Rozenberg from comment #0)
> Now, it's doubly a bug, because no hyphenation patterns are even _offered_
> for Hebrew, so LO is just leading the user on a wild goose chase for now.

Hmmm...

Does the Hebrew language even hyphenate when words break across lines?

Or is it like Japanese/Chinese, where hyphenation doesn't even make sense at all:

- https://bugs.documentfoundation.org/show_bug.cgi?id=143422

(In that case, the infobar was disabled for "ja" + "zh-CN".)

(In reply to Eyal Rozenberg from comment #0)
> It's a link to a TDF Wiki page - a wiki page for _all_ languages at once,
> and with a large number of links, some in a table, some not in a table;
> plus a humongous navigation bar on top of the page.

Was it this page?

- https://wiki.documentfoundation.org/Language/Support

Sounds like that page could use an overhaul for readability.

Perhaps splitting into multiple, dedicated sections instead.

1. Spellchecking
2. Grammarchecking
- Most can be offloaded to LanguageTool's site.
- The few languages with other tools (like French+Grammalecte) can be listed on their own.
3. Hyphenation
- Hyphenation.org is the ultimate source.
4. UI
- Perhaps this can just be offloaded/pointed to Weblate.
5. Thesaurus
6. (The rest of the table's leftover columns in simple "Y/N/?" form.)

The "Hyphenation" section can then:

- Explain basic instructions for fixing "missing hyphenation info" on Windows/Linux/Mac.

Similar to what I explain when "red squigglies" stop working for people... because they forgot to install Spellchecking Dictionaries:

- https://www.reddit.com/r/libreoffice/comments/15qt8rk/how_to_install_dutch_dictionary/jw6a5ht/
- https://www.reddit.com/r/libreoffice/comments/11h5kqr/why_is_spellcheck_not_working_in_libreoffice/jati80q/
- https://www.reddit.com/r/libreoffice/comments/udu7sm/contortions_for_spelling_in_office_libre/i6l2pay/

It's pretty much the same exact instructions.

(On Linux, it's a "hyphen-xx" package. In Windows, it's an extra checkbox during install.)
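
As a rough illustration of how such per-platform instructions could be generated instead of linking to one page for all languages, here is a minimal sketch. The function name `install_hint` and the message wording are hypothetical; the "hyphen-xx" pattern is taken from the note above, and real package names differ between distributions.

```python
def install_hint(platform: str, locale: str) -> str:
    """Return a short, human-readable hint; nothing is installed here."""
    # Assumption: the package suffix is the lowercase language subtag.
    lang = locale.split("-")[0].lower()
    if platform == "linux":
        return (
            f"Install your distribution's hyphenation package for {locale}, "
            f"typically named like hyphen-{lang}."
        )
    if platform == "windows":
        return (
            f"Re-run the LibreOffice installer and tick the optional "
            f"dictionary component for {locale}."
        )
    # Other platforms: fall back to the general wiki page.
    return (
        f"See the wiki for {locale}: "
        "https://wiki.documentfoundation.org/Language/Support"
    )
```

The function only produces text, so a distributor could replace the strings without the application ever acting as a package installer.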

> but something triggered a yellow notification bar when I opened
> a document (He+En+Fa content); and the bar said that Hyphenation
> patterns for HE-IL were missing.

Like Mike Kaganski said, an information bar might pop up that says:

> Missing hyphenation info. Please install hyphenation package for locale "en-US".
   - (Where "en-US" = the code for your specific language.)

This is triggered when you have BOTH:

1. Hyphenation Dictionaries NOT INSTALLED for that language.

2. A Format/Style in your document which has:

- Automatic "Hyphenation" enabled.

You can see this in:

- Format > Paragraph
- "Text Flow" tab.

or:

- Style > Edit Style (Alt+P)
- "Text Flow" tab.

Under Hyphenation, you'll see a checkbox for:

- Automatically
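
The two conditions above can be sketched as a small check. This is an illustration, not LibreOffice's code: the `Paragraph` type, the function name and the exact-match locale comparison are simplifying assumptions, and the "ja" / "zh-CN" exclusion reflects what was mentioned for bug 143422.

```python
from typing import Iterable, List, NamedTuple, Set


class Paragraph(NamedTuple):
    locale: str             # language tag of the text, e.g. "he-IL"
    auto_hyphenation: bool  # the "Automatically" checkbox under Hyphenation


# Locales for which the infobar is suppressed (per the bug 143422 discussion).
NO_HYPHENATION_LOCALES: Set[str] = {"ja", "zh-CN"}


def locales_needing_infobar(
    paragraphs: Iterable[Paragraph], installed: Set[str]
) -> List[str]:
    """List locales that have auto-hyphenated text but no installed patterns."""
    missing: List[str] = []
    for para in paragraphs:
        if not para.auto_hyphenation:
            continue  # condition 2 not met: hyphenation is not requested
        if para.locale in installed or para.locale in NO_HYPHENATION_LOCALES:
            continue  # condition 1 not met, or hyphenation is not meaningful
        if para.locale not in missing:
            missing.append(para.locale)
    return missing
```

For a document with auto-hyphenated Hebrew and English text where only English patterns are installed, this returns just the Hebrew locale, which matches the bar described in the report.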

- - -

For every language... just like you install different:

- UI Languages
- Spellchecking Dictionaries
   - (To get red squigglies.)

there are also:

- Hyphenation Dictionaries

And if you haven't installed them (or they don't exist), then there's not much LO can do besides inform with the bar!

And like Mike Kaganski said, each OS would require slightly different instructions/packages.

This is where the Wiki page could be a good guide!

- - -

(In reply to Mike Kaganski from comment #8)
> (2) I generally dislike, that the only thing in *UI* points to the
> *extensions* site ( https://extensions.libreoffice.org/dictionaries/ ),
> where there are lots of *unmaintained* third-party extensions, while for
> *most* cases, the distribution itself contains a (not installed) package for
> the needed language, and it should be the first thing to suggest.

Yes, full agree.

Almost always, it's because someone forgot to:

- Install the package (Linux)
- Check an optional box during install (Windows / Mac[?])

I would personally rank them like this:

1. LO Wiki
- This can point you to:
--- Windows-/Linux-/Mac-specific instructions
--- Distro-specific packages
- + more resources.
--- Like where to grab the proper .tex file + LO folder.
--- + explain Manual Install.
2. Hyphenation.org
- Which is the ultimate Hyphenation Dictionary resource.
- These are where all the latest patterns for each language are.
3. LO Extensions site
- (Some of these can be linked in the Wiki, but I wouldn't go promoting them directly.)
- Like Mike said, many of these are old/unmaintained.
- Better bet would be to:
--- Install the packages from distro or install latest LO.
--- Grab latest Hyphenation Dictionaries directly from Hyphenation.org.

(In reply to Heiko Tietze from comment #12)
> > (3) Dictionaries do not necessarily include hyphenation data...
> That's indeed unfortunate but I see no way around. Currently we just add the
> wiki step in between the procedure.

Heh, luckily Hyphenation Dictionaries barely ever get touched. So they move MUCH slower than the other files.

New languages occasionally do get added though.

Like in 2021, Czech (cs) hyphenation was created. See:

- TUG 2021: "Czechoslovak Hyphenation Patterns, Word Lists, and Workflow" by Ondřej & Petr Sojka
- https://www.youtube.com/watch?v=kU8-EnMmJ10
Comment 15 Eyal Rozenberg 2023-11-07 11:53:38 UTC
(In reply to Tex2002ans from comment #14)
> Does the Hebrew language even hyphenate when words break across lines?

Yes, that does happen. TBH, Hebrew words are, on average, 

> (In reply to Eyal Rozenberg from comment #0)
> > It's a link to a TDF Wiki page 
> 
> Was it this page?
> 
> - https://wiki.documentfoundation.org/Language/Support

Yes, I think that's the one.

> Perhaps splitting into multiple, dedicated sections instead.

That would only mitigate, rather than resolve, the problem.

> - Explain basic instructions for fixing "missing hyphenation info" on
> Windows/Linux/Mac.

If LibreOffice can fix the lack of hyphenation patterns by downloading them, there's no need for the user to read explanations. It's not like the user asked to learn about hyphenation. But - this could be a second "Read More" button, as I mentioned in an earlier comment.


PS - Cloph, rene - awaiting your input.
Comment 16 Rene Engelhard 2023-11-07 19:35:43 UTC
> PS - Cloph, rene - awaiting your input.

It shouldn't come as a surprise to anyone that LO shouldn't download this. It unnecessarily duplicates stuff (probably) available as packages. (And yes, I am against packagekit etc. installing packages, for that matter, as that's the admin's decision.)

As people already said there is hyphen-* (or however called in other distros) and people should install them. If they miss it initially they should look for it later.

Maybe for Windows or Mac, but definitely not for any Linux.
Comment 17 Eyal Rozenberg 2023-11-07 20:01:49 UTC
(In reply to Rene Engelhard from comment #16)

You wrote LO shouldn't download hyphenation patterns, but then wrote that maybe it should download them on Windows and Mac.

If I understand correctly: You oppose downloading the patterns when a more "structured" method of installing them is available. I tend to agree, but that just means we should refine the ask in this bug request: The link should either download & install from the website, as a fallback, or perform a distribution-specific package installation, when possible. So, if we're on Linux - clicking the link/button would execute something like apt, dnf, zypper or yum to install the package (or a GUI client).
Comment 18 Rene Engelhard 2023-11-07 20:07:24 UTC
> or perform a distribution-specific package installation, when possible. So, if we're on Linux - clicking the link/button would execute something like apt, dnf, zypper or yum to install the package (or a GUI client).

No. LO is not a package installer. 
That's what I meant with my packagekit part.
Actually I explicitly disable LO's already-existing attempts to do that in Debian's packages by default. (cf. https://salsa.debian.org/libreoffice-team/libreoffice/libreoffice/-/blob/master/patches/no-packagekit-per-default.diff)
Comment 19 Eyal Rozenberg 2023-11-07 21:20:25 UTC
(In reply to Rene Engelhard from comment #18)
> > or perform a distribution-specific package installation, when possible. So, if we're on Linux - clicking the link/button would execute something like apt, dnf, zypper or yum to install the package (or a GUI client).
> 
> No. LO is not a package installer. 
> That's what I meant with my packagekit part.

Oh, I'm sorry, I didn't (and still don't quite) know what packagekit is so I missed it.

So, two points: 

1. You're saying that your position is different than what we do in other cases. Would you say that installing hyphenation patterns or not is a similar choice as those other cases, or is it worse/more disagreeable in your opinion (and if so why)?

2. Suppose I adopt your position and believe that LO shouldn't download/install stuff. If that's the case - what's the justification for these patterns not being part of the LO download (Windows installer / set-of-DEB/RPM packages)?
Comment 20 QA Administrators 2023-11-08 03:15:45 UTC Comment hidden (obsolete)
Comment 21 Rene Engelhard 2023-11-08 05:24:28 UTC
> 1. You're saying that your position is different than what we do in other cases. 

No, I am not. Actually it's consistent with the other cases. That's why I disable this. No install. Be it fonts or hyphenation patterns or whatever. Just that these were added and I needed to patch it out (actually, an earlier version of the patch was bigger, before all that was in a config): https://salsa.debian.org/libreoffice-team/libreoffice/libreoffice/-/commit/175015a30228ef08f1c98db513d8fdd80bf6168f

> 2. Suppose I adopt your position and believe that LO shouldn't download/install stuff. If that's the case - what's the justification for these patterns not being part of the LO download (Windows installer / set-of-DEB/RPM packages)?

They *are* part of it. TTBOMK it includes hyphenation patterns. If people don't install them.... Just not everything else on the net which may be provided for it.
And actually I believe LO also is not a "let's include everything you find on the net so as people might deem it useful" conglomeration.

(And there is https://packages.debian.org/search?keywords=hyphen- in Debian proper)
Comment 22 Eyal Rozenberg 2023-11-08 07:48:23 UTC
(In reply to Rene Engelhard from comment #21)
> Be it fonts or hyphenation patterns or whatever.

Ok, now I get it. And indeed, we bundle fonts and install them, we don't tell the user "go install this-and-that font".

> > 2. Suppose I adopt your position and believe that LO shouldn't download/
> > install stuff. If that's the case - what's the justification for these 
> > patterns not being part of the LO download (Windows installer / set-of-
> > DEB/RPM packages)?
> 
> They *are* part of it. TTBOMK it includes hyphenation patterns.

Ah, well - it doesn't, or I would not have opened this bug. I didn't perform a partial install or pick-and-choose packages. But perhaps let someone else shed light on whether hyphenation patterns are included or not.
Comment 23 Mike Kaganski 2023-11-08 08:05:03 UTC
(In reply to Eyal Rozenberg from comment #22)
> > > what's the justification for these  patterns not being part of the
> > > LO download (Windows installer / set-of-DEB/RPM packages)?
> > 
> > They *are* part of it. TTBOMK it includes hyphenation patterns.
> 
> Ah, well - it doesn't, or I would not have opened this bug. I didn't perform
> a partial install or pick-and-choose packages. But perhaps let someone else
> shed light on whether hyphenation patterns are included or not.

Hyphenation patterns are definitely included in MSI (installer for Windows) as part of respective dictionary packs, *where they exist* in respective dictionaries in the repo. E.g., see the 'hyph_ru_RU.dic' under https://git.libreoffice.org/dictionaries/+/refs/heads/master/ru_RU/.

OTOH, see that there's no hyphenation data under https://git.libreoffice.org/dictionaries/+/refs/heads/master/he_IL/. And this is the question to the respective native language team - it would be great to include such data there.

So if:
1. The dictionary pack includes the hyphenation data;
2. The dictionary pack is selected when installing
then Windows users will get the hyphenation for that language. I don't know what DEBs/RPMs contain, so can't speak on that.
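
The two conditions can be checked mechanically against a checkout of the dictionaries repo. The sketch below assumes, based on the 'hyph_ru_RU.dic' example above, that hyphenation data lives in files matching `hyph_*.dic` inside each language directory; the function names and the `pack_selected` flag are hypothetical.

```python
import tempfile
from pathlib import Path


def has_hyphenation_data(dict_dir: Path) -> bool:
    """True if the directory holds at least one hyph_*.dic file."""
    return any(dict_dir.glob("hyph_*.dic"))


def hyphenation_expected(dict_dir: Path, pack_selected: bool) -> bool:
    """Both conditions: the pack ships the data AND it was selected at install."""
    return pack_selected and has_hyphenation_data(dict_dir)


if __name__ == "__main__":
    # Small self-contained demo using a throwaway directory layout.
    with tempfile.TemporaryDirectory() as tmp:
        ru = Path(tmp) / "ru_RU"
        ru.mkdir()
        (ru / "hyph_ru_RU.dic").touch()
        print(hyphenation_expected(ru, pack_selected=True))
```

A directory without any `hyph_*.dic` file, as described for he_IL, fails the first condition regardless of what was selected in the installer.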
Comment 24 Buovjaga 2023-11-13 17:42:17 UTC

*** This bug has been marked as a duplicate of bug 136084 ***