Created attachment 168450
"creatine" is actually a US English word
Not sure if this should be filed against a dictionary component, please re-file accordingly.
What is specifically wrong with the screenshot, and why do you say in the title that "creatine" is detected as a Romanian word? At least the image does not make it obvious.
What I see is that it detects a spelling error on "creatine" written in an unknown language (the status bar, which would show the language, did not fit in the screenshot), and that there is a "Word is Romanian" suggestion. Again, it is unclear why, given that the report provides no OS or LO configuration information. I would guess that it simply suggests the user's locale, or picks from the list of installed dictionaries, or some such, without any relation to whether it thinks the word is Romanian.
Only if it does not underline the word when the language is set to Romanian, or if there is reason to believe that it shows this suggestion precisely because of a guess and not because of installed components, can we consider the premise of the report correct ...
I'm not sure including all amino acid names in the general-purpose English dictionary is a good idea.
I don't know anything about Romanian, but "creatine" is probably indeed a common Romanian word, which is why you see the suggestion. It only appears if you have a Romanian dictionary installed (and maybe enabled)?
You can always solve your problem locally by adding "creatine" to your user's dictionary using the "Add to Dictionary" menu item, but you probably already know that.
(In reply to Ming Hua from comment #2)
> It only appears if you have Romanian dictionary installed (and maybe enabled)
I take this back.
I was testing in Writer and didn't see the "Word is Romanian (Romania)" menu item that Dan's screenshot showed. Now that I've tested in Calc, I can see the same menu even though I don't have a Romanian dictionary installed.
Version: 220.127.116.11.beta1 (x64)
Build ID: 828a45a14a0b954e0e539f5a9a10ca31c81d8f53
CPU threads: 2; OS: Windows 10.0 Build 18363; UI render: default; VCL: win
Locale: zh-CN (zh_CN); UI: zh-CN
Chinese locale and UI; the default Western language in Tools > Options > Language Settings > Languages is set to "English (USA)". The text "Cellucor creatine" in a cell is detected as English according to the status bar, yet the context menu when right-clicking on "creatine" still gives "Word is Romanian..." and "Paragraph is Romanian..." items.
Looking into the code, OP seems to have guessed right.
The menu items are created in EditView::ExecuteSpellPopup (editeng/source/editeng/editview.cxx). It uses a language guesser, implemented in lingucomponent/source/languageguessing/guesslang.cxx.
When used for a single word, EditView::CheckLanguage tries four languages:
* The default document language from "Tools/Options - Language Settings - Languages: Western";
* The one from "Tools/Options - Language Settings - Languages: User interface";
* The one from "Tools/Options - Language Settings - Languages: Locale setting";
The first of these that has an active dictionary is the one used from then on.
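The selection order described above can be sketched roughly as follows. This is a simplified Python illustration, not the code from editview.cxx: the function name `pick_check_language` and the `has_active_dictionary` callback are hypothetical stand-ins for whatever the real spell-checker query is.

```python
from typing import Callable, Iterable, Optional


def pick_check_language(
    candidates: Iterable[str],
    has_active_dictionary: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate language that has an active dictionary.

    `candidates` is ordered by priority; `has_active_dictionary` is a
    hypothetical callback standing in for the real spell-checker query.
    """
    for lang in candidates:
        if has_active_dictionary(lang):
            return lang
    return None  # no candidate is usable


# Hypothetical setup mirroring the configuration reported in this thread:
# Western default en-US, UI and locale zh-CN, only an English dictionary active.
candidates = ["en-US", "zh-CN", "zh-CN"]
active = {"en-US"}
chosen = pick_check_language(candidates, lambda lang: lang in active)
print(chosen)
```

Note that nothing in this lookup can produce Romanian unless Romanian is one of the configured candidates, which is consistent with the "Word is Romanian" item coming from the guesser rather than from the settings.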
When checking paragraph text, the language guesser uses libexttextcat to perform a "fingerprint-based" guessing. It looks highly unreliable, based on the evidence...
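To illustrate why fingerprint-based guessing tends to be shaky on short input, here is a toy sketch of the general n-gram rank-order technique (in the style of Cavnar and Trenkle), which is the family of methods libexttextcat belongs to as far as I understand. It is not libexttextcat's code or API: all names, the two tiny training sentences (Romanian diacritics omitted), and the parameters are assumptions made up for the example.

```python
from collections import Counter


def fingerprint(text: str, max_n: int = 3, top: int = 50) -> list[str]:
    """Rank the most frequent character n-grams (1..max_n) in `text`."""
    padded = "_" + "_".join(text.lower().split()) + "_"
    counts = Counter(
        padded[i:i + n]
        for n in range(1, max_n + 1)
        for i in range(len(padded) - n + 1)
    )
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return ranked[:top]


def out_of_place(sample: list[str], profile: list[str]) -> int:
    """Sum of rank differences; n-grams absent from `profile` get the maximum penalty."""
    rank = {g: i for i, g in enumerate(profile)}
    penalty = len(profile)
    return sum(abs(i - rank[g]) if g in rank else penalty for i, g in enumerate(sample))


def guess_language(text: str, profiles: dict[str, list[str]]) -> str:
    """Pick the language whose profile is closest to the text's fingerprint."""
    sample = fingerprint(text)
    return min(profiles, key=lambda lang: out_of_place(sample, profiles[lang]))


# Toy training data (an assumption for illustration; real profiles are built
# from large corpora).
EN_TEXT = "the quick brown fox jumps over the lazy dog and the cat sleeps"
RO_TEXT = "aceasta este o propozitie scurta scrisa in limba romana"
PROFILES = {"en": fingerprint(EN_TEXT), "ro": fingerprint(RO_TEXT)}

print(guess_language("creatine", PROFILES))
```

A single word like "creatine" yields only a couple of dozen n-grams, most of them occurring once, so their ranks are essentially arbitrary and the nearest profile is decided by a few shared letters. With a long paragraph the ranking stabilizes and the comparison becomes more meaningful.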
I suppose it is the same as (part of) tdf#66051. Personally I would just drop it.
(In reply to Mike Kaganski from comment #4)
> Personally I would just drop it.
... I mean, just drop the language guesser. I don't see it doing anything useful.
Agree, I would like to disable the language guesser altogether (is there a way to do that?) for the performance gain, because I only use English in my documents (part of an effort to advocate for using English universally, since the costs of translation, globally, exceed those of eliminating hunger, http://bit.ly/translation-vs-world-hunger, but that's a totally separate story).
FWIW, I don't have any locales installed either. I'm coincidentally Romanian and "creatine" is not a Romanian word actually (https://dexonline.ro/definitie/creatine).
I would advocate for including it in the English dictionary because it is more than just another amino acid; it's probably the second most popular supplement in the fitness industry.
See also: "Language Guessing" at https://www.openoffice.org/development/releases/2.3.0.html
I agree the language guesser is not working well but I don't think kicking it is the solution. It is - even though I have nothing Romanian installed on my PC ANYWHERE - suggesting Romanian to me.
In any case, as a simple solution, use Marco Pinto's English dictionary? It's not on the LO extensions site but on the OO one: https://extensions.openoffice.org/en/project/english-dictionaries-apache-openoffice
It's much more up to date than what LO seems to bundle and it certainly passes creatine as an English word for me.
(In reply to Michael Bauer from comment #8)
Marco Pinto is a great LO contributor: https://gerrit.libreoffice.org/q/owner:marcoagpinto%2540sapo.pt
So it's definitely not that "It's much more up to date than what LO seems to bundle" - and adding a single word to the dictionary would not solve the underlying issue of "random" guessing of applicable languages based on what is focused (which is what your bug 95274 is about, too).
And anyone is of course welcome to provide contributions to our dictionaries :-) - see https://wiki.documentfoundation.org/Development/Dictionaries