Bug 91336 - regular expressions list in Help for asterisk or question mark and "zero or"
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Documentation
Version: 4.2.8.2 release (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium trivial
Assignee: Not Assigned
URL:
Whiteboard: target:24.8.0
Keywords:
Depends on:
Blocks: Help-Changes-Features
Reported: 2015-05-16 19:40 UTC by Nick Levinson
Modified: 2024-03-11 09:21 UTC
CC List: 2 users

See Also:
Crash report or crash signature:


Description Nick Levinson 2015-05-16 19:40:48 UTC
In the Help file, in the list of regular expressions, this appears: "* Finds zero or more of the characters in front of the '*'. For example, 'Ab*c' finds 'Ac', 'Abc', 'Abbc', 'Abbbc', and so on." That, because it says "zero or more", for the example, would also find "c", thus it would find everything. But that's not the actual behavior when I tested.

Similarly, it says, "? Finds zero or one of the characters in front of the '?'. For example, 'Texts?' finds 'Text' and 'Texts' and 'x(ab|c)?y' finds 'xy', 'xaby', or 'xcy'." That means that "Texts?" will find all instances of "?".

Clearer wordings are needed.
Comment 1 Buovjaga 2015-05-18 15:48:29 UTC
(In reply to Nick Levinson from comment #0)
> In the Help file, in the list of regular expressions, this appears: "* Finds
> zero or more of the characters in front of the '*'. For example, 'Ab*c'
> finds 'Ac', 'Abc', 'Abbc', 'Abbbc', and so on." That, because it says "zero
> or more", for the example, would also find "c", thus it would find
> everything. But that's not the actual behavior when I tested.
> 
> Similarly, it says, "? Finds zero or one of the characters in front of the
> '?'. For example, 'Texts?' finds 'Text' and 'Texts' and 'x(ab|c)?y' finds
> 'xy', 'xaby', or 'xcy'." That means that "Texts?" will find all instances of
> "?".
> 
> Clearer wordings are needed.

* and ? are not treated like characters, but like conditions.

Escaping \? would find the character '?'.

Also, the 'c' when it appears in the expression is treated like part of the condition to "look at what character is allowed to appear immediately before 'c'".
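The three points above can be checked with a minimal sketch. It uses Python's `re` module as a stand-in, on the assumption that it behaves like LibreOffice's search engine for these simple patterns; the helper name `found` and the sample strings are made up for illustration.

```python
import re


def found(pattern, text):
    """Return every non-overlapping match of pattern in text (hypothetical helper)."""
    return re.findall(pattern, text)


# 'b*' applies only to the 'b' right before the '*'.
# 'A' and 'c' are still required, so a lone 'c' is not a match.
star_matches = found(r"Ab*c", "c Ac Abc Abbc Abbbc")
print(star_matches)  # ['Ac', 'Abc', 'Abbc', 'Abbbc']

# 's?' makes only the 's' optional; the '?' itself is never searched for.
question_matches = found(r"Texts?", "Text Texts Texts?")
print(question_matches)  # ['Text', 'Texts', 'Texts']

# Escaping the question mark searches for the literal character.
escaped_matches = found(r"\?", "Texts? Why?")
print(escaped_matches)  # ['?', '?']
```

In the third sample word, `Texts?`, the match stops at `Texts` and leaves the literal question mark out, which is the behavior the Help wording is trying to describe.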

What is your proposal for the revised wording?

Disclosure: I'm not particularly skilled with regexes and usually have to stare at them with my eyes glazed over for long periods of time to understand anything.
Comment 2 Nick Levinson 2015-05-23 16:57:57 UTC
That's exactly why clarification would help: if you know something about it and you still have to stare at it to figure out what it means, ordinary public users would be totally lost. The Help says this about regex: "Allows you to use wildcards in your search." Geeks don't need much, but a friendlier treatment is needed for ordinary Earthlings. Here's a first draft: "Regexes are for complex searches that can't be done in simpler ways. They include wildcards (characters you use when you're not sure exactly which characters you need to find), ways of searching for characters that can't be typed directly (like tabs and paragraph endings), and characters in certain positions (like at a line beginning)." Maybe someone else can improve on that draft.

I know the asterisk and question mark are not literal search terms unless escaped, but the phrasing in Help was not quite what was happening in applying regex. The "c" is also about phrasing in Help. When I recently went back to the Help list, I discovered that the two wildcards are not used quite as some other software uses them, and that might add to the potential confusion.

I'm not a great expert on regex and would prefer that someone else clarify the Help.
Comment 3 Buovjaga 2015-05-27 14:05:00 UTC
(In reply to Nick Levinson from comment #2)
> That's exactly why clarification would help: if you know something about it
> and you still have to stare at it to figure out what it means, ordinary
> public users would be totally lost.

I was referring to using regex in any editor or programming language, and I believe this blank staring is the effect they have on everyone who doesn't use them regularly (pun intended) or has never invested a substantial amount of time in learning them :)

So the target group is certainly not ordinary users, but power users.

I'll set to NEW in any case and change component to Documentation.
Comment 4 QA Administrators 2016-09-20 09:46:28 UTC Comment hidden (noise, obsolete)
Comment 5 Commit Notification 2024-03-11 08:59:41 UTC
Pierre F committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/help/commit/969dc658744688aa9e1dc64611a327399d078462

less tautological and more reader friendly wording (tdf#91336)
Comment 6 fpy 2024-03-11 09:20:44 UTC
Note: The list document was already fixed, referring now to "regular expression term immediately preceding".
There's also a related topic linked as "Using Regular Expressions in Text Searches".
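
What "term immediately preceding" means for a grouped expression can be sketched as follows. This again uses Python's `re` module as an assumed stand-in for LibreOffice's engine, and the candidate strings are invented for illustration. The pattern `x(ab|c)?y` is the one quoted from the Help in the description.

```python
import re

candidates = ["xy", "xaby", "xcy", "xay", "xby", "xabcy"]

# With parentheses, the term before '?' is the whole group (ab|c),
# so the group as a unit appears zero times or once.
grouped = re.compile(r"x(ab|c)?y")
grouped_hits = [s for s in candidates if grouped.fullmatch(s)]
print(grouped_hits)  # ['xy', 'xaby', 'xcy']

# Without parentheses, the term before '?' is only the single character 'b'.
ungrouped = re.compile(r"xab?y")
ungrouped_hits = [s for s in candidates if ungrouped.fullmatch(s)]
print(ungrouped_hits)  # ['xaby', 'xay']
```

The contrast between the two patterns shows why "term" is more accurate than "characters": the quantifier binds to one character or to one parenthesized group, not to everything in front of it.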