Bug 113975 - Expand the documentation for regular expressions
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Documentation
Version (earliest affected): unspecified
Hardware: All All
Importance: medium normal
Assignee: fpy
URL:
Whiteboard: target:25.2.0 target:24.8.0.0.beta2
Keywords:
Depends on:
Blocks: Help-Changes-Features
Reported: 2017-11-21 16:42 UTC by Dan Dascalescu
Modified: 2024-08-03 05:20 UTC (History)

See Also:
Crash report or crash signature:


Description Dan Dascalescu 2017-11-21 16:42:52 UTC
Description:
https://help.libreoffice.org/Common/List_of_Regular_Expressions doesn't say anything about regexp lookaround patterns (see https://www.regular-expressions.info/lookaround.html).

Steps to Reproduce:
I'd love to add this information but https://help.libreoffice.org/Main_Page says there's no way to participate.

Actual Results:  
Lookaround patterns do in fact work, but the help page does not document them.

Expected Results:
=SEARCH("(?<=a)b", "cab")  // 3
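
As a rough illustration of the lookbehind in the formula above, here is a minimal Python sketch. It assumes Python's `re` module as a stand-in for the ICU engine that LibreOffice uses; the two are not identical in general, although they should agree on a pattern this simple. `search_position` is a hypothetical helper, not a LibreOffice function.

```python
import re
from typing import Optional


def search_position(pattern: str, text: str) -> Optional[int]:
    """Hypothetical helper: 1-based position of the first match, or None."""
    match = re.search(pattern, text)
    # match.start() is 0-based; spreadsheet positions are 1-based.
    return match.start() + 1 if match else None


# "(?<=a)b" matches a "b" only when it is immediately preceded by "a".
# The lookbehind itself consumes no characters, so the match starts at "b".
position = search_position(r"(?<=a)b", "cab")
print(position)  # 3
```

The lookbehind checks the preceding character without including it in the match, which is why the reported position is that of "b" rather than "a".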


Reproducible: Always


User Profile Reset: No



Additional Info:


User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.89 Safari/537.36
Comment 1 Olivier Hallot 2017-11-21 17:22:27 UTC
help.libreoffice.org is not editable because it is a transformation of the local help system, which is written in XML.

However, if you are willing to write about the issue, please write it up in a LibreOffice Writer document and attach it to this bug. I'll review it and convert it to XML for the help system, crediting you as the author.

Please be factual and concise: your content will be translated into more than 60 languages.
Comment 2 V Stuart Foote 2017-11-23 06:59:47 UTC
Actually, LibreOffice's RegEx support is provided by the ICU libraries, and our documentation derives from theirs, including coverage of the ICU RegEx operators for lookahead, lookbehind, and their negatives (the lookarounds).

The Help now links to the Wiki at https://wiki.documentfoundation.org/Documentation/HowTo/Writer/Regular_Expressions or https://wiki.documentfoundation.org/Documentation/HowTo/Calc/Regular_Expressions, both of which correctly defer to the ICU content [1].

It is a little thin, but authoritative, and should be the basis for additions to our Help and Documentation, with some practical examples included, preferably on our Wiki. The OOo-era content at AOO should probably go away.


=-ref-=
[1] http://userguide.icu-project.org/strings/regexp#TOC-Regular-Expression-Operators
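
For readers unfamiliar with the four lookaround operators mentioned in this comment, the following is a minimal sketch of how each one behaves. It assumes Python's `re` module as a stand-in for ICU (the `(?=...)`, `(?!...)`, `(?<=...)` and `(?<!...)` syntax is shared, but the engines differ in other details). The sample text and the `match_starts` helper are made up for illustration.

```python
import re


def match_starts(pattern, text):
    """Hypothetical helper: 0-based start index of every match."""
    return [m.start() for m in re.finditer(pattern, text)]


# "foo" occurs at indexes 0, 7 and 17 in this sample text.
text = "foobar foobaz barfoo"

# Lookahead: "foo" only when followed by "bar".
lookahead = match_starts(r"foo(?=bar)", text)
# Negative lookahead: "foo" only when NOT followed by "bar".
negative_lookahead = match_starts(r"foo(?!bar)", text)
# Lookbehind: "foo" only when preceded by "bar".
lookbehind = match_starts(r"(?<=bar)foo", text)
# Negative lookbehind: "foo" only when NOT preceded by "bar".
negative_lookbehind = match_starts(r"(?<!bar)foo", text)

print(lookahead)            # [0]
print(negative_lookahead)   # [7, 17]
print(lookbehind)           # [17]
print(negative_lookbehind)  # [0, 7]
```

In every case the lookaround only tests the surrounding text; the matched span is always just "foo".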
Comment 3 gmolleda 2024-02-28 12:07:24 UTC
Also missing is \d as a synonym for [:digit:], \s for [:space:] and \w for [:word:]
Comment 4 V Stuart Foote 2024-02-28 12:40:52 UTC
(In reply to gmolleda from comment #3)
> Also missing is \d as a synonym for [:digit:], \s for [:space:] and \w for
> [:word:]

Those are fully described in the linked ICU libs documentation:

https://unicode-org.github.io/icu/userguide/strings/regexp.html#regular-expression-metacharacters

But the unmaintained OOo era Wiki content needs to be replaced.
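
A short sketch of the `\d`, `\s` and `\w` shorthands discussed above. It assumes Python's `re` module as a stand-in for ICU; Python does not support the POSIX-style `[:digit:]` bracket classes, so the comparison below uses an explicit `[0-9]` set instead. The sample strings are made up for illustration.

```python
import re

sample = "tdf#113975 fixed in 24.8"

# \d matches a decimal digit; for this ASCII-only sample it is
# equivalent to the explicit character set [0-9].
digits_shorthand = re.findall(r"\d+", sample)
digits_explicit = re.findall(r"[0-9]+", sample)
print(digits_shorthand)  # ['113975', '24', '8']

# \s matches whitespace: spaces, tabs, newlines, and so on.
fields = re.split(r"\s+", "a  b\tc\nd")
print(fields)  # ['a', 'b', 'c', 'd']

# \w matches "word" characters: letters, digits and the underscore.
words = re.findall(r"\w+", "note_first, ICU spec.")
print(words)  # ['note_first', 'ICU', 'spec']
```

Exactly which non-ASCII characters each shorthand covers varies between engines, so the ICU user guide linked above remains the reference for LibreOffice's behavior.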
Comment 5 fpy 2024-06-23 15:49:56 UTC
for reference, the URL List_of_Regular_Expressions in first report is not valid anymore,
we shall assume it's now https://help.libreoffice.org/latest/en-US/text/shared/01/02100001.html
Comment 6 V Stuart Foote 2024-06-23 19:06:21 UTC
(In reply to fpy from comment #5)
> for reference, the URL List_of_Regular_Expressions in first report is not
> valid anymore,
> we shall assume it's now
> https://help.libreoffice.org/latest/en-US/text/shared/01/02100001.html

Yep, that looks correct. Thanks!

Meanwhile, the user guide listing provided by the ICU libs project at
https://unicode-org.github.io/icu/userguide/strings/regexp.html#regular-expression-metacharacters remains definitive.
Comment 7 Commit Notification 2024-06-24 08:53:28 UTC
Pierre F committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/help/commit/ade7397aed9febc73918d93ffdf6477d3f4d4175

explicit \s and \d +  put the note first for full ICU spec. tdf#113975
Comment 8 Buovjaga 2024-07-01 13:01:32 UTC
Thanks for the commit. Can this be closed as fixed?
Comment 9 Commit Notification 2024-07-04 14:15:58 UTC
Pierre F committed a patch related to this issue.
It has been pushed to "libreoffice-24-8":

https://git.libreoffice.org/help/commit/92a38c452c2e71463a3331431fbc24b6e3d3c6c0

explicit \s and \d +  put the note first for full ICU spec. tdf#113975