Bug Hunting Session
Bug 123000 - Regular expression match function, please?
Summary: Regular expression match function, please?
Status: RESOLVED DUPLICATE of bug 113977
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc (show other bugs)
Version:
(earliest affected)
unspecified
Hardware: All All
: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Cell-Formula
  Show dependency treegraph
 
Reported: 2019-01-27 19:45 UTC by TBeholder
Modified: 2019-04-16 09:53 UTC (History)
3 users (show)

See Also:
Crash report or crash signature:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description TBeholder 2019-01-27 19:45:15 UTC
Description:
Since regexp are already supported well enough to use in functions like COUNTIFS, why plain regexp match couldn't be used as a function usable in a formula?
This would make extraction of a desirable substring simple in most situations (even with positive/negative lookahead/lookbehind regexps), without either requiring more custom functions or unnecessarily complication.

Steps to Reproduce:
I want to parse a value out of text — e.g. the text in column $I may contain "SomeTrait[123]" and if so, my goal is to extract value 123. I need a formula to  fill a long column.

Actual Results:
I'll end up with something like
=IFERROR( FIND("SomeTrait[",$I2),0)
in one column ($Z), and then
=IF($Z2>0,VALUE( MID($I2,$Z2+LEN("SomeTrait["),FIND("]",$I2, $Z2)-LEN("SomeTrait[")-$Z2) ),0)
in another, and still have to pre- sanitize text so it won't fail due to containing "Trait [5]" instead, or make it even more complicated.
Every time a string needs parsing, this involves unnecessarily complex constructs made of LEFT()/MID()/RIGHT(), FIND/SEARCH() and LEN() that have to either be repeated with the same arguments or consume half a dozen columns as temporary variables, just to parse this one value (and there are more of them).

Expected Results:
=VALUE(REGEXPMATCH($I2, "(?<=SomeTrait\[).*(?=\])"))
matches "123". Just find the right cell, extract desired text from it, process it. Two functions, fairly transparent… regexps may be not always quite obvious, but at least short enough to analyze as a whole, and often easy to manually test (if something works in search, it will work in a formula too).


Reproducible: Always


User Profile Reset: No



Additional Info:
Comment 1 Xisco Faulí 2019-04-16 09:16:18 UTC
Hello Winfried,
Any opinion here ?
Comment 2 Winfried Donkers (retired) 2019-04-16 09:53:38 UTC
(In reply to Xisco Faulí from comment #1)
> Hello Winfried,
> Any opinion here ?

Starting with LO version 6.2.0 Calc has the function REGEX, which matches text against a regular expression and optionally replaces text.
Therefore I set this bug report to resolved/duplicate.

*** This bug has been marked as a duplicate of bug 113977 ***