132870 – BASIC-Editor: Matching start-of-line using a lone ^ does not work

Summary: BASIC-Editor: Matching start-of-line using a lone ^ does not work

Status:	RESOLVED DUPLICATE of bug 135538

Alias:	None

Product:	LibreOffice
Classification:	Unclassified
Component:	BASIC
Version: (earliest affected)	unspecified
Hardware:	All All

Importance:	medium normal
Assignee:	Not Assigned

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	BASIC-IDE

Reported:	2020-05-09 10:07 UTC by Andreas Heinisch
Modified:	2020-08-31 13:00 UTC
CC List:	2 users

See Also:	64690 52504
Crash report or crash signature:

Attachments
Demo script - see Steps To Reproduce (8.17 KB, application/vnd.oasis.opendocument.spreadsheet) 2020-05-09 20:23 UTC, Jim Avera

Description Andreas Heinisch 2020-05-09 10:07:02 UTC

Description:
Matching start-of-line using a lone ^ does not work, or at least not correctly. It seems impossible to prepend anything to a group of lines, for example to turn them into a block comment by prepending "'  " in Basic code.

Maybe now that someone who knows how Find & Replace works is here, this can be looked at.

The problem is that ^ by itself does not match unless the line contains something after the start-of-line position (i.e. it is not completely empty).

1. Open the ReplaceBug.ods demo and navigate to the Basic code as described in the original comment.

2. Select lines 3-5, i.e., the empty sub declaration ("Sub Main", the empty line, and "End Sub")

3. Press Ctrl+H.
   Check "Regular expressions".
   Set Find: to the single character ^ (should match the start of each line).
   Set Replace: to the character ' (the Basic comment-starter character).
   Click Replace All.

RESULTS: The apostrophe is prepended only to the non-empty lines.
EXPECTED RESULTS: The apostrophe should be prepended to every line.
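
For comparison, the expected behaviour can be sketched outside LibreOffice. The snippet below uses Python's re module purely as an illustration; it is an assumption for demonstration and not the engine behind the Basic IDE's Find & Replace. The string stands in for the three selected lines.

```python
import re

# Stand-in for the three selected lines: "Sub Main", an empty line, "End Sub".
code = "Sub Main\n\nEnd Sub"

# With re.MULTILINE, ^ is a zero-length match at the start of every line,
# including the empty one, so the apostrophe is inserted three times.
commented = re.sub(r"^", "'", code, flags=re.MULTILINE)

print(commented)
```

Here every line, including the empty middle one, receives the apostrophe, which is the result the report expects from Replace All.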

Reproducible: Always


User Profile Reset: No



Additional Info:

Comment 1 Jim Avera 2020-05-09 20:23:05 UTC

Created attachment 160565
Demo script - see Steps To Reproduce

Comment 2 Jim Avera 2020-05-09 20:24:40 UTC

In master (7.0 alpha), the behavior changed, but is still not correct.

Now ^ seems to match only completely-empty lines; nothing is inserted in lines which contain text.

Comment 3 Andreas Heinisch 2020-05-10 08:07:59 UTC

Confirmed in
Version: 7.0.0.0.alpha0+ (x64)
Build ID: dbd74393fd0b4d11655e2c4d2676ec1bfebe8923
CPU threads: 6; OS: Windows 10.0 Build 17134; UI render: Skia/Vulkan; VCL: win; 
Locale: de-DE (de_DE); UI-Language: en-US
Calc: CL

Comment 4 Andreas Heinisch 2020-05-11 16:57:08 UTC

I think the regression was introduced by https://bz.apache.org/ooo/show_bug.cgi?id=118887, and I do not think the resulting behaviour is correct.

Consider the string "5a". If I search using the regex "[0-9]*", I should get 3 matches:
- 0 to 1, which is 5
- 1 to 1 and 2 to 2, which are empty

However, if I search using "[0-9]+", I get the expected match, so imho we should allow zero-length matches at the beginning of a paragraph.
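
The three matches can be checked with a small sketch. It uses Python's re module as a stand-in (an assumption for illustration; the search engine in LibreOffice may treat empty matches differently, which is the point under discussion). Each match is reported as a (start, end) pair in the same "0 to 1" sense used above.

```python
import re

text = "5a"

# [0-9]* may match zero digits, so it also succeeds where no digit is present.
star_spans = [m.span() for m in re.finditer(r"[0-9]*", text)]

# [0-9]+ needs at least one digit, so it never produces an empty match.
plus_spans = [m.span() for m in re.finditer(r"[0-9]+", text)]

print(star_spans)  # [(0, 1), (1, 1), (2, 2)]
print(plus_spans)  # [(0, 1)]
```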

Comment 5 Jim Avera 2020-07-31 22:16:35 UTC

> Consider the String "5a". If I search using the regex "[0-9]*", I should get 3 matches:
> - 0 to 1, which is 5
> - 1 to 1 and 2 to 2, which is empty

Could you clarify what you meant by "1 to 1" and "2 to 2"?  

The regex [0-9]* should always match, because it means "zero or more digits": it matches either a run of digits or nothing at all. It will match as many digits as it can while still allowing subsequent regex terms to match (in this example, there are no other terms).

At each step during matching, the current regex term must succeed at the current position of the input; if it does, the next regex term is tried, and so on until all terms match. If a regex term fails, the engine has to BACKTRACK and try a different choice in a previous regex term, if any (if there is no previous term, or all choices have been tried, the overall match fails).

In this case, since the regex can succeed matching zero characters, it will (or should) match at any position of any input string, *even an empty input string*.
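
Both points can be illustrated with a short sketch in Python's re module (chosen only for illustration; it is not the engine LibreOffice uses, and the pattern with a trailing 5 is a made-up example to give the engine a second term to satisfy).

```python
import re

# Zero digits is a valid match, so the pattern succeeds on an empty string.
empty = re.search(r"[0-9]*", "")

# Backtracking: [0-9]* first takes all of "1255", then gives back one
# character so that the following literal 5 can still match.
backtracked = re.match(r"([0-9]*)5", "1255")

print(empty.span())          # (0, 0)
print(backtracked.group(1))  # 125
```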

Comment 6 Michael Warner 2020-08-31 13:00:58 UTC


*** This bug has been marked as a duplicate of bug 135538 ***