Bug 52504 - Regular Expression Search for circumflex by itself does not match anything
Summary: Regular Expression Search for circumflex by itself does not match anything
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 3.5.3 release
Hardware: All Linux (All)
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Find-Search
Reported: 2012-07-25 21:05 UTC by Jim Avera
Modified: 2017-06-25 11:37 UTC (History)

See Also:
Crash report or crash signature:


Description Jim Avera 2012-07-25 21:05:07 UTC
Expected behavior: in Writer, Calc, or the Macro Editor, open the Find & Replace window, check the Regular Expression checkbox, and enter a circumflex (^) in the Search for drop-down. The beginning of every paragraph should then be found. The LO Wiki and the built-in LO help imply that a circumflex by itself in the find field matches the beginning of a paragraph:
http://help.libreoffice.org/Common/List_of_Regular_Expressions

What happens instead: nothing is found.

NOTE: A dollar sign ($) by itself *does* work as expected, i.e., it matches the end of each line.
Comment 1 Cor Nouws 2012-10-06 20:51:47 UTC
Hi Jim,

Please use "^." (without the quotes) to find the first character of a paragraph.
I think ^ is only used in combination with other characters.
See the examples and explanations in the help.

Regards,
Cor
Comment 2 Jim Avera 2012-10-07 07:56:53 UTC
No.   ^. is not equivalent.  ^. means to match the first character on the line, and if doing a replace then the first character would be deleted.  ^ by itself matches the start of the line (not including any characters), and replacing it with something effectively inserts the "replacement" text at the start of the line.    You could use something ugly like replacing ^(.) with ${1}PREFIX to avoid deleting the first character, but that would fail on blank lines which don't have any characters in them.


In any case, ^ (by itself) is standard, well-defined regular expression syntax used everywhere else (Perl, Python, vim, etc.), and LibreOffice should not do something incompatible.
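
For comparison, here is a minimal sketch of the difference described above. It uses Python's re module as an assumed stand-in for "standard" behavior; it is not LibreOffice's regex engine, and the sample text and variable names are made up for illustration.

```python
import re

# Three lines; the middle one is empty (illustrative sample text).
text = "first\n\nthird"

# "^" alone is zero-width: replacing it inserts text and deletes nothing,
# and it also matches at the start of the empty line.
inserted = re.sub(r"^", "> ", text, flags=re.MULTILINE)

# "^." consumes the first character of each line, so that character is
# replaced, and the empty line is skipped because "." has nothing to match.
consumed = re.sub(r"^.", "> ", text, flags=re.MULTILINE)

print(inserted.split("\n"))
print(consumed.split("\n"))
```

In this sketch only the bare ^ prefixes all three lines; the ^. version loses the first letter of each non-empty line and leaves the empty line untouched.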
Comment 3 Jim Avera 2012-10-07 08:01:57 UTC
If you are unsure how regular expression syntax should work (in industry-wide practice), there are many books and online references, for example

http://en.wikipedia.org/wiki/Regular_expression#POSIX_Basic_Regular_Expressions
Comment 4 Cor Nouws 2012-10-07 22:20:08 UTC
Hi Jim,

OK, sorry, and thanks for the explanation. (In the meantime I understood that the same applies to $, which cannot be used by itself to find the end of a paragraph.)
Did it ever work as expected, or is it something that has to be implemented?
In that case, this would be an enhancement.
Comment 5 Jim Avera 2012-10-09 00:37:57 UTC
AFAIK ^ has never worked correctly.  I doubt anyone intentionally made Open Office regular expressions incompatible with industry practice, so I think this is a bug, not a missing feature.
 
-Jim
Comment 6 Jim Avera 2012-10-09 00:58:11 UTC
Incidentally, $ does match the end of paragraphs (as documented), but it seems to match the paragraph break itself (not just the -position- at the end of the paragraph), so paragraphs are merged into a single new paragraph. The exception is that only one of a group of successive empty paragraphs is matched.

Matching the para-break itself seems odd to me (as usually unhelpful), but might be intentional.  However the fact that only some empty paragraphs are matched is almost certainly a bug.

EXAMPLE: In the following 1-line paragraphs, there are two empty paras between b and c (<P> indicates the paragraph symbol which is shown when displaying non-printing characters):
a<P>
b<P>
<P>
<P>
c<P>
Find-and-replace of $ with X replaces the 5 paragraphs with 2 paragraphs:
aXbX<P>
Xc<P>
As you can see, the 5 paragraphs were collapsed into two paragraphs, except that the "paragraph break" was not removed for one of the empty paragraphs.
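
As a point of comparison for the $ behavior described above, the following sketch shows how a zero-width $ treats the same five paragraphs. It assumes Python's re module with newlines standing in for paragraph breaks; this is not LibreOffice's engine, and the variable names are illustrative only.

```python
import re

# The five one-line paragraphs from the example, two of them empty.
paragraphs = ["a", "b", "", "", "c"]
text = "\n".join(paragraphs)  # "\n" stands in for the paragraph break

# With MULTILINE, "$" matches the position before each newline and at the
# end of the text. It is zero-width, so no break is consumed.
result = re.sub(r"$", "X", text, flags=re.MULTILINE)

print(result.split("\n"))
```

Here all five paragraphs survive and each one, including both empty ones, gets an X appended, which is the position-only matching the comment contrasts with the observed merging.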
Comment 7 Jim Avera 2014-05-15 07:38:55 UTC
Any thoughts about fixing this?   It's still a problem in 4.3-alpha1

Note that searching for ^. is not a work-around because it will not match the start of empty paragraphs (the "." does not match).   So if you want to prepend something to every paragraph in a selection which includes empty paragraphs, then ^ alone is necessary.
Comment 8 Cor Nouws 2014-06-22 13:34:30 UTC
Isn't your case just covered by using
 & in search and
 \nFOO in replace?

For me that works in Writer
Comment 9 Jim Avera 2014-06-22 22:47:25 UTC
> Isn't your case just covered by using
> & in search and
> \nFOO in replace?

Maybe that was a typo.  The above does not work (does nothing--not matched).
Can you suggest a work-around which inserts some text at the start of every line in Calc's Basic macro editor (including empty lines)?   That's the problem this bug was originally about and which *should* be easy by replacing ^ with the desired text.  That is standard regex behavior everywhere else in computerdom.

^ on its own should work (just like $ on its own does).
Comment 10 Cor Nouws 2014-06-29 18:58:35 UTC
(In reply to comment #9)

> Can you suggest a work-around which inserts some text at the start of every
> line in Calc's Basic macro editor (including empty lines)? 

The component of this issue is Writer .. ?
Comment 11 Jim Avera 2014-06-30 15:56:09 UTC
Not sure where the regex code is. It manifests in Writer and in the Basic macro editor in Calc.
Comment 12 Cor Nouws 2014-11-04 20:53:06 UTC
Still a problem in 4.4.0alpha1
 > New
Comment 13 Jim Avera 2014-11-05 16:52:58 UTC
Maybe Component should be changed to Spreadsheet, because the problem is more simply visible when editing Basic macro code.  It is common to want to insert spaces at the start of every line in a range (e.g. to "indent" the code one level), and replacing ^ with spaces does not work.
Comment 14 Gordo 2015-06-07 20:08:52 UTC
To add text to the beginning of every paragraph you can do it in two passes. The first pass finds the start of the paragraph plus the first character and replaces them with whatever text plus the first character. The second pass finds empty paragraphs and replaces each with whatever text plus a paragraph break.

Search For:    ^.
Replace With:  <text>&

Search For:    ^$
Replace With:  <text>\n

I don't know if LO has anything that matches the start of a line regardless of whether it is the beginning of a paragraph or a line that has been word-wrapped.

Windows Vista 64
Version: 4.4.3.2
Build ID: 88805f81e9fe61362df02b9941de8e38a9b5fd16
Comment 15 Jim Avera 2015-06-07 22:11:36 UTC
> Search For:    ^.   etc.

No, that does not work as explained in comment#2 (empty lines have nothing for the "." to match). 

The ^ alone is supposed to match the start (but doesn't).
Comment 16 Gordo 2015-06-07 22:57:45 UTC
You have four paragraphs like this:
this works

this works
this works

First, run this:
Search For:    ^.
Replace With:  yes &

Then, run this:
Search For:    ^$
Replace With:  this also works\n

The result looks like this:
yes this works
this also works
yes this works
yes this works
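
The two-pass procedure above can be sketched outside LibreOffice as follows. This assumes Python's re module, which is not LibreOffice's engine: Python writes the whole match as \g<0> where the dialog uses &, and its $ is zero-width, so the second replacement needs no trailing \n. Variable names are illustrative.

```python
import re

# Four paragraphs, the second one empty, with "\n" as the paragraph break.
text = "this works\n\nthis works\nthis works"

# Pass 1: non-empty lines. Match the first character and put it back
# after the inserted text.
pass1 = re.sub(r"^.", r"yes \g<0>", text, flags=re.MULTILINE)

# Pass 2: empty lines only ("^$" matches where a line has no characters).
pass2 = re.sub(r"^$", "this also works", pass1, flags=re.MULTILINE)

# What a bare "^" would do in one pass, for comparison.
single = re.sub(r"^", "yes ", text, flags=re.MULTILINE)

print(pass2.split("\n"))
print(single.split("\n"))
```

The two passes reproduce the four-line result shown above, while the single ^ replacement prefixes every line, empty or not, in one step.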
Comment 17 Jim Avera 2015-06-07 23:08:05 UTC
Ok, I see what you are doing.  That two-step procedure will work (but should not be needed).

Thanks for pointing it out.