Bug 38261 - Better Find&Replace with regular expressions
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Documentation
Version (earliest affected): Inherited From OOo
Hardware: Other All
Importance: medium enhancement
Assignee: fpy
URL: https://extensions.libreoffice.org/ex...
Whiteboard: target:25.2.0
Keywords:
Duplicates: 58744 95708 119728 119924 129187 139585 140996 161029
Depends on: 45344 108256 128999 138931 163012 34390 37494 37760 43107 47791 102374 161029
Blocks: Find-Search Find&Replace-Regex
Reported: 2011-06-13 09:41 UTC by Timur
Modified: 2024-09-18 08:34 UTC (History)
CC List: 22 users

Description Timur 2011-06-13 09:41:32 UTC
I'd like to see a better Find&Replace in all LO programs. That's a rather general description, so I'll add that it should include the current functionality of the Alternative Find & Replace extension, which doesn't seem to be maintained and is only available for Writer. Regular expressions should be included.

Alternative Find & Replace also seems to solve the issue that Find and Replace doesn't take the case of the original word into account, http://openoffice.org/bugzilla/show_bug.cgi?id=17188, with its Preserve capitalisation option.
Comment 1 Timur 2011-06-13 09:43:12 UTC Comment hidden (obsolete)
Comment 2 Björn Michaelsen 2011-12-23 12:23:09 UTC Comment hidden (noise)
Comment 3 Timur 2011-12-24 10:02:13 UTC
1. The Alternative Find & Replace extension was available from http://extensions.services.openoffice.org/en/project/AltSearch but it's not maintained and it doesn't work anymore.
This is one of the most important extensions to transfer to LO.

2. The same issue, that Find and Replace doesn't take the case of the original word into account, is now at a new address: https://issues.apache.org/ooo/show_bug.cgi?id=17188.
For example: if you search case-insensitively for "he" and replace with "she", every instance of "he" that is written as "He" will be replaced by "she" instead of "She".
The priority of this issue should be raised. It is an issue that cripples the use of global search/replace. I changed this issue to "major" because of this.
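
To make the requested behaviour concrete, here is a minimal sketch in Python (not LibreOffice code) of a case-preserving replace. The helper names `match_case` and `replace_preserving_case` are hypothetical, and the rules used (all caps, initial capital, otherwise unchanged) are only an assumption about what a "Preserve capitalisation" option would do.

```python
import re


def match_case(template: str, word: str) -> str:
    """Copy the capitalisation pattern of `template` onto `word`."""
    # All caps (more than one letter): "HE" -> "SHE"
    if len(template) > 1 and template.isupper():
        return word.upper()
    # Initial capital: "He" -> "She"
    if template[:1].isupper():
        return word[:1].upper() + word[1:]
    # Anything else: leave the replacement as typed
    return word


def replace_preserving_case(text: str, search: str, replacement: str) -> str:
    """Case-insensitive whole-word replace that keeps the original's case."""
    pattern = re.compile(r"\b" + re.escape(search) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: match_case(m.group(0), replacement), text)


print(replace_preserving_case("He said he would come. HE did.", "he", "she"))
```

The replacement is computed per match, which is why a plain search/replace string pair cannot express it.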
Comment 4 Rainer Bielefeld Retired 2012-01-19 04:46:30 UTC
This is an Enhancement Request
Comment 5 Timur 2012-04-04 02:05:52 UTC Comment hidden (obsolete)
Comment 6 Roman Eisele 2012-04-18 02:01:42 UTC
Just to show that Timur is not the only user interested in this enhancement request, I would like to state that this issue is very important for me, too ...

Why? An example. Some months ago, I had to typeset a proceedings volume (a 'Sammelband' in German). I would have used LaTeX for this job, but for some external reasons I had to use a word processor; so I used Writer. In the long process of making all chapters uniform (regarding spelling/orthography, punctuation, abbreviations, citation, formatting etc.), the present Regular Expressions feature of the Find & Replace function was very, very helpful for me. Thanks for it!

But some special issues were not handled by the Regular Expressions feature as it stands; in some cases I remember I had to open the ODT file with BBEdit in order to search with advanced Regular Expressions for some remaining issues, and then fix every occurrence in Writer manually. If Writer's Regular Expressions feature were better, I could have just used Find & Replace to fix these errors ... This would have saved me some hours of work.
Comment 7 Kumāra 2013-02-07 09:13:46 UTC
(In reply to comment #4)
> This is an Enhancement Request

I agree. But having been used to the intuitiveness of commercial products (Word, WordPro), it feels like a bug to me. :-)
Comment 8 Timur 2013-05-17 17:25:27 UTC
I edited https://wiki.documentfoundation.org/Feature_Comparison:_LibreOffice_-_Microsoft_Office#Word_processors:_LibreOffice_Writer_vs._Microsoft_Word in order to add Advanced find & replace / Special characters. 
I marked this as partially supported in Writer and fully supported in Word.
Comment 9 msjasinski 2013-05-26 14:45:50 UTC
Maybe now that there is the Sidebar from Symphony, Find & Replace could be inserted in it? This way it would be unobtrusive while searching/replacing; the good thing about the sidebar is that it does not cover or hover over the main text body (an irk in MS Word). All the options from Alternative F & R could be transferred to the new functionality. It would be a huge project, but rewarding. Thank you.
Comment 10 Timur 2015-11-09 11:13:19 UTC
Extension http://extensions.libreoffice.org/extension-center/alternative-dialog-find-replace-for-writer still works. At least it could be bundled with the LO.
I agree that Find & Replace could be inserted in the Sidebar, as the 5th tab.
Comment 11 Manuel Lopez-Ibanez 2015-11-10 11:51:35 UTC
*** Bug 95708 has been marked as a duplicate of this bug. ***
Comment 12 Jax 2016-03-08 11:37:13 UTC
Bump.  This bugz me, and always has, that LO, & OO before it, don't have the complex Find & Replace features that are so much better handled in M$ Office.  The features of good F&R are pretty much crucial to my effective processing of documents, so I'd like to see this given fairly high priority.  F&R is a real 'guts' issue that underpins the bells and whistles stuff.

AltSearch 1.4.1 is available but is buggy, not able to complete some of the features that it has and really quite broken on some installations (I tried to install it on a friend's Mac on OO - presume that LO is the same).

I'm not a coder, but very willing to help test this.  Anyone?
Comment 13 Timur 2017-08-17 15:36:41 UTC
Heiko, can you please look at this one and note whether it makes sense to keep it open? I noticed some talk on Find&Replace in GSoC, if I'm right. 
Is Bug 100672 a duplicate? Looks so to me, but I may be biased. 
Please look at the duplicates here.
Comment 14 Heiko Tietze 2017-08-21 09:52:26 UTC
(In reply to Timur from comment #13)
> Heiko, can you please look at this one and note whether it makes sense to
> stay open? I noticed some talk on Find&Replace in GSoC, if I'm right. 
> Is Bug 100672 a duplicate? Looks so to me, but I may be biased. 
> Please look at the duplicates here.

I understand this ticket as RegEx in F&R, and 100672 as adding something special on top of it.

F&R (as well as the findbar) in its current implementation is the opposite of usable, and we should think about a new UI. If you would do the implementation, I'd be happy to do the design engineering.
Comment 15 Buovjaga 2018-09-29 17:30:32 UTC
*** Bug 119728 has been marked as a duplicate of this bug. ***
Comment 16 Buovjaga 2018-10-02 11:07:53 UTC
*** Bug 119924 has been marked as a duplicate of this bug. ***
Comment 17 Xisco Faulí 2019-11-29 13:28:19 UTC
Changing priority back to 'medium' since the number of duplicates is lower than 5
Comment 18 Timur 2019-12-04 20:32:10 UTC
*** Bug 129187 has been marked as a duplicate of this bug. ***
Comment 19 Timur 2021-01-13 16:00:53 UTC
*** Bug 139585 has been marked as a duplicate of this bug. ***
Comment 20 Adalbert Hanßen 2021-01-14 11:18:32 UTC
(In reply to Heiko Tietze from comment #14)
> (In reply to Timur from comment #13)
> > ...
> 
> I understand this ticket as RegEx in F&R and 100672 as to add something
> special on top of it.
> 
> F&R (as well as the findbar) in the current implementation is the opposite
> to usability and we should think about a new UI. If you would do the
> implementation I'd be happy to do the design engineering.

Looking through all the related duplicates, I see that Search&Replace has been an issue for almost ten years. There is an ancestor bug report from 2003 in AOO!

Since writing regexes is not very easy, make sure that something like the "batch" feature (storing and re-applying stored regexes, with the ability to give them reasonable names) is not forgotten!
Comment 21 V Stuart Foote 2021-03-13 16:58:19 UTC
*** Bug 140996 has been marked as a duplicate of this bug. ***
Comment 22 V Stuart Foote 2021-03-13 16:59:33 UTC
*** Bug 58744 has been marked as a duplicate of this bug. ***
Comment 23 rafael.linux.user 2021-04-21 11:27:33 UTC
(In reply to Adalbert Hanßen from comment #20)
> Looking through all the related duplicates, I see that Search&Replace has been an
> issue for almost ten years. There is an ancestor bug report from 2003 in AOO!

We are in 2021 and nothing has changed. When I explain LO Writer to my pupils and let them know about this "must-have" of searching for "paragraph breaks", I end my class telling them "Maybe this will be solved in the future, just after MSOffice becomes open source." That's my hope of this search ever working in LO  :(
Comment 24 Paolo Benvenuto 2022-03-23 16:09:02 UTC
Since there isn't any way to replace something with a line break, I'd consider this a bug.

I consider it worth some attention from the project!

Maybe a bounty could help? I'd participate in it!
Comment 25 Adalbert Hanßen 2022-03-23 19:58:00 UTC
(In reply to Jax from comment #12)
> Bump.  This bugz me, and always has, that LO, & OO before it, don't have the
> complex Find & Replace features that are so much better handled in M$
> Office.  The features of good F&R are pretty much crucial to my effective
> processing of documents, so I'd like to see this given fairly high priority.
> F&R is a real 'guts' issue that underpins the bells and whistles stuff.
> 
> AltSearch 1.4.1 is available but is buggy, not able to complete some of the
> features that it has and really quite broken on some installations (I tried
> to install it on a friend's Mac on OO - presume that LO is the same).
> 
> I'm not a coder, but very willing to help test this.  Anyone?

"Now", i.e. since 2017, AltSearch 1.4.2 is available. It has some important features:

* it can search for regular expressions. Unfortunately the + quantifier (at least once; you can work around that with {1,}) and the * quantifier (any number of times, including zero) don't work, and lookbehind and lookahead conditions are not supported (e.g. (?<=-)(\d{2})(?=-), meaning exactly two digits with a minus sign before and after them, where the minus signs themselves are not part of the match)

* searching for \p and replacing it by \n lets you exchange paragraph marks for forced line feeds, a function which I use very often, which is sorely missing in LO Writer, and which alone would probably be easily implementable,

* it can search for all occurrences of a given character style. Once they are found and marked (unfortunately this takes a lot of time), you can assign another character style to all of them (LO Writer can't do that),

* it can search for all occurrences of a given paragraph style. Once they are found and marked (unfortunately this takes a lot of time), you can assign another paragraph style to all found paragraphs (LO Writer can also do that),

* it can search for a whole range of other text properties e.g. italics, underline, hyperlinks, ... (I have not tried them out).
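
For readers unfamiliar with lookaround, here is a small sketch of what the pattern quoted in the list above is expected to do, using Python's `re` module rather than AltSearch or LibreOffice. The sample string and the function names are made up for illustration. It also shows `{1,}` used as the long-hand for "one or more".

```python
import re

# Two digits that have a minus sign directly before and after them.
# The lookbehind (?<=-) and lookahead (?=-) only check the neighbours;
# the minus signs are not part of the match itself.
BETWEEN_MINUS = re.compile(r"(?<=-)(\d{2})(?=-)")


def middle_fields(text: str) -> list[str]:
    """Return every two-digit group that is enclosed by minus signs."""
    return BETWEEN_MINUS.findall(text)


def mask_middle_fields(text: str) -> str:
    """Replace only the digits; the surrounding minus signs stay in place."""
    return BETWEEN_MINUS.sub("XX", text)


sample = "2021-03-13 and 1999-12-31"
print(middle_fields(sample))
print(mask_middle_fields(sample))

# {1,} means "at least once", the same as the + quantifier.
print(bool(re.fullmatch(r"\d{1,}", "12345")), bool(re.fullmatch(r"\d+", "12345")))
```

Because the minus signs are never consumed, a replacement touches only the digits, which is the point of lookaround in a Find & Replace context.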

For me, replacing any paragraph marks (including any number of white spaces before them) with a line feed is the most-used function of AltSearch.
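
As a rough model of that operation outside of Writer, the following Python sketch works on plain text. It rests on two assumptions that are not taken from AltSearch: a paragraph mark is represented by `"\n"`, and a forced line break by U+2028 (LINE SEPARATOR). The function name is hypothetical.

```python
import re

# Assumption: in this plain-text model a paragraph mark is "\n"
# and a forced line break is U+2028 (LINE SEPARATOR).
LINE_BREAK = "\u2028"


def paragraphs_to_line_breaks(text: str) -> str:
    """Turn each paragraph mark, plus any blanks or tabs before it,
    into a single forced line break."""
    return re.sub(r"[ \t]*\n", LINE_BREAK, text)


joined = paragraphs_to_line_breaks("first  \nsecond\t\nthird")
print(joined.split(LINE_BREAK))
```

In a real document the paragraph boundary is structural rather than a character, which may be part of why this is harder in Writer than in a text editor.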

Due to its implementation in some type of Basic interpreter, AltSearch is very slow. Unfortunately it is not free of bugs. The worst one I know about is that it might remove the whole highlighted region if it is part of a table. To protect myself from this bug, I always extract the whole range where AltSearch is to be applied (if it is part of a table) to a new document and do the replacements there. When done, I cut it from the intermediate document and paste it into the table where it belongs.

Meanwhile, LibreOffice has some regular expression capabilities. Lookahead and lookbehind work. Unfortunately \1, \2, \3... for capture groups don't work in the replacement part (one would use this, e.g., to transform a date of the form mm/dd/yyyy to dd.mm.yyyy or yyyy-mm-dd).
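
The date transformation mentioned above can be sketched with capture groups in Python's `re` module, where the replacement string refers to groups as `\1`, `\2`, `\3`. The function names are hypothetical. Note that regex engines differ in how the replacement side spells a group reference: as far as I know, the LibreOffice help describes `$1`, `$2`, ... for the Replace box, so that spelling may be worth trying there.

```python
import re

# mm/dd/yyyy with one capture group per field
US_DATE = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")


def to_dotted(text: str) -> str:
    """mm/dd/yyyy -> dd.mm.yyyy"""
    return US_DATE.sub(r"\2.\1.\3", text)


def to_iso(text: str) -> str:
    """mm/dd/yyyy -> yyyy-mm-dd"""
    return US_DATE.sub(r"\3-\1-\2", text)


print(to_dotted("Modified: 09/18/2024"))
print(to_iso("Modified: 09/18/2024"))
```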

The important missing thing in LO's replace dialog with respect to regular expressions is the ability to store regexes + replacement rules and assign names to them in order to reuse them. AltSearch shows a good model for a user interface for that.
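
To illustrate the idea of stored, named rules (this is not how AltSearch or LibreOffice stores anything), here is a minimal Python sketch. The `Rule` class, the JSON layout and the two sample rules are all hypothetical.

```python
import json
import re
from dataclasses import asdict, dataclass


@dataclass
class Rule:
    """One stored search/replace pair with a human-readable name."""
    name: str
    pattern: str
    replacement: str


def apply_rules(text: str, rules: list[Rule]) -> str:
    """Apply every rule in order, like a small batch run."""
    for rule in rules:
        text = re.sub(rule.pattern, rule.replacement, text)
    return text


def dump_rules(rules: list[Rule]) -> str:
    """Serialise the rules so they can be saved and shared."""
    return json.dumps([asdict(rule) for rule in rules], indent=2)


def load_rules(payload: str) -> list[Rule]:
    """Restore rules from the JSON produced by dump_rules."""
    return [Rule(**item) for item in json.loads(payload)]


rules = [
    Rule("collapse spaces", r" {2,}", " "),
    Rule("us date to iso", r"\b(\d{2})/(\d{2})/(\d{4})\b", r"\3-\1-\2"),
]

restored = load_rules(dump_rules(rules))
print(apply_rules("Due  on 09/18/2024", restored))
```

Keeping the name next to the pattern is what makes a stored regex reusable months later without having to decode it again.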
Comment 26 bunkem 2023-10-04 13:31:07 UTC
Still outstanding since 2011?  That's unfortunate.

I just came across this shortfall this morning and was disappointed to find that I need to find an extension to do the functionality that LO should do out of the box.  Using AltSearch will help on the Writer side but as it doesn't work in Impress or Calc, this will be a never ending frustration.  

Can we do something to increase the priority?  The suggestion of a bounty was also good.
Comment 27 Timur 2023-11-03 10:29:35 UTC
AltSearch 1.4.2 no longer seems to work as of LO 7.5.
Comment 28 Adalbert Hanßen 2024-02-06 22:44:51 UTC
It works, but it is not very stable. From time to time it issues error messages and you end up in a macro definition screen. Also, AltSearch is terribly slow.

The AltSearch functionality, including its ability to store and reapply stored search patterns, the ability to use PCRE regular expressions with all quantifiers (see comment 25), and the ability to search for character styles, paragraph styles and so on, should be in LO Writer itself.

The syntax and the way of storing/retrieving search patterns should follow the model of AltSearch 1.4.2. It is designed very well, but not implemented so well.
Comment 29 Timur 2024-05-30 13:59:45 UTC
*** Bug 161029 has been marked as a duplicate of this bug. ***
Comment 30 Heiko Tietze 2024-05-31 07:44:33 UTC
Suggestion on bug 161029 was to make this a topic for documentation. Implementation issues should be handled elsewhere.
Comment 31 fpy 2024-08-15 07:12:05 UTC
(In reply to Heiko Tietze from comment #30)
> ... a topic for documentation.

- on remaining confusion with "wildcards" : https://community.documentfoundation.org/t/lo-7-6-writer-guide-published/11513/14
- and try to move forward with the help : https://gerrit.libreoffice.org/c/help/+/171538
Comment 32 Commit Notification 2024-08-23 20:06:16 UTC
Pierre F committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/help/commit/e3caa53e99709b7099611b67cf73e9bdbd8801ea

more (simple) regex examples + fix note on paragraph limitation. tdf#38261, tdf#159607