Bug 153408 - F&R's Replace All incorrectly replaces a character after an anchor appeared after the first replacement
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.5.0 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Find-Search
Reported: 2023-02-06 11:37 UTC by Mike Kaganski
Modified: 2024-10-08 15:07 UTC (History)

See Also:
Crash report or crash signature:



Description Mike Kaganski 2023-02-06 11:37:05 UTC
In a text document:

1. Type "aaaab cccccdeeeee" in a new paragraph;
2. Edit->Find and Replace (Ctrl+H);
3. Enable regular expressions;
4. Use this regex: "^a", replace with nothing, Replace All

==> expected result: "aaab cccccdeeeee" (one occurrence of "a" at the start of the paragraph is removed).
    actual result: "b cccccdeeeee" (all the "a"s are removed from the beginning, as if the search was restarted after each replacement).

5. Use this regex: "\bc", replace with nothing, Replace All

==> expected result: "b ccccdeeeee" (one occurrence of "c" at the start of a word is removed).
    actual result: "b deeeee" (all the "c"s are removed from the beginning of the word, as if the search was restarted after each replacement).

6. Use this regex: "e$", replace with nothing, Replace All

==> expected and actual result: "b deeee" (one occurrence of "e" at the end of the paragraph is removed).

==========

The user-visible result of "Replace All" must match the behavior of the regex "g" (global) flag: all matches are found first, then each match gets replaced. This means that in the string "aaaab cccccdeeeee", the regex "^a" matches only the one "a" character at the beginning, and the other "a" characters do not match in this particular "Replace All" operation. This must also match the behavior of the "Find All" operation: i.e., exactly the pieces selected by "Find All" must be replaced by "Replace All", and no more.

Note also another internal inconsistency: step 6 does not suffer from a similar flaw, and replaces only the one last character, not every character that becomes the last after the previous replacement.

The expected behavior can be tested e.g. at https://regex101.com/r/6beT3Y/1
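
The "find all matches first, then replace each" semantics described above can be sketched in Python. This is only an illustration using Python's `re` module, not LibreOffice code; the helper names `find_all` and `replace_all` are hypothetical, and the replacement is treated as literal text.

```python
import re


def find_all(pattern, text):
    """Return the (start, end) spans of all matches in the ORIGINAL text."""
    return [m.span() for m in re.finditer(pattern, text)]


def replace_all(pattern, repl, text):
    """Replace exactly the spans reported by find_all, and nothing else.

    Spans are applied from right to left so that earlier offsets stay valid.
    `repl` is inserted literally (no backreference handling in this sketch).
    """
    for start, end in reversed(find_all(pattern, text)):
        text = text[:start] + repl + text[end:]
    return text


if __name__ == "__main__":
    print(replace_all(r"^a", "", "aaaab cccccdeeeee"))
    print(replace_all(r"\bc", "", "b cccccdeeeee"))
    print(replace_all(r"e$", "", "b deeeee"))
```

With this approach, "Replace All" changes exactly what "Find All" would select, so the anchors are evaluated against the text as it was before any replacement.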
Comment 1 Werner Tietz 2023-02-06 11:45:20 UTC
can confirm with:
Version: 7.0.4.2
Build ID: 00(Build:2)
CPU threads: 4; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: de-DE (de_DE.UTF-8); UI: de-DE
Raspbian package version: 1:7.0.4-4+rpi1+deb11u3
Calc: threaded
Comment 2 Eike Rathke 2023-02-06 13:12:51 UTC
As that does not happen with a single Replace, it looks like Writer does a repeated replace for Replace All (in fact the dialog even says so: "Search key replaced 3 times."). So it is not directly regex related, but Writer should detect such anchored removal. Replacing ^a with a also does it only once.
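
One way to picture the repeated replace suggested in this comment is the sketch below. It is a hypothetical model written with Python's `re` module, not Writer's actual implementation: it assumes that after each replacement the search resumes at the end of the inserted text, and that anchors are evaluated against the already modified text. The function name `replace_all_repeated` is made up for illustration.

```python
import re


def replace_all_repeated(pattern, repl, text):
    """Hypothetical model: search, replace, resume right after the replacement.

    `repl` is inserted literally. Anchors such as ^ and \\b are re-evaluated
    against the modified text on every iteration.
    """
    rx = re.compile(pattern)
    pos = 0
    while pos <= len(text):
        m = rx.search(text, pos)
        if m is None:
            break
        text = text[:m.start()] + repl + text[m.end():]
        pos = m.start() + len(repl)
        if m.start() == m.end():
            pos += 1  # step past an empty match to avoid looping forever
    return text


if __name__ == "__main__":
    print(replace_all_repeated(r"^a", "", "aaaab cccccdeeeee"))
    print(replace_all_repeated(r"\bc", "", "b cccccdeeeee"))
    print(replace_all_repeated(r"e$", "", "b deeeee"))
    print(replace_all_repeated(r"^a", "a", "aaaab cccccdeeeee"))
```

Under this assumed model, the three regexes from the description give the reported actual results, including the asymmetry of "e$": after the last "e" is removed, the search resumes at the end of the paragraph, where nothing is left to match. Replacing "^a" with "a" moves the resume point past the start of the paragraph, so only one replacement happens.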
Comment 3 Buovjaga 2024-10-08 15:07:53 UTC
Still reproducible. Already present in 3.5.

Arch Linux 64-bit
Version: 25.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 40beeb144a00c9725cde4239c251f67c658d31a8
CPU threads: 8; OS: Linux 6.10; UI render: default; VCL: kf6 (cairo+wayland)
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: CL threaded
Built on 6 October 2024