Bug 94810 - "Replace All" using regex gives wrong results
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: UI
Version: 4.4.2.1 rc (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Mike Kaganski
URL:
Whiteboard: haveBacktrace target:5.1.0 target:5.0...
Keywords: bisected, regression
Depends on:
Blocks:
 
Reported: 2015-10-06 01:33 UTC by Mike Kaganski
Modified: 2016-10-25 19:21 UTC
CC List: 3 users

See Also:
Crash report or crash signature:


Attachments
gdb with backtrace from raised assertion (13.51 KB, text/plain)
2015-10-06 20:48 UTC, Terrence Enger

Description Mike Kaganski 2015-10-06 01:33:28 UTC
The problem is reproducible both in Calc and in Writer.

Steps to reproduce in Calc:

1. Create a new spreadsheet.
2. In A1, enter the text "11 22 33 44 55 66" (without quotes), then press Enter.
3. Open the "Find & Replace" dialog (Ctrl+H). In "Search For", enter "([^ ]*)[ ]*([^ ]*)" (without quotes); in "Replace With", enter "$1$2" (without quotes). Check Other Options->Regular Expressions.
4. Click "Replace All".

Expected result:
A1 should contain "112233445566"

Actual result:
A1 contains 1122 33 44 5533 553344 44 5533 55334455 33 44 5533 553344 44 5533 5533445566

Doing the same in Writer gives 1122333344333344553333443333445566.
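
For comparison, the expected result can be reproduced outside LibreOffice with a regex engine that replaces matches left to right. The following is a minimal Python sketch, not LibreOffice code. Python's `re` module writes the group references as `\1\2` rather than `$1$2`, and the helper name `replace_all` is made up for illustration.

```python
import re


def replace_all(text, pattern, replacement):
    """Replace every non-overlapping match of pattern in text, left to right."""
    return re.sub(pattern, replacement, text)


# Same input, search pattern and replacement as in the steps to reproduce.
result = replace_all("11 22 33 44 55 66", r"([^ ]*)[ ]*([^ ]*)", r"\1\2")
print(result)  # prints 112233445566
```

Every match swallows the run of spaces it covers and keeps only the two captured groups, so no space survives in the output.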

Tested with 4.4.2.2 and Version: 5.0.2.2 (x64)
Build ID: 37b43f919e4de5eeaca9b9755ed688758a8251fe
Locale: ru-RU (ru_RU)
under Win7x64.

See also https://bz.apache.org/ooo/show_bug.cgi?id=107619
See also Bug 44861
Comment 1 Mike Kaganski 2015-10-06 12:14:34 UTC
Also reproducible with 4.4.2.1.

NOT reproducible with 4.4.1.2 and earlier -> regression.
Comment 2 Terrence Enger 2015-10-06 20:22:08 UTC
Working in the 50max bibisect repository, I see from `git bisect good` ...

    2cc6c03eeedc8f5f0739bcc8392cd172e9af01b8 is the first bad commit
    commit 2cc6c03eeedc8f5f0739bcc8392cd172e9af01b8
    Author: Matthew Francis <mjay.francis@gmail.com>
    Date:   Wed May 27 19:54:59 2015 +0800

        source-hash-806ced87cfe3da72df0d8e4faf5b82535fc7d1b7
    
        commit 806ced87cfe3da72df0d8e4faf5b82535fc7d1b7
        Author:     Michael Stahl <mstahl@redhat.com>
        AuthorDate: Mon Mar 9 21:32:43 2015 +0100
        Commit:     Michael Stahl <mstahl@redhat.com>
        CommitDate: Tue Mar 10 00:15:16 2015 +0100
    
            tdf#89665: i18npool: speed up TextSearch::searchForward()
    
            There does not appear to be a good reason why searchForward() needs to
            call transliterate() on the entire passed string.
    
            Restricting it to the passed range speeds it up from 104 billion to 0.19
            billion callgrind cycles when built with GCC 4.9.2 -m32 -Os.
    
            Change-Id: I440f16c34f38659b64f1eb60c50f0e414e3dfee8

    :040000 040000 220789e3c870f5ecff0c21f34b3819a18083eea1 be92315fc5a8964336fd4c06f08dc80df79eb6a3 M	opt

and from `git bisect log` ...

    # bad: [dda106fd616b7c0b8dc2370f6f1184501b01a49e] source-hash-0db96caf0fcce09b87621c11b584a6d81cc7df86
    # good: [5b9dd620df316345477f0b6e6c9ed8ada7b6c091] source-hash-2851ce5afd0f37764cbbc2c2a9a63c7adc844311
    git bisect start 'latest' 'oldest'
    # good: [0c30a2c797b249d0cd804cb71554946e2276b557] source-hash-45aaec8206182c16025cbcb20651ddbdf558b95d
    git bisect good 0c30a2c797b249d0cd804cb71554946e2276b557
    # bad: [2ce02b2ce56f12b9fcb9efbd380596975a3a5686] source-hash-17d714eef491bda2512ba8012e5b3067ca19a5be
    git bisect bad 2ce02b2ce56f12b9fcb9efbd380596975a3a5686
    # bad: [e4deb8a42948865b7b23d447c1547033cb54535b] source-hash-ce46c98dbeb3364684843daa5b269c74fce2af64
    git bisect bad e4deb8a42948865b7b23d447c1547033cb54535b
    # bad: [15e8b5cc6b4784fecd63b2a5a04ac086b3e9fc01] source-hash-26b500afcaed704db7a300836f466517c309ee77
    git bisect bad 15e8b5cc6b4784fecd63b2a5a04ac086b3e9fc01
    # skip: [534715525a93b0d7d56ba123d253c927cccf0afe] source-hash-40c9a46b78b8919aae82dd9b94774d63bb9cb4e6
    git bisect skip 534715525a93b0d7d56ba123d253c927cccf0afe
    # good: [4ef4ca77524446e7296bcb0124603c2009d2d2dc] source-hash-dedc93e973b59ca4d1660fc3820770bf9b072896
    git bisect good 4ef4ca77524446e7296bcb0124603c2009d2d2dc
    # good: [0bc507c5c8ecc463ed97e02323ba91a8bd4ab47e] source-hash-d44168795aed842d524e3a349962f2b98a8ac504
    git bisect good 0bc507c5c8ecc463ed97e02323ba91a8bd4ab47e
    # bad: [6e38abca722c608ae4771352b637cc241aa9afe1] source-hash-dd5a1ca5e476fc9f24936cc227c83a9d1aeab056
    git bisect bad 6e38abca722c608ae4771352b637cc241aa9afe1
    # bad: [170992451f280adf91ef0c580aee0945e6cc0c54] source-hash-9aae521b451269007f03527c83645b8b935eb419
    git bisect bad 170992451f280adf91ef0c580aee0945e6cc0c54
    # bad: [3b41e84d3acdb6a3ce0fe3f15ecbcf11e09d2e21] source-hash-ddc1f7d9a816e2cc970d48d2ccc2c0cd256e6e03
    git bisect bad 3b41e84d3acdb6a3ce0fe3f15ecbcf11e09d2e21
    # bad: [ac6ceff1634f768173c9c0cf87d2c633feb64067] source-hash-422fdeccd88a89461271bd6d87774a4c5015ba60
    git bisect bad ac6ceff1634f768173c9c0cf87d2c633feb64067
    # bad: [18d82af8b687c5f7643f0cf5c08e8baccf13b1e0] source-hash-15174177091367332b57cd79575e2f7dd27388b2
    git bisect bad 18d82af8b687c5f7643f0cf5c08e8baccf13b1e0
    # bad: [2cc6c03eeedc8f5f0739bcc8392cd172e9af01b8] source-hash-806ced87cfe3da72df0d8e4faf5b82535fc7d1b7
    git bisect bad 2cc6c03eeedc8f5f0739bcc8392cd172e9af01b8
    # good: [812995885044f7720fe2d68e36a1e016b2162998] source-hash-d22519f62bcd1325f1e7cc920a115b68fccd1922
    git bisect good 812995885044f7720fe2d68e36a1e016b2162998
    # first bad commit: [2cc6c03eeedc8f5f0739bcc8392cd172e9af01b8] source-hash-806ced87cfe3da72df0d8e4faf5b82535fc7d1b7
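
To illustrate the optimization described in the bisected commit message, here is a rough Python sketch, not the i18npool code. Only the requested range is transliterated and searched, and the match offsets are then shifted back into the coordinates of the full string. The function name `search_in_range`, the use of `str.lower` as a stand-in for transliteration, and the sample text are all assumptions for illustration. The sketch also assumes the transliteration preserves length; when it does not, a plain shift is not enough and a per-character offset table is needed.

```python
import re


def search_in_range(text, pattern, start, end, transliterate=str.lower):
    """Search only text[start:end]; return a span relative to the full text."""
    # Transliterate just the requested slice instead of the whole string.
    window = transliterate(text[start:end])
    match = re.search(pattern, window)
    if match is None:
        return None
    # The match offsets are relative to the slice, so map them back.
    return (match.start() + start, match.end() + start)


text = "AAA bbb CCC ddd"
span = search_in_range(text, r"ccc", 4, len(text))
print(span, text[span[0]:span[1]])  # prints (8, 11) CCC
```

Dropping the `+ start` shift would report the span (4, 7), which points at "bbb" instead of "CCC".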
Comment 3 Terrence Enger 2015-10-06 20:48:54 UTC
Created attachment 119371
gdb with backtrace from raised assertion

Both daily dbgutil repository versions, 2015-05-20 and 2015-10-06,
failed with SIGABRT as I was testing <Replace All> in Calc.

On the assumption that my raised assertion is closely related to the
incorrect results from <Replace All>, I attach this gdb run from a
locally built commit 778216d, fetched 2015-09-21 02:45 UTC, configured with

    CC=ccache /home/terry/lo_hacking/associated/gcc/bin/gcc
    CXX=ccache /home/terry/lo_hacking/associated/gcc/bin/g++
    --enable-option-checking=fatal --enable-dbgutil --enable-crashdump
    --without-system-postgresql --without-myspell-dicts
    --with-extra-buildid --without-doxygen
    --with-external-tar=/home/terry/lo_hacking/git/src
    --disable-gstreamer-1-0 --enable-gstreamer-0-10 --disable-gtk3

built on Debian wheezy with a locally built gcc 5.2.0 and its libraries,
running in an environment chroot'd to debian-sid.
Comment 4 Terrence Enger 2015-10-06 20:50:51 UTC
Adding haveBacktrace in whiteboard.
Comment 5 Mike Kaganski 2015-11-08 12:28:06 UTC
Terrence Enger, your comment 3 is absolutely right: the raised assertion is related to the issue. Thank you!

Sent a patch to gerrit:
https://gerrit.libreoffice.org/19840
Comment 6 Commit Notification 2015-11-23 19:05:04 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4cf1d290bab29e18e1312b63ff862f5102e00387

tdf#94810: fix reverse offset mapping

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Commit Notification 2015-11-23 19:14:38 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=67ab2ce3c6fed2ceaacaad890a7d8683ce0397a7

remove comment that makes no sense, tdf#94810 follow-up

It will be available in 5.1.0.

Comment 8 Commit Notification 2015-11-23 20:35:31 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=ce91f3c1292f3e9b84157acf10b67ad9ca16719d

similar to searchForward() use the correct offsets, tdf#94810 related

It will be available in 5.1.0.

Comment 9 Commit Notification 2015-11-24 10:59:37 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-5-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=028d7d8476980988e382d7b2fc0782946fbac740&h=libreoffice-5-0

tdf#94810: fix reverse offset mapping

It will be available in 5.0.4.

Comment 10 Commit Notification 2015-11-25 15:07:20 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4dd2d40673299966ad639d799e925e64ae5560cf

regex result offsets can be negative if a group was not matched, tdf#94810

It will be available in 5.1.0.
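
The commit title refers to a general property of regex engines: a capture group that did not take part in the match reports a negative offset. Below is a minimal Python sketch of the guard this calls for, using Python's `re` module as a stand-in for the engine LibreOffice uses. The names `group_spans` and `base` are hypothetical, and this is not the committed patch.

```python
import re


def group_spans(match, base):
    """Return absolute (start, end) per group, or None for unmatched groups."""
    spans = []
    for index in range(match.re.groups + 1):
        start, end = match.span(index)
        if start < 0:
            # The group did not participate; do not shift the -1 marker.
            spans.append(None)
        else:
            spans.append((start + base, end + base))
    return spans


# Only the second alternative matches, so group 1 stays unmatched.
match = re.search(r"(a)|(b)", "xxb")
print(group_spans(match, 10))  # prints [(12, 13), None, (12, 13)]
```

Shifting the `-1` of the unmatched first group by `base` would have produced the bogus span (9, 9).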

Comment 11 Commit Notification 2015-11-25 15:43:45 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-5-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6750f5ad6515ea83a2c22c9af89be570abc3aecd&h=libreoffice-5-1

regex result offsets can be negative if a group was not matched, tdf#94810

It will be available in 5.1.0.1.

Comment 12 Commit Notification 2015-11-26 21:04:29 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "libreoffice-5-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=a38963a40eed7f1d85767b393a316870c31bff5c&h=libreoffice-5-0

regex result offsets can be negative if a group was not matched, tdf#94810

It will be available in 5.0.4.

Comment 13 Terrence Enger 2015-12-05 00:17:44 UTC
I see the expected results with daily dbgutil bibisect version
2015-12-04.  Thank you, Eike.