Bug 43107 - Regular expression "\n" in replace field inputs "\n" instead of line(paragraph?) break
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: target:7.5.0
Keywords:
Duplicates: 43086 64144 152838
Depends on:
Blocks: 38261 Find-Search
Reported: 2011-11-20 05:20 UTC by Chris Peñalver
Modified: 2023-09-18 09:42 UTC (History)
CC List: 15 users

See Also:
Crash report or crash signature:


Description Chris Peñalver 2011-11-20 05:20:33 UTC
Downstream bug may be found at:
https://bugs.launchpad.net/ubuntu/+source/libreoffice/+bug/892476

1) lsb_release -rd
Description:	Ubuntu 11.10
Release:	11.10

2) apt-cache policy libreoffice-calc
libreoffice-calc:
  Installed: 1:3.4.4-0ubuntu1~ppa1
  Candidate: 1:3.4.4-0ubuntu1~ppa1
  Version table:
 *** 1:3.4.4-0ubuntu1~ppa1 0
        500 http://ppa.launchpad.net/libreoffice/ppa/ubuntu/ oneiric/main i386 Packages
        100 /var/lib/dpkg/status
     1:3.4.3-3ubuntu2 0
        500 http://us.archive.ubuntu.com/ubuntu/ oneiric/main i386 Packages

3) What is expected to happen in LibreOffice Calc: when a cell contains a line break, open Edit -> Find and Replace..., enter \n in the "Search for" drop-down, enter \n in the "Replace with" drop-down, and check "Regular expressions". The line break should then be replaced with a paragraph break, as noted in http://help.libreoffice.org/Common/List_of_Regular_Expressions

4) What happens instead is the line break is replaced with the characters \n
Comment 1 Chris Peñalver 2011-11-20 05:21:38 UTC
*** Bug 43086 has been marked as a duplicate of this bug. ***
Comment 2 Winfried Donkers 2012-05-06 23:30:22 UTC
I think this is a duplicate of bug 44398, i.e. isn't this a more general problem than just \n?

(behaviour confirmed with version 3.5.3 release, details entered in a comment on bug 44398)

*** This bug has been marked as a duplicate of bug 44398 ***
Comment 3 Chris Peñalver 2012-05-06 23:51:50 UTC
Winfried Donkers, I do not agree with this being a duplicate of bug 44398, nor do I agree with bug 44398 being accepted as a valid report for that matter based on the title.

Many, purposefully or ignorantly, file wide-scoped "fix everything about a feature set" reports, as the title of 44398 and your subsequent comment https://bugs.freedesktop.org/show_bug.cgi?id=44398#c2 do. This makes a report less likely to be resolved quickly.

However, this (43107) report is targeted, specific, and detailed.

Please do not toggle this report further unless you are submitting a patch. Thank you.
Comment 4 Cor Nouws 2012-10-06 21:02:53 UTC
In 3.5.7 rc1 or 3.6.2.2 use "\n\n" in the replace field.
That works for me.
Can you pls check that?
(I remember there was a change in this in the late OOo times, but I can't find the details now).

I intend to close this one soon
Comment 5 Alex Willmer 2012-10-08 14:08:56 UTC
(In reply to comment #4)
> In 3.5.7 rc1 or 3.6.2.2 use "\n\n" in the replace field.
> That works for me.
> Can you pls check that?

In Calc 3.6.2.2 (Build ID: da8c1e6) on Windows 7 64-bit, if I
 - use "\n\n" (excluding quotes) in the replace field
 - Click Replace all

Calc inserts those literal characters into the cell(s), not newline characters. Based on that I don't think this bug is fixed.
Comment 6 cpohle 2012-11-06 13:58:49 UTC
Using Version 3.6.3.2 on Mac OSX, \n is inserted as literal text (i.e., not as a newline) as well.
Comment 7 Johnny Baloney 2013-03-22 17:50:12 UTC
Version 3.6.5.2 (Build ID: 5b93205) on Linux. Same problem - neither \n nor \n\n work.
Comment 8 Jorendc 2013-07-09 10:18:39 UTC
*** Bug 64144 has been marked as a duplicate of this bug. ***
Comment 9 Nemo_bis 2014-11-01 21:51:51 UTC
Confirmed in 4.2.6.3 on Fedora (Build ID: 4.2.6.3-8.fc20).
Comment 10 Patrick Smits 2015-09-17 15:45:36 UTC
This bug still exists in LO 5.0.1.2 on Windows 7.

It's a really painful bug, since manually entering cells, going to the right point and pressing Alt-Enter multiple times per cell is very time-consuming, error-prone and above all very dull ;-)
Comment 12 DN 2017-09-02 15:49:31 UTC
Yes, it's still present, 5.3.6 on Fedora.
Comment 14 DN 2018-09-03 11:34:46 UTC
Still present in 6.0.6.2.
Comment 15 Ed Santiago 2018-09-03 23:22:33 UTC
Confirming: still present in 6.0.6.2 on Gentoo
Comment 16 Cor Nouws 2018-09-07 15:23:39 UTC
Hmm - does anyone have a test document, with a cell containing a line break and one with a new paragraph?
Comment 17 DN 2018-09-07 15:45:47 UTC
> does anyone have a test document

You can't reproduce?

Steps:

* Create a blank sheet
* Double-click on a cell to inline-edit it
* Enter two-line content (e.g. "first line", <Ctrl>-<Enter>, "last line") and hit <Enter>
* Open find/replace (Ctrl-H)
* Expand "other options" and tick "regular expressions"
* In find, enter "\n" without quotation marks
* In replace, enter "\nmiddle line\n"
* Click "Replace all"

Expected:

* Cell contains three lines: "first line", "middle line", "last line"

Actual:

* Cell contains one line: "first line\nmiddle line\nlast line" (literal string)
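
The difference between the expected and actual results can be reproduced outside LibreOffice. The following is only a sketch using Python's `re` module, not LibreOffice's own search engine, and the variable names are made up for illustration. Python happens to treat `\n` in a replacement template as a newline, which matches the expected result; passing a function as the replacement skips that processing and mimics the reported behaviour.

```python
import re

# Two-line cell content, as produced by the steps above.
cell = "first line\nlast line"

# Expected: the \n escapes in the replacement template become real newlines.
expected = re.sub(r"\n", r"\nmiddle line\n", cell)

# Actual (as reported): the replacement is inserted verbatim, backslash included.
# A function replacement bypasses Python's template escape processing.
actual = re.sub(r"\n", lambda m: r"\nmiddle line\n", cell)

print(expected.splitlines())  # ['first line', 'middle line', 'last line']
print(repr(actual))           # 'first line\\nmiddle line\\nlast line'
```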
Comment 18 Cor Nouws 2018-09-07 16:03:24 UTC
(In reply to DN from comment #17)

> * Create a blank sheet
> * Double-click on a cell to inline-edit it
> * Enter two-line content (e.g. "first line", <Ctrl>-<Enter>, "last line")
> and hit <Enter>

Help reads: "\n is for line end entered with Shift+Enter"

Hence I do not have a file to test this exactly..

> * Cell contains one line: "first line\nmiddle line\nlast line" (literal
> string)

I see that too, of course, but..
Comment 19 Cor Nouws 2018-09-07 16:07:35 UTC
behavior was the same in version 3.3.0 and is in OpenOffice.
So, is this really a bug, or something that behaves different than expected??
Comment 20 DN 2018-09-07 16:25:04 UTC
> Help reads: "\n is for line end entered with Shift+Enter"

(for anyone else reading - the help page by clicking Help on the search/replace window and clicking "List of Regular Expressions")

This is a documentation bug. This refers to Writer only, not Calc.

In *Writer*, <Enter> creates a "Paragraph break", whereas <Shift><Enter> creates a "line break". Search/replace works as per the "List of Regular Expressions" docs in Writer, specifically:

> \n
> Represents a line break that was inserted with the Shift+Enter key combination. To change a line break into a paragraph break, enter \n in the Find and Replace boxes, and then perform a search and replace.
> \n in the Find text box stands for a line break that was inserted with the Shift+Enter key combination.
> \n in the Replace text box stands for a paragraph break that can be entered with the Enter or Return key.
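
As a toy illustration of what the quoted help text describes for Writer, here is a sketch in Python. The model (a document as a list of paragraph strings, with "\n" standing for a Shift+Enter line break) and the function name are assumptions made for this example only; this is not how Writer stores text.

```python
def line_breaks_to_paragraphs(paragraphs):
    """Split every paragraph at its line breaks, giving one paragraph per line."""
    result = []
    for paragraph in paragraphs:
        # Each "\n" (line break) becomes a paragraph boundary.
        result.extend(paragraph.split("\n"))
    return result

doc = ["This is line one\nand this is line two", "A second paragraph"]
print(line_breaks_to_paragraphs(doc))
# ['This is line one', 'and this is line two', 'A second paragraph']
```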



> Hence I do not have a file to test this exactly..

The steps I listed result in a doc to test this.



> behavior was the same in version 3.3.0 and is in OpenOffice.
> So, is this really a bug, or something that behaves different than expected??

It's a bug. The content of "replace" should be interpreted as a regex (i.e. \n should be interpreted as a newline, not a literal string). It may well have always been a bug in Open/LibreOffice.
Comment 21 Cor Nouws 2018-09-07 20:50:28 UTC
(In reply to DN from comment #20)
> > Help reads: "\n is for line end entered with Shift+Enter"
> 
> (for anyone else reading - the help page by clicking Help on the
> search/replace window and clicking "List of Regular Expressions")
> 
> This is a documentation bug. This refers to Writer only, not Calc.

There is more. Who says Ctrl+Enter creates a new line in a Calc text cell?

Unzip a Calc file with a 'new line', and you see:
  <text:p>text on line one</text:p><text:p>And this on line too</text:p>

Unzip a Writer file with a 'new line', and you see:
  <text:p text:style-name="P1">This is line one<text:line-break/>and this is line two</text:p>

So, although the Calc cell UI does not know the paragraph concept, the xml file does.
Thus the question still stands: what are the bugs, and/or not yet implemented features? (see below)
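
The structural difference between the two fragments above can be checked mechanically. This is a minimal sketch in Python: the helper name `count_breaks` and the dummy `<root>` wrapper are assumptions for illustration, and the fragments are the ones quoted above rather than a full content.xml.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the text: prefix in ODF documents.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

def count_breaks(fragment):
    """Return (paragraph count, line-break count) for an ODF text fragment."""
    # Wrap the fragment in a dummy root that declares the text: prefix.
    root = ET.fromstring(f'<root xmlns:text="{TEXT_NS}">{fragment}</root>')
    paragraphs = root.findall(f".//{{{TEXT_NS}}}p")
    line_breaks = root.findall(f".//{{{TEXT_NS}}}line-break")
    return len(paragraphs), len(line_breaks)

calc_cell = "<text:p>text on line one</text:p><text:p>And this on line too</text:p>"
writer_par = ('<text:p text:style-name="P1">This is line one'
              "<text:line-break/>and this is line two</text:p>")

print(count_breaks(calc_cell))   # (2, 0): two paragraphs, no line break
print(count_breaks(writer_par))  # (1, 1): one paragraph with a line break
```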
 
> > Hence I do not have a file to test this exactly..
> 
> The steps I listed result in a doc to test this.

They don't - hence my request ;)
 
> > behavior was the same in version 3.3.0 and is in OpenOffice.
> > So, is this really a bug, or something that behaves different than expected??
> 
> It's a bug. The content of "replace" should be interpreted as a regex (i.e.
> \n should be interpreted as a newline, not a literal string). It may well
> have always been a bug in Open/LibreOffice.

No doubt that I agree, that one would expect that Find & Replace replaces 'line breaks'. In any case, if there is no line break, a result 'not found' would be more appropriate.

So bugs/problems are:
- a 'new line' in Calc text cells, produces a new paragraph;
  - it makes sense that, since Shift+Enter does the opposite of Enter,
   Ctrl+Enter is used for the line break (new line/paragraph)
  - maybe it should really create a line break?
- Find & Replace should either report: no new lines found; or behave different in Calc text, handling paragraphs as if it were line breaks;
- documentation is not clear/complete.

adding @eike and @regina to cc.
Comment 22 DN 2018-09-07 21:30:45 UTC
(In reply to Cor Nouws from comment #21)
> There is more. Who says Ctrl+Enter creates a new line in a Calc text cell?

It does. Did you test and see what happens?
 
> Unzip a Calc file with a 'new line', and you see:
>   <text:p>text on line one</text:p><text:p>And this on line too</text:p>
> 
> Unzip a Writer file with a 'new line', and you see:
>   <text:p text:style-name="P1">This is line one<text:line-break/>and this is
> line two</text:p>

This isn't relevant to whether this is a bug. What's relevant is what the user actually sees/experiences. Which is a newline. Which is *also* what's in the docs: https://help.libreoffice.org/Common/Inserting_Line_Breaks_in_Cells

> So, although the Calc cell UI does not know the paragraph concept, the xml
> file does.

Unless it's a fundamental limitation of ODS, this is also irrelevant in terms of this being a bug, and just means there is a separate bug with saving to ODS. Even in the case of a fundamental limitation of the format, it doesn't negate this bug; it might make it CANTFIX at most.

> Thus the question still stands: what are the bugs, and/or not yet
> implemented features? (see below)
>  
> > > Hence I do not have a file to test this exactly..
> > 
> > The steps I listed result in a doc to test this.
> 
> They don't - hence my request ;)

Your previous comments made no mention of my steps not working. Please tell me how the outcome when you tried differed from my description.

> > > behavior was the same in version 3.3.0 and is in OpenOffice.
> > > So, is this really a bug, or something that behaves different than expected??
> > 
> > It's a bug. The content of "replace" should be interpreted as a regex (i.e.
> > \n should be interpreted as a newline, not a literal string). It may well
> > have always been a bug in Open/LibreOffice.
> 
> No doubt that I agree, that one would expect that Find & Replace replaces
> 'line breaks'. In any case, if there is no line break, a result 'not found'
> would be more appropriate.

Covered above. Ctrl-Enter = line break per docs. Displays as line break. User wants a line break. Document saving bugs are irrelevant, unless they're the *cause* of this bug. A "not found" would be adding another effective bug (in terms of intended/documented behaviour) to align with other buggy behaviour elsewhere.

> 
> So bugs/problems are:
> - a 'new line' in Calc text cells, produces a new paragraph;

Separate bug re. ODS and maybe internal state.

Additionally, the *find* part works fine. The *replace* part is this bug.

>   - it makes sense that, since Shift+Enter does the opposite of Enter,
>    Ctrl+Enter is used for the line break (new line/paragraph)

It effectively does now (user experience/documentation). Again, nothing to do with this bug, which is about *replacement*.

>   - maybe it should really create a line break?

Not directly relevant, might be a dependency for this bug - see above.

> - Find & Replace should either report: no new lines found; or behave
> different in Calc text, handling paragraphs as if it were line breaks;

Or Ctrl-Enter should just actually insert a line break per the docs - possibly a separate bug. Again, this bug is about *replacement* being literal \n instead of an (effectively) line break.

> - documentation is not clear/complete.

Possibly, but tangential.

> 
> adding @eike and @regina to cc.
Comment 24 DN 2019-10-27 15:31:06 UTC
Still present, 6.2.8.2.
Comment 25 Mike Kaganski 2020-05-13 10:14:19 UTC
Documentation bug fixed in https://git.libreoffice.org/help/+/55d4e405d8dbdf58ba45823b3895e8a79e5a8aed.
Comment 26 Mike Kaganski 2022-10-10 12:29:11 UTC
(In reply to DN from comment #20)
> The content of "replace" should be interpreted as a regex (i.e.
> \n should be interpreted as a newline, not a literal string).

While I am *not* saying that tdf#43107 is not a bug, I want to stress that it's incorrect to consider the *replacement* string a "regex" - it never is. Only the search string is a regex. The replacement string is a special string that, as per the documentation [1], may contain references.

We have an extension of that syntax, and there is bug 106137 to extend it further. I would argue for consistency: exactly because a newline in a Calc cell inserts *paragraphs* (visible not only in the file format, but also in the API; and that is not a bug), the \n in the replacement box should behave *consistently* with Writer, where it inserts paragraphs.

Just don't say that the replacement string is a "regex" :)

[1] https://unicode-org.github.io/icu/userguide/strings/regexp.html#find-and-replace
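
To illustrate the distinction between a search regex and a replacement string, here is a small sketch using Python's `re` module. This is an assumption-laden stand-in, not ICU: ICU writes capture group references as `$1`, while Python writes them as `\1`, and the sample strings are invented for the example.

```python
import re

text = "a.c abc"

# In the search pattern, "." is a metacharacter that matches any character.
matches = re.findall(r"a.c", text)

# In the replacement, "." is plain text; only references such as \1 and \2
# are special. Here the two captured groups are swapped around a literal dot.
swapped = re.sub(r"(\w+)\.(\w+)", r"\2.\1", "a.c")

print(matches)  # ['a.c', 'abc']
print(swapped)  # c.a
```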
Comment 27 Commit Notification 2022-10-10 15:08:23 UTC
Eike Rathke committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/help/commit/00a07b02f82b2177c6d1ad90168528ca9eb73be8

Related: tdf#43107 Clarify \n in Find and Replace
Comment 28 Rainer Bielefeld Retired 2023-01-02 15:21:16 UTC
*** Bug 152838 has been marked as a duplicate of this bug. ***
Comment 29 Chris Peñalver 2023-02-26 17:09:34 UTC
As the original reporter, I accept that this has been root-caused to a documentation issue, now fixed by:
https://bugs.documentfoundation.org/show_bug.cgi?id=43107#c27

I wouldn't have filed this report if it had been documented as such to begin with.