Bug 105109 - Search and Replace does not find empty paragraphs in textfile with CRLF line endings
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.6.7.2 release (earliest affected)
Hardware: x86 (IA32) Linux (All)
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Find-Search
Reported: 2017-01-04 19:43 UTC by Varga Péter
Modified: 2022-06-01 06:21 UTC

See Also:
Crash report or crash signature:


Attachments
Text file with empty paragraphs and DOS line endings (24 bytes, text/plain)
2017-01-05 11:08 UTC, Buovjaga
Description Varga Péter 2017-01-04 19:43:34 UTC
Description:
Search and Replace does not find empty paragraphs in a text file with CRLF line endings. It works fine in a newly created document and when opening a file with LF-only line endings.

Steps to Reproduce:
1. Create a new txt file on Linux like:

line one
<empty line, just hit Enter>
<empty line, just hit Enter>
line two

It will most likely have LF-only line endings.
2. Load it into Writer (File/Open or New Document->Insert->Document).
3. Open Search and Replace (Ctrl-H), check regular expressions and search for ^$ (empty paragraph).
4. It works.
5. Close the file and change it to have CR/LF line endings, for example with the unix2dos command line tool.
6. Repeat steps 2-3.
7. Search phrase not found.
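
For anyone without unix2dos at hand, the two test files from the steps above can be generated with a short Python script. This is only a sketch: the file names lf.txt and crlf.txt and the helpers build_sample and write_samples are made up for illustration, and the content is assumed to be exactly the four lines shown in step 1.

```python
import os
import tempfile

# The four sample lines from step 1: two text lines around two empty ones.
LINES = ["line one", "", "", "line two"]


def build_sample(newline: bytes) -> bytes:
    """Join the sample lines, terminating each with the given line ending."""
    return b"".join(line.encode("ascii") + newline for line in LINES)


def write_samples(directory: str) -> dict:
    """Write lf.txt and crlf.txt (hypothetical names) into the directory."""
    paths = {}
    for name, newline in (("lf.txt", b"\n"), ("crlf.txt", b"\r\n")):
        path = os.path.join(directory, name)
        # Binary mode so Python does not translate the line endings.
        with open(path, "wb") as handle:
            handle.write(build_sample(newline))
        paths[name] = path
    return paths


if __name__ == "__main__":
    for name, path in write_samples(tempfile.mkdtemp()).items():
        print(name, os.path.getsize(path), "bytes:", path)
```

Open lf.txt for step 2 and crlf.txt for step 6.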

Tried with 5.2.3.2 Build: 1:5.2.3~rc2-0ubuntu1~xenial1 installed from 
deb http://ppa.launchpad.net/libreoffice/ppa/ubuntu xenial main

No idea if other regexps work.


Actual Results:  
Search and Replace does not find empty paragraphs.

Expected Results:
Search and Replace should find empty paragraphs.


Reproducible: Always

User Profile Reset: No, it was a fresh test install of Xubuntu 16.04

Additional Info:


User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/55.0.2883.87 Chrome/55.0.2883.87 Safari/537.36
Comment 1 Buovjaga 2017-01-05 11:08:08 UTC
Created attachment 130171
Text file with empty paragraphs and DOS line endings

I converted it with Kate editor Tools - End of line.

LibreOffice 3.3 actually asks me about the line endings on import, and the search works if I choose CR & LF! I wonder why this feature was removed. It is already gone in 3.6.

Bug 31480 is somewhat relevant.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.4.0.0.alpha0+
Build ID: 1a58cdf8af1aba52ce0a376666dd7d742234d7cf
CPU Threads: 8; OS Version: Linux 4.8; UI Render: default; VCL: kde4; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on January 4th 2016

Arch Linux 64-bit
Version 3.6.7.2 (Build ID: e183d5b)

Arch Linux 64-bit
LibreOffice 3.3.0 
OOO330m19 (Build:6)
tag libreoffice-3.3.0.4
Comment 2 Varga Péter 2017-01-05 20:08:44 UTC
(In reply to Buovjaga from comment #1)
I think the main concern is not the removal of the line endings dialog but the fact that the regexp search apparently depends on something it is not supposed to depend on.
Comment 3 QA Administrators 2018-01-06 03:31:17 UTC Comment hidden (obsolete)
Comment 4 Varga Péter 2018-01-06 06:24:41 UTC
As asked for in Comment 3, re-checked by repeating the steps in the original report.

Bug present in:

Latest release running on Ubuntu 17.04:
Version: 5.4.4.2
Build ID: 2524958677847fb3bb44820e40380acbe820f960
CPU threads: 1; OS: Linux 4.10; UI render: default; VCL: gtk2; 
Locale: en-US (C); Calc: group

Version: 5.4.4.2
Build ID: 2524958677847fb3bb44820e40380acbe820f960
CPU threads: 1; OS: Linux 4.10; UI render: default; VCL: gtk2; 
Locale: hu-HU (hu_HU.UTF-8); Calc: group


Release coming with Ubuntu 17.04:
Version: 5.3.1.2
Build ID: 1:5.3.1-0ubuntu2
CPU Threads: 1; OS Version: Linux 4.10; UI Render: default; VCL: gtk3; Layout Engine: new; 
Locale: hu-HU (hu_HU.UTF-8); Calc: group

Release coming with Ubuntu 16.04:
Version: 5.1.6.2
Build ID: 1:5.1.6~rc2-0ubuntu1~xenial2
CPU Threads: 4; OS Version: Linux 4.4; UI Render: default; 
Locale: hu-HU (hu_HU.UTF-8); Calc: group
Comment 5 QA Administrators 2019-01-07 03:41:43 UTC Comment hidden (obsolete)
Comment 6 QA Administrators 2021-12-19 03:48:28 UTC Comment hidden (obsolete)
Comment 7 Varga Péter 2021-12-19 07:23:44 UTC
Version: 7.2.4.1 (x64) / LibreOffice Community
Build ID: 27d75539669ac387bb498e35313b970b7fe9c4f9
CPU threads: 1; OS: Windows 10.0 Build 19043; UI render: Skia/Raster; VCL: win
Locale: en-GB (en_GB); UI: en-GB
Calc: threaded
Comment 8 Mike Kaganski 2022-06-01 06:18:56 UTC
This was not a bug at all. Putting it here just for completeness, to help archaeology.

(In reply to Buovjaga from comment #1)
> LibreOffice 3.3 actually asks me about the line endings on import, and the
> search works if I choose CR & LF! I wonder why this feature was removed. It
> is already gone in 3.6.

The default filter for TXT files is now "Text", unlike the previous default choice of "Text - Choose Encoding" (previously "Text Encoded"). The latter can be chosen manually when opening from LibreOffice's File Open dialog, to allow customizing. Compare to the long-standing request tdf#74580 to skip the dialog when opening CSV.

(In reply to Varga Péter from comment #0)
> ... search for ^$ (empty paragraph)
> 4. It works.
> 5. Close the file and change it to have CR/LF line endings, for example with
> the unix2dos command line tool.
> 6. Repeat steps 2-3.
> 7. Search phrase not found.

This should *not* work. When you open such a file without explicitly saying that CRLF is the paragraph boundary, the file is opened using the system line ending convention for paragraph boundaries. Thus, on Linux only LF is detected, and CR becomes part of the paragraph contents, so the paragraphs are not empty. They include one invisible character, which can be found using e.g. the \x0D or [:control:] regex. Simply navigating through the document with the right arrow key also shows that there is one extra character: the cursor does not move to the next paragraph immediately after reaching the end of the current one, and one more keypress is needed.
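
The effect can be illustrated outside Writer with a small Python sketch. It is not LibreOffice's import code: split_paragraphs and matching_paragraphs are hypothetical helpers, and Python's re module only stands in for Writer's regex engine. It shows that splitting CRLF text on LF alone leaves a CR in every paragraph, so ^$ finds nothing while \x0D matches.

```python
import re

# Same content as the sample in the report, with CRLF line endings.
CRLF_TEXT = "line one\r\n\r\n\r\nline two\r\n"


def split_paragraphs(text, boundary):
    """Split text into paragraphs on the given boundary (hypothetical helper)."""
    parts = text.split(boundary)
    # A trailing boundary ends the last paragraph; it does not start a new one.
    if parts and parts[-1] == "":
        parts.pop()
    return parts


def matching_paragraphs(paragraphs, pattern):
    """Return the indexes of the paragraphs in which the regex matches."""
    return [i for i, p in enumerate(paragraphs) if re.search(pattern, p)]


if __name__ == "__main__":
    as_lf = split_paragraphs(CRLF_TEXT, "\n")      # Linux default: LF only
    as_crlf = split_paragraphs(CRLF_TEXT, "\r\n")  # CRLF chosen explicitly
    print(as_lf)
    print("^$ with LF boundary:   ", matching_paragraphs(as_lf, r"^$"))
    print("\\x0D with LF boundary: ", matching_paragraphs(as_lf, r"\x0D"))
    print("^$ with CRLF boundary: ", matching_paragraphs(as_crlf, r"^$"))
```

With the LF boundary the two middle paragraphs each consist of a single CR, which is why they look empty on screen but do not match ^$.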
Comment 9 Mike Kaganski 2022-06-01 06:21:09 UTC
FTR: this normal behavior can also be reproduced on Windows, if you open a CRLF text file using the "Text - Choose Encoding" filter and choose "LF" as the paragraph break.