Bug 153738 - Large text file - search and replace extremely slow
Summary: Large text file - search and replace extremely slow
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.3.2.2 release
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-02-19 15:56 UTC by P Cunningham
Modified: 2023-02-19 22:22 UTC

See Also:
Crash report or crash signature:


Attachments
Text data from a dos database of image information in 8 fields for each record (244.01 KB, application/vnd.oasis.opendocument.text)
2023-02-19 15:56 UTC, P Cunningham

Description P Cunningham 2023-02-19 15:56:21 UTC
Created attachment 185474
Text data from a dos database of image information in 8 fields for each record

I am trying to import data (over 30,000 records) from an old DOS system. I have exported the data into a text file from within the DOS programme, with a paragraph separator; the export options are limited and none of the others will work. So the end-of-field and end-of-record separators are both $ (or \n). The file opens OK in Notepad.

I need to import into Calc, but I have to search and replace in Writer to organise the data into an importable form. Each record begins with '10' which doesn't appear after a line break anywhere else, so I search for ^10 and replace with @@10, to identify all the end-of-records. Then I want to replace the remaining end of paragraphs with tabs, so I search for $ and replace with \t. This was taking an immense time - hours - so I force-quit and instead searched for $ and replaced with %% to enable me to later replace %% with \t. I was of course using regular expressions.

So now I want to search for @@ and replace with \n. But again this is taking hours.
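
For comparison, the same reshaping can be done outside Writer in one pass over the text. The following is only a minimal sketch under assumptions taken from the description above: every field is on its own line, and a line beginning with 10 starts a new record. The function name records_to_tsv and the sample values are hypothetical and do not come from the attached file.

```python
def records_to_tsv(text: str, record_start: str = "10") -> str:
    """Turn one-field-per-line text into one tab-separated line per record.

    Assumption: a line that begins with `record_start` opens a new record,
    and every following line is a further field of that record.
    """
    records: list[list[str]] = []
    for line in text.splitlines():
        # Open a new record at each marker line (or at the very first line,
        # so that text before the first marker is not lost).
        if line.startswith(record_start) or not records:
            records.append([])
        records[-1].append(line)
    # Fields are joined with tabs, records with newlines.
    return "\n".join("\t".join(fields) for fields in records)


if __name__ == "__main__":
    # Hypothetical sample: two records of three fields each.
    sample = "10\nA001\nsunset.jpg\n10\nA002\nharbour.jpg\n"
    print(records_to_tsv(sample))
```

Unlike the three search-and-replace steps, this leaves no trailing tab at the end of each record. The result can be saved as a text file and opened in Calc with tab as the field separator.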

I am attaching a file which is only a quarter of the data.

I have also tried using LO 7.5.0.3 on a different machine.
Comment 1 P Cunningham 2023-02-19 16:29:44 UTC
I have now resolved my problem by importing a CSV file from the old programme direct into Calc. However, it should have been possible to do what I was trying to do in Writer.

I have no objection to this being marked as Resolved, but you might want to experiment with the file I sent to identify the issue in Writer.
Comment 2 m_a_riosv 2023-02-19 22:22:04 UTC
Maybe the issue is that each $ substitution forces the text position to be recomputed for the whole document. With 55015 ends of paragraph in the file, that would take a lot of time.
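
To put a rough number on that hypothesis, here is a crude cost model. It is only an illustration, not a description of how Writer actually works: it assumes that every replacement revisits every paragraph, and it ignores that the paragraph count shrinks as paragraph ends are replaced. The function name paragraph_visits is hypothetical.

```python
def paragraph_visits(replacements: int, paragraphs: int, relayout_each_time: bool) -> int:
    """Estimate how many paragraphs get visited during a replace-all.

    Assumption: if the layout is redone after every replacement, each
    replacement visits every paragraph once; otherwise the document is
    laid out a single time after all replacements are done.
    """
    if relayout_each_time:
        return replacements * paragraphs
    return paragraphs


if __name__ == "__main__":
    n = 55015  # ends of paragraph counted in the attached file
    print(paragraph_visits(n, n, relayout_each_time=True))
    print(paragraph_visits(n, n, relayout_each_time=False))
```

Under these assumptions, the per-replacement case comes to about 3 billion paragraph visits, against 55015 for a single pass, which would be consistent with a run time of hours rather than seconds.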