Bug 49848 - FILEOPEN: Worse-than Linear Performance Degradation Opening Change-Tracked ODTs
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 3.5.3 release (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: BSA
Keywords: perf
Depends on:
Blocks: Track-Changes
 
Reported: 2012-05-12 11:45 UTC by orcmid
Modified: 2023-02-27 03:20 UTC (History)

See Also:
Crash report or crash signature:


Attachments
Spreadsheet of Timing Tests showing degradation with document growth (15.32 KB, application/vnd.oasis.opendocument.spreadsheet)
2012-05-12 11:45 UTC, orcmid
Additional Test Result (31.50 KB, application/vnd.oasis.opendocument.spreadsheet)
2015-01-05 22:16 UTC, orcmid

Description orcmid 2012-05-12 11:45:00 UTC
Created attachment 61518 [details]
Spreadsheet of Timing Tests showing degradation with document growth

Problem description: 
SUMMARY

There is a worse-than-linear decrease in document opening and
saving performance as additional tracked changes accumulate over
a progression of draft changes to an original document.

At some point, the degradation is so bad that a user is
likely to assume that the software has hung and is failing to
open the document.  On machines slower than the one the
documents were created on, the delay can run to hours rather
than minutes.

TEST DOCUMENTS

There are five test documents, WD03a, WD03b, WD03c, WD04a, and 
WD03x.

They are all available here:
<http://tools.oasis-open.org/version-control/svn/oic/TestSuite/trunk/odf12/ChangeTrackingResilience/>.

If you want to know what WD03c is supposed to look like, there 
is a PDF available here:
<http://www.oasis-open.org/apps/org/workgroup/office/document.php?document_id=45946>.
It is a large file, but it opens quickly in Acrobat.

To know what the 223 tracked changes are, you can also check
Section 2 of the smaller file available here:
<http://www.oasis-open.org/apps/org/workgroup/office/document.php?document_id=45936>.
It is an ODF Text (.ODT) file.

DEMONSTRATION OF THE WORSE-THAN LINEAR DEGRADATION

The defect is demonstrated by timed opening of 4 documents 
that have an increasing number of tracked changes.

 * WD03a is 476kB and has 169 changes.  It opens in around 15
   seconds on a fast system.

 * WD03b is 746kB and the number of changes is raised to 207.  
   It takes a few minutes to open the document (roughly 16x as 
   long as for WD03a on a fast system).

 * WD03c is 1,132kB and it has 223 changes.  It takes roughly
   4x as long as WD03b.  On a slower Windows XP SP3 x86 system,
   it takes more than an hour to open the document.

 * WD04a is 1,343kB although it has no more tracked changes
   and was only updated enough to start a new working draft
   set.  Yet it is about 200kB larger and it takes almost double
   the time of WD03c.  On the slowest system
   used, it takes 2.5 hours.
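
One rough way to put a number on "worse than linear" is the apparent scaling exponent between consecutive documents, log(time ratio) / log(size ratio).  The sketch below uses only the file sizes and the relative timings quoted in the list above (roughly 16x, then roughly 4x).  The helper name apparent_exponent is purely illustrative, and since the inputs are stopwatch-level estimates the result is an order-of-magnitude indication at best.

```python
import math

# File sizes (kB) and opening time relative to WD03a, from the list above:
# WD03b takes roughly 16x WD03a; WD03c takes roughly 4x WD03b.
docs = [
    ("WD03a", 476, 1.0),
    ("WD03b", 746, 16.0),
    ("WD03c", 1132, 64.0),
]

def apparent_exponent(size_a, time_a, size_b, time_b):
    """Exponent k such that time grows like size**k between two samples."""
    return math.log(time_b / time_a) / math.log(size_b / size_a)

exponents = []
for (name_a, size_a, t_a), (name_b, size_b, t_b) in zip(docs, docs[1:]):
    k = apparent_exponent(size_a, t_a, size_b, t_b)
    exponents.append(k)
    print(f"{name_a} -> {name_b}: apparent exponent {k:.1f}")
```

An exponent of 1 would correspond to linear growth with file size; both steps come out well above that.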

THE SPREADSHEET (attached) provides timing statistics and the
different configurations and software releases on which
measurements were captured.
Comment 1 orcmid 2012-05-12 11:47:29 UTC
SPREADSHEET DETAILS 

THIS ODF SPREADSHEET provides more data points with regard
 to the timings and the different configurations and software releases tested.

 Different releases of LibreOffice are employed, depending on what was
 handy for testing with different platforms.

 When available, I provide timing tests with OpenOffice.org 3.3.0 as well.
 This is to provide a baseline and confirm that the problem has existed since
 at least that release of OpenOffice.org.

NOTE: THE MEASURED TIMINGS ARE NOT SUITABLE FOR COMPARISONS BETWEEN PRODUCTS.
  These timings were determined manually with a stop watch.  The conditions
  were not carefully controlled and the typical variances related to
  configuration differences, background activity, and system state are too
  high.
     The sole purpose of the timings is to demonstrate that the degradation
  of performance is consistent and predictable across all OpenOffice-
  lineage software.  The variance between releases is negligible compared to
  the major source of degradation.

CONFIGURATIONS

Astraendo is a Dell XPS 9100 with Windows 7 en_US x64, 18GB RAM, and an
Intel i7-980x 3.33GHz 6-core processor.

Quadro is a Toshiba Satellite Tablet PC with Windows XP SP3, 1.5GB RAM, and
an Intel Pentium M (Celeron) 1.7GHz processor.

VVM is Vista Ultimate en_US x86 running in Virtual PC on Astraendo

Win8CP64 is Windows 8 Customer Preview en_US x64 running in VirtualBox on
Astraendo

Zorin Core is Zorin OS (Core, Debian/Ubuntu) x64 running in VirtualBox on
Astraendo

Zorin Edu is Zorin OS (Edu, Debian/Ubuntu) x86 running in VirtualBox on
Astraendo

SPECIAL CASES

Package releases are those provided in a distribution.  These are not from
LibreOffice download sites and apparently there are odd failure cases with
those.

Open Fail means that the document went through all of the slow opening
process and when it appeared to be ending, the application simply closed
without the document ever being shown.

Close crash means that there was a crash on closing the document in the
application, with a report that the software had not closed properly and
work might have been lost.  The document was fine (it had not been touched)
but the lock file was still present in the file system.
Comment 2 orcmid 2012-05-12 12:55:37 UTC
This defect is apparently in common code inherited from OpenOffice.org in both LibreOffice and Apache OpenOffice: 
https://issues.apache.org/ooo/show_bug.cgi?id=119341

It appears that symptoms of this problem have been identified as far back as OpenOffice.org 1.0:
https://issues.apache.org/ooo/show_bug.cgi?id=29842
There are potentially multiple defects behind these: some related to change-tracking processing on open (and on autosave and save, though those are not quite as slow), and some related to bloating of the .ODT file for no apparent reason.

All of the ODF Text documents, WD03a, ..., WD04a were produced with LibreOffice 3.3.2, the software being used for an ODF maintenance activity (no changes to the software are made during the project to avoid regressions).

The original document with which editing began is the ODF Text for the OASIS OpenDocument Format 1.1 Standard.  This document was produced by generator 
"StarOffice/8$Solaris_Sparc OpenOffice.org_project/680m5$Build-9114".
Comment 3 Michael Stahl (allotropia) 2012-09-13 13:09:09 UTC
the change tracking implementation in Writer is and always has been
embarrassing, nothing new here.  SwDoc::AppendRedline is at least
O(n^2), perhaps worse (fortunately i haven't looked at it lately).

file import isn't actually the worst performance problem;
finding that is easy, just grep for

  //JP 06.01.98: MUSS noch optimiert werden!!!
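
To illustrate why a routine that costs O(n) per call makes loading n redlines O(n^2) overall, here is a toy model.  It is not the Writer code: append_linear is a hypothetical stand-in that scans a sorted list to find the insertion point, and that scan is only an assumption about the kind of per-redline work SwDoc::AppendRedline might do.

```python
def append_linear(table, pos):
    """Insert pos into the sorted list `table` via a linear scan.

    Returns the number of comparisons that succeeded before the
    insertion point was found (a proxy for the work done).
    """
    steps = 0
    i = 0
    while i < len(table) and table[i] <= pos:
        steps += 1
        i += 1
    table.insert(i, pos)
    return steps

def total_steps(n):
    """Total comparisons when n redlines arrive in document order."""
    table = []
    return sum(append_linear(table, pos) for pos in range(n))

for n in (100, 200, 400):
    print(n, total_steps(n))
```

In this model the total is n(n-1)/2, so doubling the number of redlines roughly quadruples the work.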
Comment 4 QA Administrators 2015-01-05 17:51:37 UTC Comment hidden (obsolete)
Comment 5 orcmid 2015-01-05 22:16:55 UTC
Created attachment 111796 [details]
Additional Test Result

(In reply to QA Administrators from comment #4)
[ ... ]
> If the bug is present, please leave a comment that includes the version of
> LibreOffice and your operating system, and any changes you see in the bug
> behavior
[ ... ]

The bug remains present.  

Machine: Dell XPS with Windows 8.1 Professional
LibreOffice: 4.3.5.2

In this comparison, the times seem longer than previously, but the key point is that the non-linearity has not changed.  The difference in timings may be due to differences in configuration and the absence of controlled timing, so these should not be viewed as absolute comparisons.

The new attachment is an ODS file that adds the test runs just made.
Comment 6 QA Administrators 2016-11-08 10:34:31 UTC Comment hidden (obsolete)
Comment 7 NISZ LibreOffice Team 2021-02-26 12:18:36 UTC
This looks much better in bibisect-7.2, yet the basic pattern is still here.

Measuring on my not-so-new work laptop (Win10, i5-4310M @ 2.7GHz) with

  time OOO_EXIT_POST_STARTUP=1 instdir/program/swriter.exe $example.odt

I got the following values (as a rough average of 3 consecutive measurements).
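
For anyone wanting to repeat this without a stopwatch, a minimal sketch of the same measurement in Python follows.  The function name average_wall_time and the paths in the example are placeholders, not part of any LibreOffice tooling; the only thing taken from the command above is the OOO_EXIT_POST_STARTUP=1 environment variable.

```python
import os
import subprocess
import time

def average_wall_time(cmd, runs=3, extra_env=None):
    """Run cmd `runs` times and return the mean wall-clock time in seconds."""
    env = dict(os.environ, **(extra_env or {}))
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True)
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)

# Example call (paths are placeholders for a local build and test document):
# average_wall_time(["instdir/program/swriter.exe", "WD03a.odt"],
#                   extra_env={"OOO_EXIT_POST_STARTUP": "1"})
```

Wall-clock time is used because that is what the user experiences; it includes application startup, so very short opens will be dominated by that overhead.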

For comparison I copied values from the first attachment here:

                  WD03a   WD03b   WD03c   WD04a   WD03x
LO 7.2            00:12   00:25   01:31   02:26   00:11
Astraendo 3.3.2   10s     2m29s   10m09s  15m40s  9s
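
Since the two rows come from different machines, absolute times are not comparable, but each row can be normalized to its own WD03a time to see how the growth pattern has changed.  The sketch below does that for the four documents in the progression (WD03x is left out); the to_seconds helper is written just for this table's two duration formats.

```python
import re

def to_seconds(text):
    """Parse 'MM:SS', '2m29s' or '10s' into seconds."""
    if ":" in text:
        minutes, seconds = text.split(":")
        return int(minutes) * 60 + int(seconds)
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1) or 0) * 60 + int(match.group(2))

# Values copied from the table above.
rows = {
    "LO 7.2": ["00:12", "00:25", "01:31", "02:26"],
    "Astraendo 3.3.2": ["10s", "2m29s", "10m09s", "15m40s"],
}
docs = ["WD03a", "WD03b", "WD03c", "WD04a"]

relative = {}
for label, cells in rows.items():
    secs = [to_seconds(c) for c in cells]
    # Express every time as a multiple of that row's WD03a time.
    relative[label] = [s / secs[0] for s in secs]
    print(label, ", ".join(f"{d} {r:.1f}x" for d, r in zip(docs, relative[label])))
```

On these figures WD04a takes about 12x as long as WD03a in LO 7.2, against about 94x in 3.3.2: much flatter, but still growing faster than the documents themselves.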

@Noel: this may interest you...
Comment 8 QA Administrators 2023-02-27 03:20:04 UTC Comment hidden (spam)