Bug 100967 - Writer takes very long for opening long 960 pages DOC that MSO opens instantly
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): Inherited From OOo
Hardware: All All
Importance: medium major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: filter:doc, haveBacktrace, perf
Duplicates: 100968
Depends on:
Blocks: DOC
Reported: 2016-07-17 12:08 UTC by zahra
Modified: 2023-04-26 11:38 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
its a .doc document which its content is in farsi and arabic. (1.27 MB, application/x-7z-compressed)
2016-07-17 12:08 UTC, zahra
Callgrind with 5.3 (6.82 MB, application/x-xz)
2016-07-25 05:21 UTC, Buovjaga
Perf flamegraph (402.20 KB, image/svg+xml)
2019-07-30 14:14 UTC, Buovjaga
DOCX created from DOC in MSO (3.03 MB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2021-07-30 07:41 UTC, Timur

Description zahra 2016-07-17 12:08:13 UTC
Created attachment 126247 [details]
its a .doc document which its content is in farsi and arabic.

hi. 
i have a large .doc document, which has 960 pages when opened in libreoffice writer.
my document content is in farsi and arabic and its size is more than 9 mb.
every time i want to open it with libreoffice writer, it takes one hour and 45 minutes to open!
i use windows xp, but my sister tested with windows 7.
for her, the document opened after one hour and 15 minutes. 
i tested this document with microsoft word and nvda screen reader says protected document. 
i opened it with microsoft word in seconds! 
i tested with openoffice 4.1.2 and it took 40 minutes to open it. 
meanwhile, i have a .doc book with more than 2000 pages that opens in libreoffice in only 15 minutes!
and also another .doc document like this one, with only 39 pages, which needs 15 to 20 minutes.
i don't understand the reason for such different opening times for my documents!
i would appreciate your help in finding the reason.
Comment 1 JoNi 2016-07-17 12:42:50 UTC
*** Bug 100968 has been marked as a duplicate of this bug. ***
Comment 2 JoNi 2016-07-17 13:01:26 UTC
confirmed on recent master 2016-07-16

note: small test cases are preferred
if you have a one page document which loads in half a minute with LO instead of a second in Word, that would be great.
Comment 3 m_a_riosv 2016-07-17 14:23:27 UTC
Saving as ODT works fine: the result takes only a few seconds to open.

After clearing all direct formatting, the file also opens just fine when saved and reopened as DOC.

As I remember from other bugs and forum threads, LibreOffice doesn't cope well with intensive use of direct formatting when importing MS files.

So perhaps clearing the direct formatting and setting up the default formatting properly can reduce the need for heavy use of direct formatting.
Comment 4 Buovjaga 2016-07-23 18:37:27 UTC
Setting to NEW. Confirmed slowness here as well.

Arch Linux 64-bit, KDE Plasma 5
Version: 5.3.0.0.alpha0+
Build ID: 64d3270a89fd88d4d0cf70329af2c66f722fd95e
CPU Threads: 8; OS Version: Linux 4.6; UI Render: default; 
Locale: fi-FI (fi_FI.UTF-8); Calc: group
Built on July 22nd 2016
Comment 5 Buovjaga 2016-07-25 05:21:46 UTC
Created attachment 126393 [details]
Callgrind with 5.3

Got a callgrind of opening the file.. it took about 30 hours.
Comment 6 zahra 2017-01-22 06:00:43 UTC Comment hidden (obsolete)
Comment 7 QA Administrators 2018-01-23 03:22:19 UTC Comment hidden (obsolete)
Comment 8 zahra 2018-01-23 14:08:32 UTC
hello.
i tested my book with my favorite version of libreoffice (5.3.4.1) and the issue persists.
one hour and 45 minutes for opening the file, with 2gb ram and cpu Genuine Intel(R) CPU  T2250  @ 1.73GHz
Comment 9 craig@arno.com 2018-07-24 20:01:50 UTC
Same problem here with a 6 page "Will" document, did some experimenting to see what works/doesn't since my busted up password protected document -only- takes 2 minutes to open:

1. [works] OpenOffice 4.1.3 (Windows) Opens the 6 page document almost instantly.

2. [works] Using OpenOffice 4.1.3 to do a "SaveAs..." to a new filename, with same or new password, creates a document which LibreOffice x64 (5.x, 6.x) will open quickly. (Yes, this is a "recovery tactic")

3. [doesn't] Waiting for LibreOffice to open the offending document and then using LibreOffice to perform a "SaveAs..." just makes another "broken file" and takes twice as long to do as opening the "broken file", 1x to SaveAs... + 1x to get LibreOffice "unhung" so it will close.  Behavior under Ubuntu 16.04 LTS and Windows 10 x64 is the same. soffice.bin uses 100% of 1-CPU while performing its shenanigans.

Observations:
a) The META-INF/Manifest.xml produced by #2 above is quite different from the original file with all the problems.  This different META-INF/Manifest.xml seems to work well with LibreOffice 5.x/6.x x64 on both Linux and Windows Platforms.

b) The META-INF/Manifest.xml produced by #3 matches the original file which causes all the problems on opening.

c) My file having all the problems was created using LibreOffice and is timestamped 27-Aug-2016.  I tend to keep LibreOffice up to date so it was produced using a "current version" around that date.

d) It's -really- nice to have an OpenDocument -STANDARD- for ODT type files so other tools can be used for just such occasions.  Who knows, maybe next time OpenOffice will be the one experiencing problems.  At least we have choices about what to try.

The OpenOffice site, in case you've forgotten, is http://www.openoffice.org/
This should help you recover your documents so you can continue to use LibreOffice.
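
The manifest comparison described in observations a) and b) can be repeated with a few lines of Python, since an ODF package is a ZIP archive. This is only a minimal sketch: the function names are made up for illustration, and the entry is matched case-insensitively because ODF packages normally store it as lowercase META-INF/manifest.xml. In a password-protected package the manifest itself is stored unencrypted, so it should be readable without the password.

```python
import difflib
import zipfile


def read_manifest(package_path):
    """Return the raw bytes of META-INF/manifest.xml, or None if absent."""
    with zipfile.ZipFile(package_path) as package:
        for name in package.namelist():
            # Case-insensitive match (assumption: tolerate "Manifest.xml").
            if name.lower() == "meta-inf/manifest.xml":
                return package.read(name)
    return None


def manifest_diff(path_a, path_b):
    """Return a unified diff (list of lines) between two package manifests."""
    lines = []
    for path in (path_a, path_b):
        raw = read_manifest(path) or b""
        lines.append(raw.decode("utf-8", errors="replace").splitlines())
    return list(
        difflib.unified_diff(
            lines[0], lines[1], fromfile=str(path_a), tofile=str(path_b), lineterm=""
        )
    )
```

Calling manifest_diff("original.odt", "resaved.odt") (hypothetical file names) returns an empty list when the two manifests are identical, and the differing lines otherwise.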
Comment 10 Buovjaga 2018-07-24 20:17:18 UTC Comment hidden (obsolete)
Comment 11 QA Administrators 2019-07-30 03:16:01 UTC Comment hidden (obsolete)
Comment 12 craig@arno.com 2019-07-30 06:13:08 UTC
Version: 6.2.4.2 (x64)
Build ID: 2412653d852ce75f65fbfa83fb7e7b669a126d64
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
Locale: en-US (en_US); UI-Language: en-US
Calc: threaded

Issue is improved, still significant delays, just not as bad as it was from my earlier report.  I didn't try Linux, just Windows.
Comment 13 Buovjaga 2019-07-30 12:19:52 UTC Comment hidden (obsolete)
Comment 14 Buovjaga 2019-07-30 14:14:27 UTC
Created attachment 153058 [details]
Perf flamegraph

I canceled it after some minutes as the data size was growing too large

Arch Linux 64-bit
Version: 6.4.0.0.alpha0+
Build ID: 4bd1b38633d6cb288eb559afc0ac6b961538ae60
CPU threads: 8; OS: Linux 5.2; UI render: default; VCL: gtk3; 
Locale: fi-FI (fi_FI.UTF-8); UI-Language: en-US
Calc: threaded
Built on 24 July 2019
Comment 15 craig@arno.com 2019-07-30 14:23:17 UTC
(In reply to Buovjaga from comment #13)
> (In reply to Buovjaga from comment #10)
> > Craig: your issue is a different one. As zahra mentioned, AOO is slow for
> > this particular .doc as well.
> > 
> > Please open a new report with your example file attached. If it contains
> > confidential information, create a sanitized copy:
> > https://wiki.documentfoundation.org/QA/Bugzilla/Sanitizing_Files_Before_Submission#Sanitize_file_text
> 
> Craig: still waiting for you to open a new report with an example document.
> Thanks.

I can open a new report if that will help.

I can't provide a "sanitized" version to reproduce the bug.  When I remove highly sensitive financial information the document becomes "trivial" and the problem disappears.

We'll have to work off information provided here which behaviorally produces the same issue.
Comment 16 QA Administrators 2021-07-30 06:21:37 UTC Comment hidden (obsolete)
Comment 17 Timur 2021-07-30 07:41:23 UTC
Created attachment 173961 [details]
DOCX created from DOC in MSO

Reproduced with the attached DOC. It takes maybe 10-15 minutes to open (as noticed in comment 12), which is very slow and unusable, unlike MSO, which has a trick of opening the 1st page in an instant and then repaginating. If just 10 pages are cut out and saved in MSO, the result opens fast, so it seems to be about size.
The DOCX created from the DOC in MSO is also slow, but reasonably so for that size, like ODT; not nearly as slow as the DOC.
The DOC has no password, so I removed it from the title.
Comment 18 craig@arno.com 2021-07-30 22:33:33 UTC
This bug still exists on my 3 page 83K password protected "Will" document which takes 1m8s to open using LibreOffice:

Version: 7.0.6.2 (x64)
Build ID: 144abb84a525d8e30c9dbbefa69cbbf2d8d4ae3b
CPU threads: 8; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded

This new version of LibreOffice works well for my other applications (mostly Writer and Calc, some Base and Impress), just this one "interesting" password protected problem, which I hope will go away when I rewrite my Will document from scratch using a newer version of LibreOffice.
Comment 19 Roman Kuznetsov 2023-04-26 11:36:38 UTC
Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 0ee9501c0b7dc1a291715fff9c1934b1c08cb654
CPU threads: 8; OS: Windows 10.0 Build 19043; UI render: Skia/Vulkan; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: threaded

the file takes 4 min 40 sec to open

so the problem is still here
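
The load times in this report were taken by hand on different machines and versions. For repeatable numbers, a minimal sketch is to time a headless conversion of the DOC. It assumes that `soffice` is on the PATH; the helper names are made up for illustration. Note that this measures import plus ODT export without the UI, so it is only a rough proxy for the interactive open time reported above.

```python
import subprocess
import time


def build_convert_command(doc_path, out_dir, soffice="soffice"):
    """Build a headless DOC -> ODT conversion command line."""
    return [
        soffice,
        "--headless",
        "--convert-to", "odt",
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def time_command(cmd):
    """Run cmd to completion and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    completed = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False
    )
    return time.perf_counter() - start, completed.returncode
```

For example, time_command(build_convert_command("book.doc", "out")) with a hypothetical book.doc returns the elapsed seconds and the exit code. It is probably safest to close any running LibreOffice instance first.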
Comment 20 Roman Kuznetsov 2023-04-26 11:38:36 UTC
Khaled, maybe you will be interested in this problem