Bug 64991 - Opening a long RTL DOC file is extremely slow, while ok if resaved as DOCX in MS-Word
Status: REOPENED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.0.3.3 release
Hardware: All All
Importance: medium major
Assignee: Not Assigned
URL:
Whiteboard: BSA interoperability target:7.3.0 tar...
Keywords: filter:doc, needsDevAdvice, perf
Depends on:
Blocks: DOC-RTL
Reported: 2013-05-26 08:23 UTC by Fahad Al-Saidi
Modified: 2024-08-19 13:11 UTC (History)
CC List: 9 users

See Also:
Crash report or crash signature:


Attachments
Issue not reproduced after resaving as DOCX in Word 2010 (468.68 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2015-08-26 09:23 UTC, Xisco Faulí
DOC from description (1.46 MB, application/msword)
2021-08-23 01:07 UTC, Aron Budea
Flamegraph (22.49 KB, application/x-bzip)
2021-08-23 18:20 UTC, Julien Nabet
Flamegraph (128.87 KB, application/x-bzip)
2023-02-12 09:00 UTC, Julien Nabet

Description Fahad Al-Saidi 2013-05-26 08:23:49 UTC
Problem description: 
When you try to open a long RTL DOC document, it takes ages to import and open.

Steps to reproduce:
1. Download this file: http://www.alargam.com/alquran/quran6236.rar
2. Open quran6236.doc.
3. Have lunch, then come back. The file will still be importing.


              
Operating System: Ubuntu
Version: 4.0.3.3 release
Comment 1 Thomas van der Meulen [retired] 2013-05-26 09:31:02 UTC
Thank you for your bug report. I can reproduce this bug running LibreOffice Version: 4.1.0.0.beta1
Build ID: 3a2c2d2417101e45fe07cfd8358acf2204a98f3 on Mac OS X 10.8.3. I let it run for 5 minutes and then stopped it. After that I tested it on my Windows 7 machine, and after 13 minutes I got a screen saying that it had crashed.

Opening the file with Word 2007 or Apple Pages is not a problem (1 minute to load), so the file isn't broken.
Comment 2 Julien Nabet 2014-07-29 19:52:23 UTC
On pc Debian x86-64 with master sources updated today, I could reproduce this.

Example of a partial backtrace, retrieved at random:
#0  0x00002aaade8b86ce in boost::ptr_sequence_adapter<SwFltStackEntry, std::__debug::deque<void*, std::allocator<void*> >, boost::heap_clone_allocator>::operator[] (
    this=0x88a2a38, n=5898) at /home/julien/compile-libreoffice/libreoffice/workdir/UnpackedTarball/boost/boost/ptr_container/ptr_sequence_adapter.hpp:332
#1  0x00002aaade8b80c7 in SwFltControlStack::operator[] (this=0x88a2a30, nIndex=5898) at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/inc/fltshell.hxx:197
#2  0x00002aaade8ce541 in SwWW8FltControlStack::GetStackAttr (this=0x88a2a30, rPos=SwPosition (node 15, offset 28871), nWhich=22)
    at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/ww8/ww8par.cxx:1547
#3  0x00002aaade8ce2ae in SwWW8FltControlStack::GetFmtAttr (this=0x88a2a30, rPos=SwPosition (node 15, offset 28871), nWhich=22)
    at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/ww8/ww8par.cxx:1497
#4  0x00002aaade9617f4 in SwWW8ImplReader::GetFmtAttr (this=0x889a730, nWhich=22) at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/ww8/ww8par6.cxx:2642
#5  0x00002aaade8d4daf in SwWW8ImplReader::emulateMSWordAddTextToParagraph (this=0x889a730, 
    rAddString="الم (1) ذَلِكَ الْكِتَابُ لَا رَيْبَ فِيهِ هُدًى لِلْمُتَّقِينَ (2) الَّذِينَ يُؤْمِنُونَ بِالْغَيْبِ وَيُقِيمُونَ الصَّلَاةَ وَمِمَّا رَزَقْنَاهُمْ يُنْفِقُونَ (3) وَالَّذِينَ يُؤْمِنُونَ بِمَا أُنْز"...) at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/ww8/ww8par.cxx:3340
#6  0x00002aaade8d46b1 in SwWW8ImplReader::ReadPlainChars (this=0x889a730, rPos=@0x7ffffffed790: 384, nEnd=56361, nCpOfs=0)
    at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/ww8/ww8par.cxx:3136
#7  0x00002aaade8d54dd in SwWW8ImplReader::ReadChars (this=0x889a730, rPos=@0x7ffffffed790: 384, nNextAttr=56361, nTextEnd=723941, nCpOfs=0)
    at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/ww8/ww8par.cxx:3441
#8  0x00002aaade8d725d in SwWW8ImplReader::ReadText (this=0x889a730, nStartCp=0, nTextLen=723941, nType=MAN_MAINTEXT)
    at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/ww8/ww8par.cxx:3967
#9  0x00002aaade8dd983 in SwWW8ImplReader::CoreLoad (this=0x889a730, pGloss=0x0, rPos=SwPosition (node 9, offset 13))
    at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/ww8/ww8par.cxx:5150
#10 0x00002aaade8e0855 in SwWW8ImplReader::LoadThroughDecryption (this=0x889a730, rPaM=SwPaM = {...}, pGloss=0x0)
    at /home/julien/compile-libreoffice/libreoffice/sw/source/filter/ww8/ww8par.cxx:5743
#11 0x00002aaade8e1e8d in SwWW8ImplReader::LoadDoc (this=0x889a730, rPaM=SwPaM = {...}, pGloss=0x0)
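
For readers following the trace: frames #0 to #2 show an attribute lookup indexing into the import control stack at position 5898 while text is being appended. If such a lookup walks the stack entry by entry for every chunk of text, the total cost grows with (stack size × number of lookups). The sketch below only illustrates that effect. It is not the LibreOffice code and not the fix that was later committed; Entry, IndexedStack, findLinear and simulateImport are hypothetical names, and the figure 5899 is simply borrowed from the n=5898 index in frame #0.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for an open attribute on the import stack
// (loosely inspired by SwFltStackEntry; NOT the real class).
struct Entry {
    int which;   // attribute id, like nWhich in the backtrace
    long start;  // character position where the attribute starts
    int value;   // payload
};

// Linear lookup: walk the stack from the top by index.
// 'steps' counts how many entries were visited.
const Entry* findLinear(const std::deque<std::unique_ptr<Entry>>& stack,
                        int which, std::size_t& steps)
{
    for (std::size_t i = stack.size(); i-- > 0;) {
        ++steps;
        if (stack[i]->which == which)
            return stack[i].get();
    }
    return nullptr;
}

// Same stack plus a side index from attribute id to entry positions,
// so the most recent entry for an id is found without a scan.
struct IndexedStack {
    std::deque<std::unique_ptr<Entry>> entries;
    std::map<int, std::vector<std::size_t>> byWhich;

    void push(int which, long start, int value) {
        byWhich[which].push_back(entries.size());
        entries.push_back(std::make_unique<Entry>(Entry{which, start, value}));
    }

    // Counts one step per lookup (a single map search).
    const Entry* find(int which, std::size_t& steps) const {
        ++steps;
        auto it = byWhich.find(which);
        if (it == byWhich.end() || it->second.empty())
            return nullptr;
        return entries[it->second.back()].get();
    }
};

struct Cost {
    std::size_t linear;
    std::size_t indexed;
};

// Worst case for the linear scan: the wanted attribute (id 22) sits at
// the bottom of the stack and is looked up once per text chunk.
Cost simulateImport(std::size_t nEntries, std::size_t nLookups)
{
    IndexedStack s;
    s.push(22, 0, 1);
    for (std::size_t i = 1; i < nEntries; ++i)
        s.push(1, static_cast<long>(i), 0);

    Cost c{0, 0};
    for (std::size_t k = 0; k < nLookups; ++k) {
        const Entry* a = findLinear(s.entries, 22, c.linear);
        const Entry* b = s.find(22, c.indexed);
        assert(a == b);  // both strategies must agree on the result
    }
    return c;
}
```

Calling simulateImport(5899, 100) visits 5899 entries per linear lookup but performs only one indexed lookup per query, which is the kind of gap that turns a long document into a very long import. Whether the real filter can use a side index like this depends on details of the control stack that the trace does not show.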
Comment 3 Xisco Faulí 2015-08-26 09:23:49 UTC
Created attachment 118192 [details]
Issue not reproduced after resaving as DOCX in Word 2010

Problem still present in

Version: 5.0.0.5
Build ID: 1b1a90865e348b492231e1c451437d7a15bb262b
Locale: es-ES (es_ES)

on Windows 7 (64-bit)

However, I can't reproduce the issue if I resave the document as DOCX in Word 2010.
Comment 4 QA Administrators 2016-09-20 10:26:04 UTC Comment hidden (obsolete)
Comment 5 Aron Budea 2016-11-22 08:37:27 UTC
Still extremely slow in 5.2.3.3.
Comment 6 Xisco Faulí 2017-01-26 12:09:29 UTC
It opens if 347bb1634b10eba577742fe8a7edb4b2dd69265d is reverted. Closing as RESOLVED DUPLICATE of bug 76219.

*** This bug has been marked as a duplicate of bug 76219 ***
Comment 7 Timur 2021-06-09 14:47:14 UTC Comment hidden (obsolete)
Comment 8 Fahad Al-Saidi 2021-06-09 18:32:15 UTC Comment hidden (obsolete)
Comment 9 Fahad Al-Saidi 2021-06-09 18:34:05 UTC
I couldn't attach it, but here is the link to test.

https://web.archive.org/web/2015*/http://www.alargam.com/alquran/quran6236.rar
Comment 10 Fahad Al-Saidi 2021-08-21 17:50:38 UTC Comment hidden (obsolete)
Comment 11 Aron Budea 2021-08-23 01:07:43 UTC
Created attachment 174482 [details]
DOC from description

(In reply to Fahad Al-Saidi from comment #10)
> today I had a chance to test the fix. It took 5 minutes to open the file in
> my new machine (16 threads).
Yeah, unfortunately that fix, done for bug 104254, had to be reverted before the 7.2.0 release. In any case, I checked a build from when that fix was still in, and apparently it didn't affect this document.

Thanks for pointing to the doc again, I'm attaching it this time.
Comment 12 Julien Nabet 2021-08-23 18:20:10 UTC
Created attachment 174500 [details]
Flamegraph

Here's a Flamegraph retrieved on a Debian x86-64 PC with master sources updated today (built with enable-symbols, not enable-debug) and gen rendering.

I only waited until I had about 70MB; the loading was still at the beginning.
Comment 13 Julien Nabet 2021-08-23 18:21:03 UTC
Noel: I attached a perf Flamegraph; I thought you might be interested in this one to find some hints for optimizing the loading.
Comment 14 Commit Notification 2021-08-25 08:48:21 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/c6883e7a031dec5fe3a365c4fd6adccff09696e5

tdf#64991 speed up loading large RTL documents

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 15 Commit Notification 2021-10-22 16:35:32 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/0922d1d2b0ba30d44eae311a8d0dc17345b8dcac

tdf#64991 speed up loading large RTL documents

It will be available in 7.3.0.
Comment 16 Commit Notification 2021-10-22 19:04:50 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/5760cba3b276a372d6cccf3f6b6db7fb26c20351

tdf#64991 speed up loading large RTL documents

It will be available in 7.3.0.
Comment 17 Commit Notification 2021-11-27 14:54:09 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/bec2de27f676092bffdf8a639497602a9d13f675

tdf#64991 speed up loading large RTL documents

It will be available in 7.4.0.
Comment 18 Commit Notification 2021-11-29 18:25:53 UTC
Noel Grandin committed a patch related to this issue.
It has been pushed to "libreoffice-7-3":

https://git.libreoffice.org/core/commit/75950f3eb9517b8d5cce4a7e491ab031a1b3f0db

tdf#64991 speed up loading large RTL documents

It will be available in 7.3.0.0.beta2.
Comment 19 Stéphane Guillou (stragu) 2021-12-31 05:55:37 UTC
2 minutes to open in 7.3.0.1

Version: 7.3.0.1 / LibreOffice Community
Build ID: 840fe2f57ae5ad80d62bfa6e25550cb10ddabd1d
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

Same in a recent master build:

Version: 7.4.0.0.alpha0+ / LibreOffice Community
Build ID: 7fe2ce55ab86cc7a32850fdf504e368c535949c3
CPU threads: 8; OS: Linux 5.4; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded

Definitely better than the reported 5 minutes.
Comment 20 Jean-Baptiste Faure 2022-01-22 15:26:20 UTC
If this bug has commits that fix the problem, then it can't be a duplicate of a still-open bug.

So I am changing it to Resolved Fixed.

Best regards. JBF
Comment 21 Fahad Al-Saidi 2022-01-22 18:21:42 UTC
Opening the file attached to this bug is still very slow compared to MS Office, even after the commits by Noel.
I don't think the problem is fixed.
Comment 22 Mosaab Alzoubi 2022-06-11 17:41:43 UTC
Still in 7.3.3.2
Comment 23 Eyal Rozenberg 2023-02-10 15:45:10 UTC
I just opened attachment 174482 [details] (quran6236.doc) in LO Writer. It took somewhere between 60 and 90 seconds, I believe, on my machine (Intel i5-7600K @ 4050 MHz, 16 GB RAM). Build info:

Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: ad387d5b984c6666906505d25685065f710ed55d
CPU threads: 4; OS: Linux 6.1; UI render: default; VCL: gtk3
Locale: he-IL (en_IL); UI: en-US

So, this is obviously better than the situation in 2013, but then again, it's possible that Fahad's system had a lot less RAM, and the CPU was likely slower. Also, Writer is extremely unresponsive after opening the document: it takes forever to scroll, to place the cursor for editing, to switch to the Navigator on the sidebar, etc.

This is quite unacceptable in a document which only has ~80K words, ~450K characters, and 318 pages.

Should we open a separate bug about the extreme slowness _after_ the document has been loaded?
Comment 24 Eyal Rozenberg 2023-02-10 15:46:20 UTC
(In reply to Eyal Rozenberg from comment #23)
> This is quite unacceptable in a document which only has ~80K words, ~450K
> characters, and 318 pages.

Whoops, that's 258 pages. I should also mention that I absolutely cannot get the Navigator to show, and LO is just stuck. :-(
Comment 25 Julien Nabet 2023-02-12 09:00:33 UTC
Created attachment 185331 [details]
Flamegraph

On pc Debian x86-64 with master sources updated today, I retrieved a new Flamegraph.