Bug 88005 - FILEOPEN Can't open particular WW8 doc file, layout loops due to wrongly imported frame
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.2.7.2 release (earliest affected)
Hardware: Other All
Importance: medium normal
Assignee: Michael Stahl (CIB)
URL:
Whiteboard: target:4.5.0 target:4.3.6 target:4.4....
Keywords: bibisected, bisected, filter:doc, regression
Depends on:
Blocks:
 
Reported: 2015-01-03 23:57 UTC by Maxim Britov
Modified: 2015-12-17 05:57 UTC
CC List: 7 users

See Also:
Crash report or crash signature:


Attachments
problem file (2.37 MB, application/msword)
2015-01-03 23:57 UTC, Maxim Britov
bt with debug symbols (4.4) (24.35 KB, text/plain)
2015-01-04 13:37 UTC, Julien Nabet
doc file with two images without pagebreak. LO can't open it. (502.00 KB, application/msword)
2015-01-06 08:38 UTC, Maxim Britov
docx file with two images without pagebreak. LO can open it. (488.74 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2015-01-06 08:39 UTC, Maxim Britov

Description Maxim Britov 2015-01-03 23:57:47 UTC
Created attachment 111704
problem file

This is a .doc file with scanned pages.
LO 4.3.5.2 and 4.4.0.1 can't open it: CPU load goes to 100% and memory usage keeps growing.
If I open it in MSO 2007 and save it as docx, LO can open it fine.
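
A hang like this can be checked for without watching the UI by running the load under a timeout. The following is a minimal Python sketch, not something used in this report: the helper names `hangs` and `soffice_convert_cmd` are hypothetical, and it assumes that `soffice` is on the PATH and that a headless PDF conversion triggers the same layout pass as opening the file interactively.

```python
import subprocess
import sys


def hangs(cmd, timeout_s):
    """Return True if cmd is still running after timeout_s seconds.

    subprocess.run() kills the direct child when the timeout expires.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return True
    return False


def soffice_convert_cmd(doc_path, out_dir):
    """Build a headless conversion command (assumes soffice is on PATH)."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", out_dir, doc_path]


if __name__ == "__main__":
    # Demonstrate the timeout logic with a stand-in process that sleeps.
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(hangs(sleeper, 0.5))
```

Only the direct child is killed on timeout, so a launcher that spawns a separate worker process may leave that worker running; it would need to be cleaned up separately.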
Comment 1 MM 2015-01-04 07:23:12 UTC
Could not reproduce with v4.2.6.3 on mint 16 x64
Confirmed with v4.2.7.2 on mint 17.1 x64
Confirmed with v4.3.5.2 on mint 17.1 x64

Can actually open the file on all versions, but processing takes forever on 4.2.7.2 and above.

Set to new, regression.
Comment 2 Julien Nabet 2015-01-04 13:15:47 UTC
On pc Debian x86-64 with master sources updated yesterday, I could open the file, but it seems empty.

I noticed this console log:
VisioDocument: version 0
Found xml parser severity error Extra content at the end of the document

Fridrich/Valek: any thoughts about this Visio error?
Comment 3 Julien Nabet 2015-01-04 13:37:33 UTC
Created attachment 111722
bt with debug symbols (4.4)

On pc Debian x86-64 with 4.4 sources updated some days ago, I could reproduce this.

I noticed this kind of loop:
#18 0x00002aaacbaf6ada in SwTxtFrm::Format (this=0x570be00) at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/text/frmform.cxx:1813
#19 0x00002aaacb984ca1 in SwCntntFrm::MakeAll (this=0x570be00) at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/layout/calcmove.cxx:1333
#20 0x00002aaacb97f54f in SwFrm::PrepareMake (this=0x570be00) at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/layout/calcmove.cxx:277
#21 0x00002aaacb79a22a in SwFrm::Calc (this=0x570be00) at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/inc/frame.hxx:1034
#22 0x00002aaacbaf0ced in SwTxtFrm::CalcFollow (this=0x55c9350, nTxtOfst=0) at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/text/frmform.cxx:283
#23 0x00002aaacbaf2125 in SwTxtFrm::_AdjustFollow (this=0x55c9350, rLine=..., nOffset=0, nEnd=9, nMode=3 '\003')
    at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/text/frmform.cxx:594
#24 0x00002aaacbaf3e73 in SwTxtFrm::FormatAdjust (this=0x55c9350, rLine=..., rFrmBreak=..., nStrLen=9, bDummy=false)
    at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/text/frmform.cxx:1106
#25 0x00002aaacbaf570e in SwTxtFrm::_Format (this=0x55c9350, rLine=..., rInf=..., bAdjust=false)
    at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/text/frmform.cxx:1556
#26 0x00002aaacbaf5c5b in SwTxtFrm::_Format (this=0x55c9350, pPara=0x570b470) at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/text/frmform.cxx:1666
#27 0x00002aaacbaf6ada in SwTxtFrm::Format (this=0x55c9350) at /home/julien/compile-libreoffice/libo_4_4/sw/source/core/text/frmform.cxx:1813
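
A function that shows up more than once in a backtrace like the one above can be spotted mechanically. The following is a minimal Python sketch, not part of gdb or LibreOffice: the helper name `repeated_functions` and the regular expression are assumptions made for illustration, and the file paths in `SAMPLE` are shortened from the frames quoted above. A repeated name alone does not prove an endless loop; it only points at where to look.

```python
import re
from collections import defaultdict

# Matches gdb frames of the form: #18 0x... in Class::Method (this=0x...
FRAME_RE = re.compile(
    r"^#(\d+)\s+0x[0-9a-fA-F]+ in ([\w:~]+) \(this=(0x[0-9a-fA-F]+)"
)


def repeated_functions(backtrace_text):
    """Map each function seen more than once to its (frame, this) pairs."""
    seen = defaultdict(list)
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frame_no = int(match.group(1))
            seen[match.group(2)].append((frame_no, match.group(3)))
    return {fn: hits for fn, hits in seen.items() if len(hits) > 1}


# Three frames from the backtrace above, with paths shortened.
SAMPLE = """\
#18 0x00002aaacbaf6ada in SwTxtFrm::Format (this=0x570be00) at frmform.cxx:1813
#22 0x00002aaacbaf0ced in SwTxtFrm::CalcFollow (this=0x55c9350, nTxtOfst=0) at frmform.cxx:283
#27 0x00002aaacbaf6ada in SwTxtFrm::Format (this=0x55c9350) at frmform.cxx:1813
"""

if __name__ == "__main__":
    for name, hits in repeated_functions(SAMPLE).items():
        print(name, hits)
```

The differing `this` pointers show that the two `SwTxtFrm::Format` calls act on different frame objects, with one being formatted from inside the other via `CalcFollow`.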
Comment 4 Julien Nabet 2015-01-04 13:39:15 UTC
Now the question is:
is the loop problem seen in the backtrace fixed, or is the "Visio symptom" just hiding it?
Comment 5 Valek Filippov 2015-01-06 01:13:30 UTC
I do not see any Visio-related content in this file.

I'm not sure what the filter normally reports when it encounters an unsupported format, but overall this doesn't seem to be a libvisio problem.

OLEToy is not very advanced for "doc", so a root cause analysis wouldn't be easy for me.
My suggestion is:
- open the original file in Word, delete the first image, save under a different name, and re-open in LO
- if it opens normally, open the original file in Word and remove all but the first image to verify that the problem is related to that image
- otherwise repeat with the 2nd, 3rd, etc. images.

As an alternative, I can try to dump the images from the file to check whether LO has trouble opening any of them directly.
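
The removal procedure suggested above can be sketched as a small loop. This is a minimal Python illustration only: `find_suspect` and the `opens_ok` callback are hypothetical names, and `opens_ok` stands in for the manual step of saving a variant in Word and trying to open it in LO.

```python
def find_suspect(images, opens_ok):
    """Remove one image at a time and report the first removal that helps.

    images   -- list of image identifiers in document order
    opens_ok -- callable taking a list of images, returning True if a
                document with exactly those images opens normally
    Returns (suspect, confirmed). confirmed is True only if a document
    containing the suspect alone still fails to open.
    """
    for i, image in enumerate(images):
        without = images[:i] + images[i + 1:]
        if opens_ok(without):
            # Removing this image fixed the load; verify it in isolation.
            confirmed = not opens_ok([image])
            return image, confirmed
    return None, False


if __name__ == "__main__":
    # Toy predicate: the document fails only when "bad" is present.
    print(find_suspect(["a", "bad", "c"], lambda imgs: "bad" not in imgs))
```

An unconfirmed suspect, or no suspect at all, suggests that the failure depends on a combination of images rather than on any single one.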
Comment 6 Maxim Britov 2015-01-06 08:36:28 UTC
OK, I did some experiments with this file.
The loop occurs with as few as two pages, but not with one page.

1.
If I open it in MSO 2007 and save as docx, I can open it in LO.
If I then open that docx in LO and export it to doc, I can open the resulting doc in LO.

2.
If I remove all pictures but one, I can open it (.doc) in LO.
If I remove all pages but two, I can't open it in LO.
If I leave two pages but insert a page break between them, I can open it in LO.

I will attach doc and docx files with two pages only.
It seems LO can't process doc files with consecutive images without page breaks.

Sorry for my English :)
Comment 7 Maxim Britov 2015-01-06 08:38:51 UTC
Created attachment 111832
doc file with two images without pagebreak. LO can't open it.
Comment 8 Maxim Britov 2015-01-06 08:39:35 UTC
Created attachment 111833
docx file with two images without pagebreak. LO can open it.
Comment 9 Valek Filippov 2015-01-06 15:15:52 UTC
I guess we need an LO Writer person now.
Maybe vmiklos?
Comment 10 Rostislav 'R.Yu.' Okulov 2015-01-12 08:54:59 UTC
git bisect start
# bad: [4a3091e95fa263d3e2dd81e56e83996f0bb12287] source-hash-2b5b04e1e62914bf0902dfd7943cdc44499c47a6
git bisect bad 4a3091e95fa263d3e2dd81e56e83996f0bb12287
# good: [812c4a492375ac47b3557fbb32f5637fc89d60d9] source-hash-dea4a3b9d7182700abeb4dc756a24a9e8dea8474
git bisect good 812c4a492375ac47b3557fbb32f5637fc89d60d9
# good: [5d0dfb8e62ae61a240f8313c594d4560e7c8e048] source-hash-0c6cd530de13f80795881f61064f1bf1dcc4ea81
git bisect good 5d0dfb8e62ae61a240f8313c594d4560e7c8e048
# bad: [7dfacd0b8bd828331d74c0f79de6e8924bc4e6a5] source-hash-f93ce4f7eb90093d0ea3115d0a1c614612676dbd
git bisect bad 7dfacd0b8bd828331d74c0f79de6e8924bc4e6a5
# good: [1a63057f6378db7c6b8af1171b7b140f7583f246] source-hash-59f84b4a2c082382767f12e0c7a06a3f0b52e721
git bisect good 1a63057f6378db7c6b8af1171b7b140f7583f246
# good: [2fdc98d4cfbffea5b33224bd2106aeb3b74b84a7] source-hash-d4a8fa7db0ed4faae00408fbda2352379774cfc0
git bisect good 2fdc98d4cfbffea5b33224bd2106aeb3b74b84a7
# bad: [3ff4aa6b7f147a98388d57e35368311034bceab6] source-hash-35e260c4a3e886c4177b232871f9f2775cd5c5f5
git bisect bad 3ff4aa6b7f147a98388d57e35368311034bceab6
# bad: [67b357d7f313d5ff960b6cf6646053b11e04ef7c] source-hash-bf640ba048704220292411e4f2bcc0d3c62caa32
git bisect bad 67b357d7f313d5ff960b6cf6646053b11e04ef7c
# good: [15089b6fcec017844895bab1dc9524cd904fe116] source-hash-be1bb7b1ccee28be616b89cc95e97d656e78bbe3
git bisect good 15089b6fcec017844895bab1dc9524cd904fe116
# bad: [2f84e5da87db237dc045e9a232a35513d3297a7e] source-hash-d718c1f65f850f7897b942c2e4415110132e51a5
git bisect bad 2f84e5da87db237dc045e9a232a35513d3297a7e
# good: [2025ac4bff9db6fbf8e1af8a66655bca5489a449] source-hash-a6c5f2ba6bca8ad95a3731e2770a1d216c9925a0
git bisect good 2025ac4bff9db6fbf8e1af8a66655bca5489a449
# first bad commit: [2f84e5da87db237dc045e9a232a35513d3297a7e] source-hash-d718c1f65f850f7897b942c2e4415110132e51a5


2f84e5da87db237dc045e9a232a35513d3297a7e is the first bad commit
commit 2f84e5da87db237dc045e9a232a35513d3297a7e
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Sat Oct 18 23:57:31 2014 +0000

    source-hash-d718c1f65f850f7897b942c2e4415110132e51a5
    
    commit d718c1f65f850f7897b942c2e4415110132e51a5
    Author:     Michael Meeks <michael.meeks@collabora.com>
    AuthorDate: Thu Aug 21 00:38:23 2014 +0100
    Commit:     Michael Meeks <michael.meeks@collabora.com>
    CommitDate: Thu Aug 21 00:39:45 2014 +0100
    
        coverity#1202729 - ensure we have exactly a one dimensional array.
    
        Change-Id: I6db8a2fb48ed7ce134a5c45c590c2ada0e19fc85

:100644 100644 980943f8812589672ba572e3171ed91598fac0b1 5ecc6f9fb3c904c3c0e4fcce5024456494098a92 M      ccache.log
:100644 100644 ff9f040d56d4d7d06cd423d107c7bd271294f912 309221fab327aba923b560fdcbb61d4e9b470894 M      commitmsg
:100644 100644 d984f9f6594a48010e77a4434ef34872a96e7654 5e8d89f7c27e5ceb0ecf91916842c03d3c4d918d M      make.log
:040000 040000 23ebe96ea7f200d930e6d06d8e10d103df8bc58d 19c554abb3f0e69850cfb3daae1391e883d63df5 M      opt
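
The bisect log above follows a binary search over an ordered history with one known-good and one known-bad end. The following is a minimal Python sketch of that search, not git's actual implementation: `first_bad` and `is_bad` are hypothetical names, and the history is assumed to be linear, with every commit after the first bad one also bad.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit by binary search.

    commits -- list ordered oldest to newest; commits[0] is known good
               and commits[-1] is known bad
    is_bad  -- callable that tests one commit (e.g. "does the file hang?")
    """
    lo, hi = 0, len(commits) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    history = list(range(100))      # stand-in for 100 commits
    tested = []

    def check(commit):
        tested.append(commit)
        return commit >= 37         # pretend commit 37 introduced the bug

    print(first_bad(history, check), len(tested))
```

Each test halves the remaining range, which is why a handful of good/bad verdicts is enough to narrow a long history down to a single commit.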
Comment 11 Matthew Francis 2015-01-12 12:44:36 UTC
The hang appears to start as of the below commit.

Adding Cc: to mstahl@redhat.com. Could you possibly have a look at this? Thanks.
(attachment 111704 does not throw an exception before this commit, but does hang while loading after it)


commit 404f16e97f1c2fcd8f9a1297bdfa46cba970467e
Author: Michael Stahl <mstahl@redhat.com>
Date:   Tue Aug 19 18:11:42 2014 +0200

    sw: ww8: fix another ~SwIndexReg() assertion
    
    If the position is the same as the body text anchor position, don't
    delete the node.  Probably something should have inserted more nodes
    between StartApo() and StopApo().
    
    Change-Id: I41110a47d840e764f6d2a24e43bf6938b1282972
Comment 12 Michael Stahl (CIB) 2015-01-12 19:21:12 UTC
fixed on master
Comment 13 Commit Notification 2015-01-12 19:22:34 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=e71668c4e642cc497206bfbe7191f64bddf31db0

sw: fdo#88005: fix check in SwWW8ImplReader::StopApo()

It will be available in 4.5.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 14 Commit Notification 2015-01-14 10:04:56 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b25967266ec1dfa3662783ddc58a4b0c3c9ba953&h=libreoffice-4-3

sw: fdo#88005: fix check in SwWW8ImplReader::StopApo()

It will be available in 4.3.6.

Comment 15 Commit Notification 2015-01-14 10:05:03 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-4":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=45722714be0abfb3ccb0b4dcbecaaaed98094343&h=libreoffice-4-4

sw: fdo#88005: fix check in SwWW8ImplReader::StopApo()

It will be available in 4.4.1.

Comment 16 Commit Notification 2015-01-19 16:28:09 UTC
Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b4d4093c68ccc948379f13690e00f787d9f02c09&h=libreoffice-4-4-0

sw: fdo#88005: fix check in SwWW8ImplReader::StopApo()

It will be available in 4.4.0.

Comment 17 Robinson Tryon (qubit) 2015-12-17 05:57:49 UTC
Migrating Whiteboard tags to Keywords: (bibisected, filter:doc)
[NinjaEdit]