Bug 89088 - FILEOPEN: Section with shape and text in specific DOCX lost
Status: CLOSED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: 4.4.0.0.beta1 (earliest affected)
Hardware: Other All
Importance: medium major
Assignee: Miklos Vajna
URL:
Whiteboard: target:5.1.0 target:5.0.4
Keywords: bibisected, bisected, regression
Depends on: 89100
Blocks:
Reported: 2015-02-03 16:28 UTC by Timur
Modified: 2016-10-25 19:19 UTC (History)
CC List: 5 users

See Also:
Crash report or crash signature:


Description Timur 2015-02-03 16:28:36 UTC
While reviewing attachment 97273 in bug 77374, I noticed that the content of the last section on the last page (starting from "Ihre kleine Schwester...", including the ice cream picture) is lost on import. This starts with LO 4.4.0 beta1 and is confirmed with LO 4.5.0 master.
It is a regression, as that section opens correctly up to 4.3.6.
Comment 1 Buovjaga 2015-02-07 09:40:02 UTC
Confirmed.

4.5 fails to open & says:
File format error found at Cannot extract an Any(void) to boolean!
SAXParseException: "[word/document.xml line 2]: unknown error" stream "word/document.xml", Line 2, Column 108512(row,col).

Win 7 Pro 64-bit Version: 4.5.0.0.alpha0+
Build ID: 99c00b090533da9818444be2831b8da0e713e5f9
TinderBox: Win-x86@62-TDF, Branch:MASTER, Time: 2015-02-04_06:38:53
Locale: fi_FI

Win 7 Pro 64-bit, LibO Version: 4.4.0.3
Build ID: de093506bcdc5fafd9023ee680b8c60e3e0645d7
Locale: fi_FI

Ubuntu 14.10 64-bit 
Version: 4.4.0.3
Build ID: 40m0(Build:3)
Locale: en_US
Comment 2 Rostislav 'R.Yu.' Okulov 2015-02-10 14:02:37 UTC
git bisect start
# bad: [4a3091e95fa263d3e2dd81e56e83996f0bb12287] source-hash-2b5b04e1e62914bf0902dfd7943cdc44499c47a6
git bisect bad 4a3091e95fa263d3e2dd81e56e83996f0bb12287
# good: [812c4a492375ac47b3557fbb32f5637fc89d60d9] source-hash-dea4a3b9d7182700abeb4dc756a24a9e8dea8474
git bisect good 812c4a492375ac47b3557fbb32f5637fc89d60d9
# good: [5d0dfb8e62ae61a240f8313c594d4560e7c8e048] source-hash-0c6cd530de13f80795881f61064f1bf1dcc4ea81
git bisect good 5d0dfb8e62ae61a240f8313c594d4560e7c8e048
# bad: [7dfacd0b8bd828331d74c0f79de6e8924bc4e6a5] source-hash-f93ce4f7eb90093d0ea3115d0a1c614612676dbd
git bisect bad 7dfacd0b8bd828331d74c0f79de6e8924bc4e6a5
# bad: [1a63057f6378db7c6b8af1171b7b140f7583f246] source-hash-59f84b4a2c082382767f12e0c7a06a3f0b52e721
git bisect bad 1a63057f6378db7c6b8af1171b7b140f7583f246
# bad: [3787e4f82e47eaf4fa454afdca671272e50f875b] source-hash-0e09134a4a4cbb0639fc586c560c6fb2765487be
git bisect bad 3787e4f82e47eaf4fa454afdca671272e50f875b
# bad: [5b2c61f6b34f03146c2d03da03a7b7f546ce56b8] source-hash-abf842e4b125b9f863ea4c2af17ad6ac7d82b15e
git bisect bad 5b2c61f6b34f03146c2d03da03a7b7f546ce56b8
# good: [1022c199a7d20dde7600f08007b5e2cac81e55f4] source-hash-df903c3e2084d8cc33e3935a1668b8b46e25201f
git bisect good 1022c199a7d20dde7600f08007b5e2cac81e55f4
# skip: [5ab97df01d7167dc7b472cf0f5b21fea4fccd232] source-hash-b651ed7a6700b560052b67102a65f06a498dd182
git bisect skip 5ab97df01d7167dc7b472cf0f5b21fea4fccd232
# good: [7e3ee4ad7f79565293c1ff9c20e099101435d3c1] source-hash-312ffe07bbef6b8dbc14ce38c0a726f69dd90946
git bisect good 7e3ee4ad7f79565293c1ff9c20e099101435d3c1
# bad: [a40d8c51092e2ab68f3c483b782e5eac0fdf5e3b] source-hash-f18a86759b20d13c660a6224fe26451cb64bd92d
git bisect bad a40d8c51092e2ab68f3c483b782e5eac0fdf5e3b
# first bad commit: [a40d8c51092e2ab68f3c483b782e5eac0fdf5e3b] source-hash-f18a86759b20d13c660a6224fe26451cb64bd92d
 a40d8c51092e2ab68f3c483b782e5eac0fdf5e3b is the first bad commit
commit a40d8c51092e2ab68f3c483b782e5eac0fdf5e3b
Author: Bjoern Michaelsen <bjoern.michaelsen@canonical.com>
Date:   Fri Oct 17 23:46:32 2014 +0000

    source-hash-f18a86759b20d13c660a6224fe26451cb64bd92d
    
    commit f18a86759b20d13c660a6224fe26451cb64bd92d
    Author:     David Tardon <dtardon@redhat.com>
    AuthorDate: Wed Jun 4 18:22:20 2014 +0200
    Commit:     David Tardon <dtardon@redhat.com>
    CommitDate: Wed Jun 4 18:26:08 2014 +0200
    
        add random unversioned test files for libcdr
    
        Change-Id: I9db735d7363e912edc1528c8964e665f1a4c8056

:100644 100644 388136c54578ecf5650c27786fccff57a358cc03 3e3fe18911b22ec443fbfb13c4e5ab388f886716 M      ccache.log
:100644 100644 9ca8f86943a4b2a52ff3419f520a360742472dbd 81edb49c526a5f3b8e5f42a95009f9223b823aa7 M      commitmsg
:100644 100644 8741fb45db0470457c4af5ed81971f338c178280 2348c478c7db324c7a85e14de16a9418051bb2a4 M      make.log
:040000 040000 0d68cdeee5c5ce83b259c3236701845784aab600 e4166a39ff8a0c551561031f2e73a741dbd41a67 M      opt
Comment 3 Matthew Francis 2015-02-19 14:48:53 UTC
This appears to have started as of the below commit.

Adding Cc: to vmiklos@collabora.co.uk; Could you possibly take a look at this? Thanks

    commit 866a4436d3cfac1ff42d7996250bf96fb703aeaa
    Author:     Miklos Vajna <vmiklos@collabora.co.uk>
    AuthorDate: Wed Jun 4 14:18:33 2014 +0200
    Commit:     Miklos Vajna <vmiklos@collabora.co.uk>
    CommitDate: Wed Jun 4 14:30:25 2014 +0200
    
        oox: handle textboxes in ShapeContextHandler::endFastElement()
    
        DOCX shape import normally works by oox creating the shape, then
        writerfilter handling the shape text. For drawingML shapes, having shape
        text, this a bit more complicated, as there are shape properties after
        the shape text as well.
    
        ShapeContextHandler::endFastElement() assumed that shape text is only
        possible on css.text.TextFrame shapes: also handle shapes having a
        TextBox as well.
    
        sw/qa/extras/ooxmlimport/data/mce-nested.docx is a reproducer for this
        problem (group shape missing), when TextBoxes are enabled by default in
        oox.
    
        Change-Id: I7a412b31965cf363da0b0c7fcc732741f2037542
Comment 4 Timur 2015-03-09 09:10:15 UTC
Can you please change the title to something more appropriate, like "FILEOPEN: Section with shape and text in DOCX lost"?
Comment 5 Miklos Vajna 2015-10-30 21:24:57 UTC
The problem is that oox::shape::ShapeContextHandler::endFastElement() assumes that all XShape implementations provide a bool TextBox UNO property, but that's not the case. I'll take care of this.
Comment 6 Commit Notification 2015-11-02 08:15:56 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=4cae3689d4d78fabe6529c9df03c438b1e9d1611

tdf#89088 DOCX import: fix missing text due to throwing ShapeContextHandler

It will be available in 5.1.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 7 Commit Notification 2015-11-11 10:34:40 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "libreoffice-5-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=2c43db1ecfd9621f4eb6775a4f682d00987692f4&h=libreoffice-5-0

tdf#89088 DOCX import: fix missing text due to throwing ShapeContextHandler

It will be available in 5.0.4.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 9 Adolfo Jayme Barrientos 2015-12-19 07:48:47 UTC
backportRequest:4.4.7 declined. There are no more releases of the 4.4 branch planned.