Bug 123163 - CRASH when FILEOPEN of DOCX from scan and Nuance PDF Converter that also can't be opened with MSO
Summary: CRASH when FILEOPEN of DOCX from scan and Nuance PDF Converter that also can'...
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version: Inherited From OOo (earliest affected)
Hardware: All All
Importance: high major
Assignee: Caolán McNamara
URL:
Whiteboard: target:6.3.0 target:6.2.1 target:6.1.6
Keywords: filter:docx
Depends on:
Blocks: DOCX-Opening
Reported: 2019-02-04 17:41 UTC by Mike Kupfer
Modified: 2019-02-11 00:01 UTC (History)

See Also:
Crash report or crash signature: ["SwPageFrame::RemoveDrawObjFromPage(SwAnchoredObject&)","SwPageFrame::RemoveDrawObjFromPage(SwAnchoredObject &)"]


Attachments
test case (333.27 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2019-02-06 02:45 UTC, Mike Kupfer
Description Mike Kupfer 2019-02-04 17:41:12 UTC
Description:
I have a document that was scanned on an Epson scanner. The resulting PDF was then converted to a .docx file by Nuance PDF Converter. When I open the .docx file using LO, there is a pause, then LO crashes.

This workflow has worked countless times with other documents, so the problem is something peculiar to this document. It may be that the .docx file is corrupted, but still, LO shouldn't crash.

Steps to Reproduce:
1. Open the file. I have gotten crashes using all of the following methods: double-click on file in file browser (Thunar), "libreoffice <file>" from shell, and clicking on the document in the "recent documents" list.


Actual Results:
LO starts to display the document, then there is a pause, then LO crashes.

Expected Results:
LO displays the document, lets me edit it or save it as text.


Reproducible: Always


User Profile Reset: Yes


OpenGL enabled: Yes

Additional Info:
See crashreport.libreoffice.org/stats/crash_details/b6ff9379-6fa8-4f24-9a34-4dfef3c0f5cd.

I can provide the document if you need it.  It's a scan of copyrighted material (for my files), so I'd rather not post it in a public location.

I've reproduced the crash on:

* Debian Stretch on baremetal:
  - the bundled LO 5.2.7+patches
alto$ glxinfo | grep OpenGL
OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD RS780 (DRM 2.49.0 / 4.9.0-8-amd64, LLVM 3.9.1)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 13.0.6
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 13.0.6
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 13.0.6
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:

* Debian Stretch in a VirtualBox guest
  - host is running VBox 6.0.4, guest has the 5.2.16 Guest Additions
  - LO 6.1.4 downloaded from a libreoffice.org mirror
stretch$ glxinfo | grep OpenGL
OpenGL vendor string: Humper
OpenGL renderer string: Chromium
OpenGL version string: 2.1 Chromium 1.9
OpenGL shading language version string: 1.30
OpenGL extensions:

* Debian Buster in a VirtualBox guest
  - same host
  - bundled LO 6.1.5 RC1

I did not get a crash on Debian Jessie running in a VirtualBox guest (bundled LO 4.3.3+patches).

All the VirtualBox guests have 3D acceleration enabled.
Comment 1 Julien Nabet 2019-02-04 20:00:34 UTC Comment hidden (obsolete)
Comment 2 Mike Kupfer 2019-02-05 00:58:19 UTC
(In reply to Julien Nabet from comment #1)
> Would it be possible you attach the file?
> Of course, have in mind to remove any private/confidential part from it (if
> you have MsOffice since LO crashes).

I'll see what I can do. It's possible that after editing, the document will get saved in a form that no longer causes a crash. Just copying the document using LO 4.3.3 (using either File>Save As or File>Save a Copy) results in a copy that does not cause a crash.
Comment 3 Xisco Faulí 2019-02-05 16:28:26 UTC Comment hidden (obsolete)
Comment 4 Mike Kupfer 2019-02-06 02:45:57 UTC
Created attachment 148938
test case

I've attached a test case with the copyrighted material removed.
Comment 5 Dieter 2019-02-06 08:38:10 UTC
I confirm the crash with

Version: 6.3.0.0.alpha0+ (x64)
Build ID: 411f3a050ac2be598019d512f8ccfe041080c28f
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win; 
TinderBox: Win-x86_64@42, Branch:master, Time: 2019-01-14_03:17:11
Locale: en-US (de_DE); UI-Language: en-US
Calc: threaded

and

Version: 4.4.7.2
Build-ID: f3153a8b245191196a4b6b9abd1d0da16eead600
Gebietsschema: de_DE
Comment 6 Xisco Faulí 2019-02-06 10:54:24 UTC
Also reproduced in

Version 4.1.0.0.alpha0+ (Build ID: efca6f15609322f62a35619619a6d5fe5c9bd5a)

LibreOffice 3.3.0 
OOO330m19 (Build:6)
tag libreoffice-3.3.0.4
Comment 7 Xisco Faulí 2019-02-06 11:35:41 UTC
If I remove the code from line 1100 to 1116 in https://opengrok.libreoffice.org/xref/core/sw/source/core/layout/flylay.cxx?r=f2cd9c0c#1100, it no longer crashes.
It seems that m_pSortedObjs is used after it has been destroyed?

@Caolán, @Noel, what do you think ?
Comment 8 Noel Grandin 2019-02-06 11:39:20 UTC
@Xisco, if that is the case, put a breakpoint on the destructor of SwSortedObjs, i.e. on SwSortedObjs::~SwSortedObjs, and see when it gets destroyed compared to when it is used.
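
For readers who want to see what "destroyed compared to when it is used" looks like in practice, here is a minimal standalone sketch. It replaces the debugger breakpoint with an event log written from the constructor and destructor. All names here (EventLog, SortedObjs, Page, TraceLifetime) are hypothetical stand-ins and not the real Writer classes, and the later access is only recorded, never dereferenced.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical event log standing in for debugger breakpoints.
using EventLog = std::vector<std::string>;

// Stand-in for a sorted-object container; NOT the real SwSortedObjs.
struct SortedObjs
{
    EventLog& m_rLog;
    explicit SortedObjs(EventLog& rLog) : m_rLog(rLog) { m_rLog.push_back("ctor"); }
    ~SortedObjs() { m_rLog.push_back("dtor"); }  // where the breakpoint would sit
};

// Stand-in for a page that owns the container; NOT the real SwPageFrame.
class Page
{
    EventLog& m_rLog;
    std::unique_ptr<SortedObjs> m_pSortedObjs;

public:
    explicit Page(EventLog& rLog)
        : m_rLog(rLog), m_pSortedObjs(std::make_unique<SortedObjs>(rLog)) {}

    // Destroys the container early, as the hypothesis in comment 7 suggests.
    void Release() { m_pSortedObjs.reset(); }

    // Records an access without dereferencing a dead pointer.
    void Touch() { m_rLog.push_back(m_pSortedObjs ? "use" : "use-after-dtor"); }
};

// Runs the suspicious sequence and returns the ordered trace.
EventLog TraceLifetime()
{
    EventLog aLog;
    {
        Page aPage(aLog);
        aPage.Touch();    // valid access
        aPage.Release();  // container destroyed here
        aPage.Touch();    // the access that would crash in real code
    }
    return aLog;
}
```

If "dtor" shows up in the trace before the last access, the hypothesis holds for that code path.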
Comment 9 Caolán McNamara 2019-02-06 12:35:40 UTC
https://gerrit.libreoffice.org/#/c/67452/ for me at least makes it not crash (needs https://gerrit.libreoffice.org/#/c/67450/ for additional assert under dbgutil)
Comment 10 Commit Notification 2019-02-06 13:54:53 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/a96014f1365e411d700a7119dca63cdbb51931dc%5E%21

Resolves: tdf#123163 avoid null deref

It will be available in 6.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
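
As an illustration only of what an "avoid null deref" guard generally looks like, the sketch below checks a lazily allocated container before using it. This is not the committed patch (see the commit link above for that). PageFrame, AnchoredObject, AppendObj, RemoveObj and ObjCount are hypothetical names, and the assumption that the container is released once it becomes empty is made here for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins; these are NOT the real Writer classes.
struct AnchoredObject { int id; };

class PageFrame
{
    // Created on first use and released again once empty (an assumption
    // of this sketch), so the pointer is legitimately null at times.
    std::unique_ptr<std::vector<AnchoredObject*>> m_pSortedObjs;

public:
    void AppendObj(AnchoredObject& rObj)
    {
        if (!m_pSortedObjs)
            m_pSortedObjs = std::make_unique<std::vector<AnchoredObject*>>();
        m_pSortedObjs->push_back(&rObj);
    }

    // Returns true if the object was registered and has been removed.
    bool RemoveObj(AnchoredObject& rObj)
    {
        // Guard: a malformed document may trigger a removal for an object
        // that was never registered, or after the list was released.
        if (!m_pSortedObjs)
            return false;

        auto it = std::find(m_pSortedObjs->begin(), m_pSortedObjs->end(), &rObj);
        if (it == m_pSortedObjs->end())
            return false;
        m_pSortedObjs->erase(it);

        // Release the container once empty; later calls hit the guard above.
        if (m_pSortedObjs->empty())
            m_pSortedObjs.reset();
        return true;
    }

    std::size_t ObjCount() const { return m_pSortedObjs ? m_pSortedObjs->size() : 0; }
};
```

With the guard in place, removing an object twice, or removing one that was never appended, returns false instead of dereferencing a null pointer.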
Comment 11 Caolán McNamara 2019-02-06 13:58:26 UTC
That "works for me", so I'll claim that it's fixed.
Comment 12 Commit Notification 2019-02-06 14:31:19 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/+/77508eee7687462e991c4fac3ba6ae8ec2650987%5E%21

fix assert seen on opening attachment from tdf#123163

It will be available in 6.3.0.
Comment 13 Luke 2019-02-06 14:32:51 UTC
For historical purposes, note that this test case is a corrupt OOXML file. Word reports unrecoverable errors in: word/document.xml

If any regressions are reported, this fix should be reverted immediately.
Comment 14 Xisco Faulí 2019-02-06 14:43:51 UTC
(In reply to Luke from comment #13)
> For historical purposes, note that this test case is a corrupt OOXML file.
> Word reports unrecoverable errors in: word/document.xml
> 
> If any regressions are reported, this fix should be reverted immediately.

Yep, I didn't check that but you're right, MSO 2010 can't open it. Anyway, it shouldn't crash...
There are almost 600 crashes in https://crashreport.libreoffice.org/stats/signature/SwPageFrame::RemoveDrawObjFromPage(SwAnchoredObject%20&)
Comment 15 Commit Notification 2019-02-07 09:53:47 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "libreoffice-6-2":

https://git.libreoffice.org/core/+/5a5c54b755f09984ecc38bf1f800d185456128b3%5E%21

fix assert seen on opening attachment from tdf#123163

It will be available in 6.2.1.
Comment 16 Commit Notification 2019-02-07 09:53:57 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "libreoffice-6-1":

https://git.libreoffice.org/core/+/45510866fedac63014a3120a1130dbea9fd803ee%5E%21

fix assert seen on opening attachment from tdf#123163

It will be available in 6.1.6.
Comment 17 Commit Notification 2019-02-07 10:01:12 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "libreoffice-6-2":

https://git.libreoffice.org/core/+/e9534eddcec78e5f6f551c848fa18f07a298ccfa%5E%21

Resolves: tdf#123163 avoid null deref

It will be available in 6.2.1.
Comment 18 Commit Notification 2019-02-07 10:01:19 UTC
Caolán McNamara committed a patch related to this issue.
It has been pushed to "libreoffice-6-1":

https://git.libreoffice.org/core/+/e6e8f02407ea781a3634fd2669ad6467e9587db4%5E%21

Resolves: tdf#123163 avoid null deref

It will be available in 6.1.6.
Comment 19 Xisco Faulí 2019-02-07 13:59:43 UTC
Verified in

Version: 6.3.0.0.alpha0+
Build ID: 6287a4f0d18df7f195d1f14b7c24536317463a23
CPU threads: 4; OS: Linux 4.15; UI render: default; VCL: gtk3; 
Locale: ca-ES (ca_ES.UTF-8); UI-Language: en-US
Calc: threaded

@Caolán, thanks for fixing this!!
Comment 20 Mike Kupfer 2019-02-11 00:01:08 UTC
I've verified the fix against my original document (using the tinderbox build with git cb12ed1f from master), and it does work now.  Thanks!