Bug 75972 - FILEOPEN: SAXParseException on one .DOCX (summary in comment 7)
Status: RESOLVED INSUFFICIENTDATA
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 4.2.1.1 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard: summaryUpdate interoperability
Keywords: filter:docx, haveBacktrace
Depends on:
Blocks: DOCX-SAXParse DOCX-Opening
Reported: 2014-03-10 09:36 UTC by Orbel
Modified: 2020-01-03 16:46 UTC (History)
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
backtrace with symbols from SIGABRT (15.52 KB, text/plain)
2014-03-10 17:26 UTC, Terrence Enger
compressed DOCX and PDF (8.22 MB, application/x-7z-compressed)
2016-01-11 10:13 UTC, Mike Kaganski
massaged terminal output (5.71 KB, text/plain)
2019-06-05 11:39 UTC, Terrence Enger

Description Orbel 2014-03-10 09:36:54 UTC
The attached DOCX file does not open in LibreOffice and immediately offers recovery, which still does not resolve the issue.
Comment 1 Orbel 2014-03-10 09:42:22 UTC
Since the attachment is too big, please follow the link below to download the issue DOCX file:
https://drive.google.com/file/d/0B6ccfQG2Kep-Nzk5QW1sc3kxNkk/edit?usp=sharing
Comment 2 Urmas 2014-03-10 10:44:13 UTC
The document appears corrupted. Does it open at all?
Comment 3 Terrence Enger 2014-03-10 17:26:48 UTC
Created attachment 95522
backtrace with symbols from SIGABRT

With master commit 806f4d8, fetched 2014-03-04, configured as:
    --enable-option-checking=fatal
    --enable-dbgutil
    --enable-crashdump
    --without-system-postgresql
    --without-myspell-dicts
    --with-extra-buildid
    --without-doxygen
    --with-external-tar=/home/terry/lo_hacking/git/src
built and running on debian-wheezy 64-bit, I have managed to provoke a
SIGABRT.  The interesting part of the terminal output is:
    /usr/include/c++/4.7/debug/vector:366:error: attempt to access an element 
        in an empty container.

    Objects involved in the operation:
    sequence "this" @ 0x0x39300c8 {
      type = NSt7__debug6vectorIN5boost10shared_ptrINS0_IiSaIiEEEEESaIS5_EEE;
    }
    Application Error


    Fatal exception: Signal 6

This is another case of an assertion raised by a STL debug container.
It may not be exactly the crash originally reported, but hopefully it
happened earlier and more informatively.
Comment 4 QA Administrators 2014-10-05 23:05:41 UTC Comment hidden (obsolete)
Comment 5 Robinson Tryon (qubit) 2014-10-10 04:09:30 UTC
(In reply to Orbel from comment #0)
> The attached DOCX file does not open in LibreOffice and immediately offers
> recovery, which still does not resolve the issue.
> ...
> https://drive.google.com/file/d/0B6ccfQG2Kep-Nzk5QW1sc3kxNkk/edit?usp=sharing

Trying to open the given docx file crashes LibreOffice 4.3.2.2 on Ubuntu 14.04. It appears that there is some doubt about whether this file is actually validly-formatted, so I'll tentatively mark it as NEW.
Comment 6 QA Administrators 2015-10-14 19:56:29 UTC Comment hidden (obsolete)
Comment 7 Terrence Enger 2015-11-18 03:07:49 UTC
Summary
-------

Over time, people have reported various problems opening the file
linked from comment 1 ...

  - description : does not open
  - c#2         : document appears corrupted
  - c#3         : SIGABRT, access an element in an empty container
  - c#5         : crashes. doubt about validity of file
  - c#7 (here!) : SAXParseException

New comment
-----------

In daily dbgutil repository version 2015-11-13, upon attempt to open
the file, LO displays a message box (newlines added) ...

    File format error found at unsatisfied query for interface of type
        com.sun.star.lang.XComponent!
    SAXParseException: '[word/endnotes.xml line 2]: unknown error',
        Stream 'word/endnotes.xml', Line 2, Column 169834
    SAXParseException: '[word/document.xml line 2]: unknown error',
        Stream 'word/document.xml', Line 2, Column 139175(row,col).
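
For anyone who wants to check whether the XML parts themselves are well-formed, here is a minimal sketch in Python. It is not part of the original report: the function name `check_docx_parts` is hypothetical, and it relies on Python's bundled expat parser, which may accept or reject input differently from the parser LibreOffice uses. A clean result here therefore does not rule out the SAXParseException shown above.

```python
import zipfile
from xml.parsers import expat


def check_docx_parts(source):
    """Parse every XML part of a DOCX (a zip package) with expat.

    `source` is a path or a binary file-like object.
    Returns a dict mapping part name to None when the part parsed cleanly,
    or to a (line, column, message) tuple for the first error found.
    """
    results = {}
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            # Only look at XML parts and relationship parts.
            if not name.endswith((".xml", ".rels")):
                continue
            parser = expat.ParserCreate()
            try:
                parser.Parse(package.read(name), True)
                results[name] = None
            except expat.ExpatError as err:
                results[name] = (err.lineno, err.offset,
                                 expat.ErrorString(err.code))
    return results
```

Calling `check_docx_parts("problem.docx")` (the file name is a placeholder) and printing the entries that are not `None` gives a position to compare against the line and column in the message box.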


officeotron reports one error in "Checking OPC Package" (newlines
added):

    Entry with MIME type
    "application/vnd.openxmlformats-package.core-properties+xml"
    has unrecognized relationship type
    "http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties"
    (see ISO/IEC 29500-1:2008, Clause 15.2.12.1)
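
The relationship that officeotron flags is declared in the package-level `_rels/.rels` part. The following is a minimal sketch for listing those declarations, assuming the file is an ordinary OPC zip package. `package_relationships` is a hypothetical helper name, not an officeotron or LibreOffice API.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the OPC relationships part.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def package_relationships(source):
    """Return (Type, Target) pairs from the package-level _rels/.rels part.

    `source` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("_rels/.rels"))
    return [(rel.get("Type"), rel.get("Target"))
            for rel in root.iter(RELS_NS + "Relationship")]
```

Printing the pairs shows which relationship type URI the producing application wrote for the core-properties part.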


I am changing bug summary from
    The attached DOCX file does not open in LibreOffice
to
    FILEOPEN: SAXParseException on one .DOCX (summary in comment 07)
and adding whiteboard summaryUpdate.
Comment 8 Mike Kaganski 2016-01-11 10:13:03 UTC
Created attachment 121849
compressed DOCX and PDF

The file opens with Word. It contains 5013 pages. Word works with the document very slowly; repagination takes ~5 min. The archive also contains a PDF generated with Word.

On my system, it opens with Version: 5.0.4.2 (x64)
Build ID: 2b9802c1994aa0b7dc6079e128979269cf95bc78
Locale: ru-RU (ru_RU)
(opening takes ~12 mins; after that it shows the first pages), and immediately crashes with SEH Exception: ACCESS VIOLATION.
A screenshot is also in the attachment.
Comment 9 QA Administrators 2018-06-18 02:42:44 UTC Comment hidden (obsolete)
Comment 10 Julien Nabet 2019-05-15 11:56:55 UTC
On Win 10 with master sources updated yesterday, I don't reproduce the crash.
However, it takes a long time to open (several minutes).

Any update with a recent LO version (e.g. 6.2.3)?
Comment 11 Terrence Enger 2019-06-05 11:39:12 UTC
Created attachment 151934
massaged terminal output

I do not see the bug in a local build of commit 74288f5a from
2019-04-12, configured with --enable-debug, built and running on
debian-buster.

However, just opening the test file and quitting LibreOffice wrote
47796 lines to the terminal.  Attachment is the result of:

    < $outfile sort | uniq --count | sort --numeric --reverse
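
A rough Python equivalent of that pipeline, for systems without GNU coreutils. This is only a sketch: `count_lines` is a hypothetical name, and lines with equal counts come out in order of first appearance rather than in the order the shell pipeline would give.

```python
from collections import Counter


def count_lines(lines):
    """Count identical lines and return (line, count) pairs, most frequent first."""
    counts = Counter(line.rstrip("\n") for line in lines)
    return counts.most_common()
```

It can be fed an open file, for example `count_lines(open(path))`, where `path` is the captured terminal output.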
Comment 12 QA Administrators 2019-12-03 03:32:48 UTC Comment hidden (obsolete)
Comment 13 QA Administrators 2020-01-03 03:24:54 UTC
Dear Orbel,

Please read this message in its entirety before proceeding.

Your bug report is being closed as INSUFFICIENTDATA due to inactivity and
a lack of information which is needed in order to accurately
reproduce and confirm the problem. We encourage you to retest
your bug against the latest release. If the issue is still
present in the latest stable release, we need the following
information (please ignore any that you've already provided):

a) Provide details of your system including your operating
   system and the latest version of LibreOffice that you have
   confirmed the bug to be present

b) Provide easy to reproduce steps – the simpler the better

c) Provide any test case(s) which will help us confirm the problem

d) Provide screenshots of the problem if you think it might help

e) Read all comments and provide any requested information

Once all of this is done, please set the bug back to UNCONFIRMED
and we will attempt to reproduce the issue. Please do not:

a) respond via email 

b) update the version field in the bug or any of the other details
   on the top section of our bug tracker

Warm Regards,
QA Team

MassPing-NeedInfo-FollowUp