Bug 116079 - Opening ODS file raises Incorrect Format exception. Opens fine with OpenOffice 4.1.3
Status: RESOLVED NOTOURBUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version (earliest affected): 5.4.5.1 release
Hardware: All All
Importance: highest major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: bibisected, bisected, regression
Duplicates: 131059
Depends on:
Blocks:
 
Reported: 2018-02-28 09:52 UTC by Mariano
Modified: 2020-07-20 06:56 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Error opening REPORT_UNIT_TO_ODS.ods file (4.05 KB, image/png)
2018-02-28 09:53 UTC, Mariano
File REPORT_UNIT_TO_ODS.ods well opened with OpenOffice 4.1.3 (55.66 KB, image/png)
2018-02-28 09:54 UTC, Mariano
ODS file that causes the error (3.60 KB, application/vnd.oasis.opendocument.spreadsheet)
2018-02-28 09:54 UTC, Mariano
bt with debug symbols (10.83 KB, text/plain)
2020-03-04 20:21 UTC, Julien Nabet
file fixed (14.22 KB, application/vnd.oasis.opendocument.spreadsheet)
2020-03-14 14:33 UTC, Julien Nabet

Description Mariano 2018-02-28 09:52:50 UTC
Description:
The attached ODS file raises an Incorrect Format error when opened with LibreOffice 5.4.5.1.
The same file opens fine with OpenOffice 4.1.3.

Regards,

Mariano

Steps to Reproduce:
1. Open the attached file, REPORT_UNIT_TO_ODS.ods

Actual Results:  
As seen in the attached file LibreOffice5.4.5.1.png

Expected Results:
As seen in the attached file OpenOffice4.1.3.png


Reproducible: Always


User Profile Reset: Yes



Additional Info:


User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0
Comment 1 Mariano 2018-02-28 09:53:51 UTC
Created attachment 140208
Error opening REPORT_UNIT_TO_ODS.ods file
Comment 2 Mariano 2018-02-28 09:54:25 UTC
Created attachment 140209
File REPORT_UNIT_TO_ODS.ods well opened with OpenOffice 4.1.3
Comment 3 Mariano 2018-02-28 09:54:47 UTC
Created attachment 140210
ODS file that causes the error
Comment 4 Xisco Faulí 2018-02-28 11:51:04 UTC
Regression introduced by:

author	Mohammed Abdul Azeem <azeemmysore@gmail.com>	2016-09-05 14:38:30 +0530
committer	Michael Meeks <michael.meeks@collabora.com>	2017-01-25 11:20:48 +0000
commit 8154953add163554c00935486a1cf5677cef2609
tree d8e148e84aa1e164a2358827085f4d9240ce5e31
parent 657eea01046c7f39ee8ca4545241372177385946
ScXMLTableRowCellContext implements fast interfaces:
Implementation of fast interfaces for contexts in path from
ScXMLImport::CreateFastContext to ScXMLTableRowCellContext.
FastParser is enabled and duplicates are avoided at all
possible places.
OOoXML filters still need those legacy paths we removed,
so I had to temporarily map them to fast elements, which
would increase their load time, but hopefully it should
help us in the long run.

Bisected with: bibisect-linux-64-5.4

Adding Cc: to Mohammed Abdul Azeem
Comment 5 QA Administrators 2019-05-13 02:50:40 UTC Comment hidden (obsolete)
Comment 6 Mariano 2019-07-04 07:14:47 UTC
This is still present.

About information:

Version: 6.2.4.2 (x64)
Build ID: 2412653d852ce75f65fbfa83fb7e7b669a126d64
CPU threads: 4; OS: Windows 10.0; UI render: default; VCL: win;
Locale: es-ES (es_ES); UI language: es-ES
Calc: threaded
Comment 7 Xisco Faulí 2020-03-03 08:36:29 UTC
*** Bug 131059 has been marked as a duplicate of this bug. ***
Comment 8 Julien Nabet 2020-03-04 19:41:43 UTC
On pc Debian x86-64 with master sources updated today, I could reproduce this.

I noticed this on console logs:
warn:fwk.desktop:319071:319071:framework/source/services/desktop.cxx:1063: Desktop disposed before terminating it
warn:fwk.desktop:319071:319071:framework/source/services/desktop.cxx:179: Desktop not terminated before being destructed
warn:sc.filter:319101:319101:sc/source/filter/xml/xmlwrap.cxx:217: SAX parse exception caught while importing: com.sun.star.xml.sax.SAXParseException message: [file:///tmp/REPORT_UNIT_TO_ODS.ods line 2]: Extra content at the end of the document

    wrapped: void PublicId:  SystemId: file:///tmp/REPORT_UNIT_TO_ODS.ods LineNumber: 2 ColumnNumber: 1
warn:xmloff.core:319101:319101:xmloff/source/core/xmlimp.cxx:739: SvXMLImport::startElement: missing context for element style:default-style
warn:xmloff.core:319101:319101:xmloff/source/core/xmlimp.cxx:739: SvXMLImport::startElement: missing context for element style:paragraph-properties
warn:xmloff.core:319101:319101:xmloff/source/core/xmlimp.cxx:739: SvXMLImport::startElement: missing context for element office:forms
warn:xmloff.core:319101:319101:xmloff/source/core/xmlimp.cxx:739: SvXMLImport::startElement: missing context for element text:bookmark
Comment 9 Julien Nabet 2020-03-04 20:21:08 UTC
Created attachment 158396
bt with debug symbols

Here's the bt from throw xml::sax::SAXParseException
Comment 10 Julien Nabet 2020-03-04 20:37:39 UTC
I noticed that meta.xml contained just:
<?xml version="1.0" encoding="UTF-8"?>

and nothing else afterwards.
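
The same check can be made without a LibreOffice build. A minimal Python sketch, assuming only the standard library (the function name malformed_xml_members is made up for illustration and is not part of LibreOffice): it opens the .ods as a ZIP package and reports every .xml stream that a strict parser rejects.

```python
import zipfile
import xml.etree.ElementTree as ET


def malformed_xml_members(ods_source):
    """Return {member name: parser error text} for XML streams that fail to parse.

    ods_source may be a path or a binary file-like object.
    """
    problems = {}
    with zipfile.ZipFile(ods_source) as package:
        for name in package.namelist():
            if not name.endswith(".xml"):
                continue  # e.g. the plain-text "mimetype" entry
            try:
                # Parse from bytes so the encoding declaration is honoured.
                ET.fromstring(package.read(name))
            except ET.ParseError as err:
                problems[name] = str(err)
    return problems


if __name__ == "__main__":
    import sys

    for member, error in malformed_xml_members(sys.argv[1]).items():
        print(f"{member}: {error}")
```

A stream that holds only the XML declaration should show up in the result, since the parser finds no root element.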
Comment 11 Julien Nabet 2020-03-04 20:59:30 UTC
With this patch, I can open the file:
diff --git a/sax/source/fastparser/fastparser.cxx b/sax/source/fastparser/fastparser.cxx
index f70995763c4c..d2cfb1417afa 100644
--- a/sax/source/fastparser/fastparser.cxx
+++ b/sax/source/fastparser/fastparser.cxx
@@ -1044,13 +1044,6 @@ void FastSaxParserImpl::parse()
         nRead = rEntity.maConverter.readAndConvert( seqOut, BUFFER_SIZE );
         if( nRead <= 0 )
         {
-            if( rEntity.mpParser != nullptr )
-            {
-                if( xmlParseChunk( rEntity.mpParser, reinterpret_cast<const char*>(seqOut.getConstArray()), 0, 1 ) != XML_ERR_OK )
-                    rEntity.throwException( mxDocumentLocator, true );
-                if (rEntity.hasException())
-                    rEntity.throwException(mxDocumentLocator, true);
-            }
             break;
         }

This part was introduced by:
https://cgit.freedesktop.org/libreoffice/core/commit/?id=82d08580e368afbc9d73da3613845a36a89b0a8c
author	Luboš Luňák <l.lunak@collabora.com>	2014-11-14 17:13:41 +0100
committer	Luboš Luňák <l.lunak@collabora.com>	2014-11-14 17:20:00 +0100
commit 82d08580e368afbc9d73da3613845a36a89b0a8c
tree ef353fcfd8d7b427a0ecf2281eb7c0264f6fc9a6
parent 37800290245fd0462295a8bbaabd9d761929fa65
switch saxparser from expat to libxml2

I don't know if it's ok or not.
nRead is sal_Int32 but it seems it can't be negative.

Also, a meta.xml containing only the XML declaration might be acceptable, whereas another XML file containing only the declaration might not be.
=> I'm a bit stuck here.
Comment 12 Xisco Faulí 2020-03-05 09:06:37 UTC Comment hidden (obsolete)
Comment 13 Noel Grandin 2020-03-12 16:43:55 UTC
If Julien is correct in comment#10, then that is technically not a well-formed XML document.
Probably we used to accept it because we were being a little sloppy.
I suggest you go up the stack to just above the parser, and check if the data-stream contains exactly that sequence of chars, and then just return OK.
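
To illustrate the kind of check meant here, a rough Python sketch (the real change would be C++ above the fast parser; the name is_declaration_only and the regular expression are assumptions made for this example only): it tests whether a byte stream consists of nothing but an XML declaration and trailing whitespace.

```python
import re

# Optional UTF-8 byte order mark, then "<?xml ... ?>", then only whitespace.
# This is a deliberately loose pattern for illustration, not a full
# implementation of the XMLDecl grammar.
_DECLARATION_ONLY = re.compile(rb"(?:\xef\xbb\xbf)?<\?xml\s[^?]*\?>\s*")


def is_declaration_only(data: bytes) -> bool:
    """True if the stream holds an XML declaration and no root element."""
    return _DECLARATION_ONLY.fullmatch(data) is not None
```

A caller could use such a test to treat the stream as empty instead of handing it to the parser; whether that leniency is desirable is a separate question.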
Comment 14 Julien Nabet 2020-03-12 20:48:47 UTC
(In reply to Noel Grandin from comment #13)
> If Julien is correct in comment#10, then that is technically not a
> well-formed XML document.
Here's what I did:
julien@debianamd:/tmp/jul$ mv REPORT_UNIT_TO_ODS.ods REPORT_UNIT_TO_ODS.zip
julien@debianamd:/tmp/jul$ unzip REPORT_UNIT_TO_ODS.zip 
Archive:  REPORT_UNIT_TO_ODS.zip
  inflating: content.xml             
  inflating: meta.xml                
  inflating: settings.xml            
  inflating: styles.xml              
  inflating: mimetype                
  inflating: META-INF/manifest.xml   
julien@debianamd:/tmp/jul$ cat meta.xml 
<?xml version="1.0" encoding="UTF-8"?>
julien@debianamd:/tmp/jul$ 

> Probably we used to accept it because we were being a little sloppy.
> I suggest you go up the stack to just above the parser, and check if the
> data-stream contains exactly that sequence of chars, and then just return OK.
Here's the bt from the method quoted in my previous comment:
#0  0x00007fffe4c69570 in sax_fastparser::FastSaxParserImpl::parse() (this=0x1b0a270) at sax/source/fastparser/fastparser.cxx:1044
#1  0x00007fffe4c68997 in sax_fastparser::FastSaxParserImpl::parseStream(com::sun::star::xml::sax::InputSource const&) (this=<optimized out>, rStructSource=...) at sax/source/fastparser/fastparser.cxx:869
#2  0x00007ffff1e0dc21 in SvXMLImport::parseStream(com::sun::star::xml::sax::InputSource const&) (this=0x1b08620, aInputSource=...) at xmloff/source/core/xmlimp.cxx:488
#3  0x00007fffe42e88c0 in ScXMLImportWrapper::ImportFromComponent(com::sun::star::uno::Reference<com::sun::star::uno::XComponentContext> const&, com::sun::star::uno::Reference<com::sun::star::frame::XModel> const&, com::sun::star::uno::Reference<com::sun::star::xml::sax::XParser> const&, com::sun::star::xml::sax::InputSource&, rtl::OUString const&, rtl::OUString const&, rtl::OUString const&, com::sun::star::uno::Sequence<com::sun::star::uno::Any> const&, bool) (this=0x7fffffff1d38, xContext=..., xModel=..., xParser=..., aParserInput=..., sComponentName=..., sDocName=..., sOldDocName=..., aArgs=..., bMustBeSuccessfull=false)
    at sc/source/filter/xml/xmlwrap.cxx:189
#4  0x00007fffe42ea712 in ScXMLImportWrapper::Import(ImportFlags, ErrCode&) (this=0x7fffffff1d38, nMode=<optimized out>, rError=...) at sc/source/filter/xml/xmlwrap.cxx:432
#5  0x00007fffe43fcecc in ScDocShell::LoadXML(SfxMedium*, com::sun::star::uno::Reference<com::sun::star::embed::XStorage> const&) (this=0x1a5ef70, pLoadMedium=0x1a65fa0, xStor=...)
    at sc/source/ui/docshell/docsh.cxx:481
#6  0x00007fffe43fde38 in ScDocShell::Load(SfxMedium&) (this=0x1a5ef70, rMedium=...) at sc/source/ui/docshell/docsh.cxx:628
#7  0x00007ffff6923107 in SfxObjectShell::LoadOwnFormat(SfxMedium&) (this=0x1a5ef70, rMedium=...) at sfx2/source/doc/objstor.cxx:3033
#8  0x00007ffff692437b in SfxObjectShell::DoLoad(SfxMedium*) (this=0x1a5ef70, pMed=0x1a65fa0) at sfx2/source/doc/objstor.cxx:674
#9  0x00007ffff6954d2d in SfxBaseModel::load(com::sun::star::uno::Sequence<com::sun::star::beans::PropertyValue> const&) (this=0x1a62260, seqArguments=...) at sfx2/source/doc/sfxbasemodel.cxx:1879
#10 0x00007ffff69eddcd in (anonymous namespace)::SfxFrameLoader_Impl::load(com::sun::star::uno::Sequence<com::sun::star::beans::PropertyValue> const&, com::sun::star::uno::Reference<com::sun::star::frame::XFrame> const&) (this=<optimized out>, rArgs=..., _rTargetFrame=...) at sfx2/source/view/frmload.cxx:680
#11 0x00007fffe54432f0 in framework::LoadEnv::impl_loadContent() (this=0x195a468) at framework/source/loadenv/loadenv.cxx:1157

Where could I bypass this, and how?
Comment 15 Julien Nabet 2020-03-12 21:26:44 UTC
I gave it a try with https://gerrit.libreoffice.org/c/core/+/90446
Comment 16 Mike Kaganski 2020-03-12 21:27:30 UTC
(In reply to Noel Grandin from comment #13)
> If Julien is correct in comment#10, then that is technically not a
> well-formed XML document.

Just removing meta.xml from the document makes it open OK.
And yes, having no elements in the XML after prolog (here consisting of declaration) makes the XML not well-formed [1].

[1] https://www.w3.org/TR/2008/REC-xml-20081126/#sec-well-formed
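
For reference, removing the stream can be scripted. A minimal Python sketch, assuming only the standard library (copy_without_member is a made-up helper name): it copies the package entry by entry, skipping meta.xml, and keeps each entry's order and compression method so that the mimetype entry stays first and stored. It does not touch META-INF/manifest.xml.

```python
import zipfile


def copy_without_member(src, dst, skip="meta.xml"):
    """Copy the ZIP package src to dst, leaving out the entry named skip.

    src and dst may be paths or binary file-like objects.
    """
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            if info.filename == skip:
                continue
            # Passing the ZipInfo keeps the entry's name, timestamp and
            # compression method from the source archive.
            zout.writestr(info, zin.read(info.filename))
```

Usage would be along the lines of copy_without_member("REPORT_UNIT_TO_ODS.ods", "without_meta.ods"), where the output name is arbitrary.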
Comment 17 Mike Kaganski 2020-03-12 21:31:36 UTC
Personally I don't see why:

1. this has such a high importance. Isn't it more important to file bugs to whatever tries to generate invalid documents?
2. this should be fixed at all. Why accept broken data - unless it's really impossible to fix the root (prevent these documents from appearing)?
Comment 18 Julien Nabet 2020-03-12 21:41:32 UTC
(In reply to Mike Kaganski from comment #16)
> (In reply to Noel Grandin from comment #13)
> > If Julien is correct in comment#10, then that is technically not a
> > well-formed XML document.
> 
> Just removing meta.xml from the document makes it open OK.
> And yes, having no elements in the XML after prolog (here consisting of
> declaration) makes the XML not well-formed [1].
> 
> [1] https://www.w3.org/TR/2008/REC-xml-20081126/#sec-well-formed

Indeed.
In this case, perhaps we should change the patch to replace the return part with a SAL_WARN?
Comment 19 Julien Nabet 2020-03-14 14:33:34 UTC
Created attachment 158677
file fixed

Here's what I did:
- applied my abandoned patch locally to be able to open the file
- opened the file
- saved it under another name
- removed the abandoned patch from the local sources
- opened the new file
Comment 20 Eike Rathke 2020-05-28 15:07:18 UTC
Maybe just a personal opinion, but I wouldn't do anything about this. The document is broken, generated by a broken implementation. Unfortunately we don't know the generator because that information is supposed to be stored in the <meta:generator> element, which the generator chose not to include.

To the original bug reporter:
The file name REPORT_UNIT_TO_ODS.ods suggests some ODF/.ods generating report tool; what is it? I'd recommend filing a bug against it if it's still in use, otherwise forget about it.
The document fixed by Julien (thanks!) is now attached, I think we can close this bug as notabug/wontfix/notourbug/whatever.
Comment 21 Xisco Faulí 2020-07-16 15:02:23 UTC
Hello Mariano,
Could you please explain how the document was generated?
Comment 22 Mariano 2020-07-20 06:56:53 UTC
Hi Xisco, the document is generated with Jasperserver 7.1.0 Community Version, by exporting a document.

Regards,

Mariano