Bug 62336 - Command line conversion to HTML fails + crash on some platforms for certain document(s)
Summary: Command line conversion to HTML fails + crash on some platforms for certain d...
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version: 4.0.1.2 release (earliest affected)
Hardware: x86-64 (AMD64) All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard: target:4.1.0 target:4.0.4
Keywords: regression
Depends on:
Blocks:
 
Reported: 2013-03-14 11:14 UTC by Sorinello
Modified: 2013-04-30 16:40 UTC

See Also:
Crash report or crash signature:


Attachments
file which works on 3.6.2.2 but does not work on 4.0.1.2 (505.82 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-03-14 11:14 UTC, Sorinello
minimalish test case for out of bound nRow (20.86 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-04-27 10:23 UTC, Michael Meeks

Description Sorinello 2013-03-14 11:14:02 UTC
Created attachment 76522 [details]
file which works on 3.6.2.2 but does not work on 4.0.1.2

I have a set of documents in different formats (all from Microsoft).

When I convert them to html files, I use the following command:

libreoffice --headless --convert-to htm:HTML --outdir /home/user/LOfiles/ myFile-DOCX.docx

While this command works for all my documents using LO 3.6.2.2, it does not work for all files when using LO 4.0.1.2.

By not working I mean that I get the following message in my console:

user@linux:~/LOfiles$ libreoffice4.0 --headless --convert-to htm:HTML  ~/LOfiles/004-FREE-French-Sanskrit-Tables-NoHeader-250Pages-DOCX.docx 
convert /user/xwiki/LOfiles/004-FREE-French-Sanskrit-Tables-NoHeader-250Pages-DOCX.docx -> /home/user/LOfiles/004-FREE-French-Sanskrit-Tables-NoHeader-250Pages-DOCX.htm using HTML

The problem is that no output file is produced. The conversion works for other documents, but not for this particular file.

The file is valid: I can open it in LO Writer and even save it as HTML successfully from Writer.

I have attached the file. So this works on 3.6.2.2 but does not work with 4.0.1.2.

If you need more details, please tell me.
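
Since the console prints the same "convert ... using HTML" line even though no file appears, a batch run can check for the output file explicitly. The following is only a minimal sketch in Python, assuming the command line shown above; the helper names `expected_output` and `convert_to_html` and the `soffice_bin` parameter are hypothetical and not part of LibreOffice.

```python
import subprocess
from pathlib import Path


def expected_output(src: Path, outdir: Path, ext: str = "htm") -> Path:
    """Path the converter is expected to write: same base name, new extension."""
    return outdir / (src.stem + "." + ext)


def convert_to_html(src: Path, outdir: Path, soffice_bin: str = "libreoffice") -> Path:
    """Run the headless conversion and fail loudly if no file was produced."""
    target = expected_output(src, outdir)
    if target.exists():
        target.unlink()  # avoid mistaking a stale file for a fresh result
    # Same options as in the report; soffice_bin is an assumed binary name.
    cmd = [soffice_bin, "--headless", "--convert-to", "htm:HTML",
           "--outdir", str(outdir), str(src)]
    result = subprocess.run(cmd, capture_output=True, text=True)
    # The console message alone does not prove success: check the file itself.
    if not target.is_file() or target.stat().st_size == 0:
        raise RuntimeError(
            f"no output for {src} (exit {result.returncode}):\n{result.stderr}"
        )
    return target
```

Calling `convert_to_html(Path("myFile-DOCX.docx"), Path("/home/user/LOfiles"))` would then either return the path of the generated .htm file or raise, which makes the failing documents easy to list in a batch run.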
Comment 1 Jorendc 2013-04-16 17:01:46 UTC
Hi,

Thanks for reporting.

I can reproduce this behavior using Linux Mint 14 x64 with LibreOffice Version: 4.1.0.0.alpha0+ Build ID: 4b73d334a9c5d8ae1fe16b2cc04100b9f333595 


joren@joren-System-Product-Name ~/core/install/program $ ./soffice --headless --convert-to htm:HMTL --outdir ~/Documenten/ ~/Downloads/004-FREE-French-Sanskrit-Tables-NoHeader-250Pages-DOCX.docx 
Warning: failed to launch javaldx - java may not function correctly
warn:legacy.osl:6699:1:oox/source/docprop/docprophandler.cxx:320: For now unexpected tags are ignored!
warn:legacy.osl:6699:1:oox/source/docprop/docprophandler.cxx:320: For now unexpected tags are ignored!
warn:legacy.osl:6699:1:oox/source/docprop/docprophandler.cxx:320: For now unexpected tags are ignored!
warn:legacy.osl:6699:1:oox/source/docprop/docprophandler.cxx:320: For now unexpected tags are ignored!
warn:legacy.osl:6699:1:oox/source/docprop/docprophandler.cxx:320: For now unexpected tags are ignored!
warn:sw.uno:6699:1:sw/source/core/unocore/unotext.cxx:2289: Exception when setting property: HorizontalBorder. Message: 
warn:sw.uno:6699:1:sw/source/core/unocore/unotext.cxx:2289: Exception when setting property: VerticalBorder. Message: 
warn:sw.uno:6699:1:sw/source/core/unocore/unotext.cxx:2289: Exception when setting property: BottomBorder. Message: 
warn:sw.uno:6699:1:sw/source/core/unocore/unotext.cxx:2289: Exception when setting property: LeftBorder. Message: 
warn:sw.uno:6699:1:sw/source/core/unocore/unotext.cxx:2289: Exception when setting property: RightBorder. Message: 
warn:sw.uno:6699:1:sw/source/core/unocore/unotext.cxx:2289: Exception when setting property: TopBorder. Message: 
warn:sw.uno:6699:1:sw/source/core/unocore/unotext.cxx:2289: Exception when setting property: IsWidthRelative. Message: relative width cannot be switched on with this property
warn:legacy.osl:6699:1:oox/source/helper/storagebase.cxx:71: StorageBase::StorageBase - missing base input stream
convert /home/joren/Downloads/004-FREE-French-Sanskrit-Tables-NoHeader-250Pages-DOCX.docx -> /home/joren/Documenten//004-FREE-French-Sanskrit-Tables-NoHeader-250Pages-DOCX.htm using HMTL
Error: Please reverify input parameters...
Exited with code '0'

Joel (another QA expert) managed to open the document, so maybe this has something to do with the Java warning (Warning: failed to launch javaldx - java may not function correctly). I'm going to rebuild now and will leave this bug as UNCONFIRMED in the meantime.

Kind regards,
Joren
Comment 2 Jorendc 2013-04-16 22:28:19 UTC
@Joel: Is it correct you managed to open the attached document correctly? I can't.
Can you give it a try when you have some time?

Thanks!
Joren
Comment 3 Urmas 2013-04-17 08:54:18 UTC
This document causes LO in Windows to terminate with "std::bad_alloc" exception:

msvcr100.dll!__CxxThrowException@8()
msvcr100.dll!operator new()
vcllo.dll!OutputDevice::GetSysFontData()
??????????
vcllo.dll!OutputDevice::GetEllipsisString()
vcllo.dll!GetLocalizedChar()
vcllo.dll!OutputDevice::ImplGlyphFallbackLayout()
vcllo.dll!OutputDevice::ImplLayout()
vcllo.dll!OutputDevice::GetTextArray()
vcllo.dll!OutputDevice::GetTextWidth()
swlo.dll!SwFmtINetFmt::PutValue()
swlo.dll!SwFmtINetFmt::PutValue()
swlo.dll!SwFmtINetFmt::PutValue()
swlo.dll!SwFmtINetFmt::PutValue()
swlo.dll!cppu::WeakImplHelper2<com::sun::star::linguistic2::XLinguServiceEventListener,com::sun::star::frame::XTerminateListener>::getImplementationId()
swlo.dll!cppu::WeakImplHelper2<com::sun::star::linguistic2::XLinguServiceEventListener,com::sun::star::frame::XTerminateListener>::operator=()
swlo.dll!SwTxtNode::GetMinMaxSize()
Comment 4 Joel Madero 2013-04-22 18:40:05 UTC
Very strange, I can open the document without problems on Linux.

I think there is enough info to mark this as NEW. As for the crash, I suggest NOT opening a separate bug quite yet, as it might be the true cause of this bug.

Because of the crash + inability to convert to HTML I am marking this as a Major bug. 

Joren - feel free to change this if you disagree. A developer may request additional info, in which case we'll investigate further.

Marking as:
New (confirmed)

Major - data loss, inability to use a feature (convert to HTML) + crasher on some platforms (Windows, perhaps OSX as well)

High - default seems appropriate
Comment 5 Michael Meeks 2013-04-27 10:09:31 UTC
I get a SEGV loading this document vs. 4.1 / master which is interesting; the root cause is:

warn:legacy.osl:8925:1:sw/source/filter/writer/wrtswtbl.cxx:567: missing row

which leads to:

==7454== Invalid read of size 4
==7454==    at 0xFF1611B: SwWriteTable::FillTableRowsCols(long, unsigned short, unsigned long, unsigned short, long, unsigned long, SwTableLines const&, SvxBrushItem const*, unsigned short, unsigned short) (wrtswtbl.cxx:700)
==7454==    by 0xFF169D2: SwWriteTable::SwWriteTable(SwTableLines const&, long, unsigned long, bool, unsigned short, unsigned short, unsigned short, unsigned long) (wrtswtbl.cxx:746)
==7454==    by 0xFEF6684: SwHTMLWrtTable::SwHTMLWrtTable(SwTableLines const&, long, unsigned long, unsigned char, unsigned short, unsigned short, unsigned short) (htmltabw.cxx:101)
==7454==    by 0xFEF7DDE: OutHTML_SwTblNode(Writer&, SwTableNode&, SwFrmFmt const*, String const*, unsigned char) (htmltabw.cxx:1150)
==7454==    by 0xFF114AB: SwHTMLWriter::Out_SwDoc(SwPaM*) (wrthtml.cxx:762)
==7454==    by 0xFF12381: SwHTMLWriter::WriteStream() (wrthtml.cxx:363)
==7454==    by 0xFF13B4A: Writer::Write(SwPaM&, SvStream&, String const*) (writer.cxx:273)
==7454==    by 0xFF13BCF: Writer::Write(SwPaM&, SfxMedium&, String const*) (writer.cxx:284)

I imagine nRow is not good there :-)
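
To illustrate the failure mode only: a "missing row" warning followed by an invalid read is what one would expect if a row lookup fails and the resulting index is used anyway. The sketch below is Python, not the C++ in wrtswtbl.cxx, and every name in it (`find_row`, `fill_cell`, `row_positions`) is hypothetical. It shows a lookup that reports a miss instead of handing back an out-of-range index.

```python
from bisect import bisect_left
from typing import List, Optional


def find_row(row_positions: List[int], pos: int) -> Optional[int]:
    """Return the index of `pos` in the sorted `row_positions`, or None if absent."""
    i = bisect_left(row_positions, pos)
    if i < len(row_positions) and row_positions[i] == pos:
        return i
    return None  # the "missing row" case: there is no valid index to use


def fill_cell(rows: List[list], row_positions: List[int], pos: int, cell: str) -> bool:
    """Append `cell` to the row ending at `pos`; skip it if no such row exists."""
    n_row = find_row(row_positions, pos)
    if n_row is None:
        return False  # do not index rows[] with a bad n_row
    rows[n_row].append(cell)
    return True
```

In C++ the unchecked variant reads past the end of the container rather than raising an error, which would match the Valgrind report above.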
Comment 6 Michael Meeks 2013-04-27 10:23:35 UTC
Created attachment 78552 [details]
minimalish test case for out of bound nRow
Comment 7 Michael Meeks 2013-04-29 16:50:13 UTC
This is down to some truly horrible design / coupling between:

SwWriteTable::CollectTableRowsCols and ::FillTableRowsCols

Which have intricate cut/paste calculation magic of a truly horrible kind, it's really just -too- awful :-)

Apparently this bug https://issues.apache.org/ooo/show_bug.cgi?id=60390 is related to the underlying problem - which is some global settings that affect how these loops iterate.
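
A toy model of this kind of coupling, in Python with hypothetical names (it is not the LibreOffice code, and the `clamp_last` setting is an invented stand-in for whatever the real settings do): one pass collects row end positions, a second pass recomputes them to look rows up. If the two passes derive the positions under different settings, the second pass asks for a position the first one never recorded.

```python
from typing import List


def row_ends(heights: List[int], total: int, clamp_last: bool) -> List[int]:
    """Cumulative bottom edge of each row; optionally snap the last row to `total`."""
    ends, pos = [], 0
    for h in heights:
        pos += h
        ends.append(pos)
    if clamp_last and ends:
        ends[-1] = total
    return ends


def missing_rows(heights: List[int], total: int,
                 collect_clamps: bool, fill_clamps: bool) -> List[int]:
    """Positions the fill pass looks up that the collect pass never recorded."""
    collected = set(row_ends(heights, total, collect_clamps))
    return [p for p in row_ends(heights, total, fill_clamps) if p not in collected]
```

With `heights=[10, 20, 30]` and `total=65`, both passes agree when they use the same setting, while a mismatch leaves position 60 without a matching row. Having both passes call one shared helper removes that class of error by construction.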
Comment 8 Michael Meeks 2013-04-29 17:08:45 UTC
Wow this particular code is nasty, poorly designed and needs re-factoring :-)
Anyhow - 'fixed' by which I mean it doesn't crash, and appears to work now in master, pushed to gerrit for 4-0.
Comment 9 Commit Notification 2013-04-29 17:12:49 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=61dc0d2c9d79fd9ce32cd9591fad4daead0ebade

fdo#62336 - fix horribly coupled table rendering code to not crash.



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 10 Commit Notification 2013-04-29 17:36:02 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=0a115d48057867c60bfcd527e90433b2dca1f28a&h=libreoffice-4-0

fdo#62336 - fix horribly coupled table rendering code to not crash.


It will be available in LibreOffice 4.0.4.

Comment 11 Commit Notification 2013-04-30 16:40:40 UTC
Michael Meeks committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=06a8ebc878ff9bcab26556d5b5a46532e232d416

fdo#62336 - unit test for conversion failure.


