Downloaded a document from a Dutch government site: /home/cono/Documenten/DATA/Nou&Off/Projecten/VluchtelingenWerk Nederland/MigratieLibreOffice/TestDocumenten/port_'Onafhankelijke_casemanager_in_de_vreemdelingenketen._Perspectieven_vanuit_het_buitenland'.docx
Opening it in 5.0.0beta3 gives:
File format error found at SAXParseException: '[word/document.xml line2]: unknown error', Stream 'word/document.xml', Line 2, Column 21949(row,col)
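To see what the parser was reading when it reported "Line 2, Column 21949", one could print the text around that position in word/document.xml. The following is only a minimal sketch: the helper name `context_at` and its `radius` parameter are made up for illustration, and it assumes the reported line and column are 1-based character positions in the UTF-8 decoded stream, which may not match how the parser counts.

```python
import zipfile


def context_at(docx_path, part, line, column, radius=60):
    """Return the text around a (line, column) position inside a package part.

    Assumption: line and column are 1-based and count characters of the
    UTF-8 decoded part; the real parser may count differently.
    """
    with zipfile.ZipFile(docx_path) as package:
        text = package.read(part).decode("utf-8")
    # Split on "\n" only, so unusual separators are not treated as line breaks.
    target = text.split("\n")[line - 1]
    index = column - 1
    return target[max(0, index - radius):index + radius]
```

For the error above, the call would be something like `print(context_at("test.docx", "word/document.xml", 2, 21949))`, where "test.docx" stands for the local copy of the document.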
File opens fine in 4.4.4.2 (I had a somewhat similar problem with another file in beta1 or so, but that now opens fine in beta3.)
Could you provide the link so we can download the doc?
(In reply to Julien Nabet from comment #2)
> Could you provide the link so we can download the doc?
Sure (I posted the wrong link, sorry):
http://www.tweedekamer.nl/downloads/document?id=e97ba5c2-0482-43e6-bdb2-0455cb6c5d94&title=Reactie%20op%20het%20rapport%20%27Onafhankelijke%20casemanager%20in%20de%20vreemdelingenketen.%20Perspectieven%20vanuit%20het%20buitenland%27.docx
No problem opening with v5.0.0.0 b3 under ubuntu 14.04 x64 and mint 17.1 x64.
On a Debian x86-64 PC with master sources updated today, I can't reproduce this. I also tried with LO Debian package 4.4.4.1. Cor: did you try with a new LO profile? Have you got accessibility enabled? If yes, could you disable it and give it another try?
Created attachment 116723: better test file
Hmm, something strange with the link and the file that I have. So I attached the file that does give the problem for me (also in 510master with a clean user profile).
Confirmed with v5.0.0.0 b3 under mint 17.1 x64. That last file doesn't open correctly.
I can see that it opens OK in
Version: 4.4.0.0.beta1 Build ID: 9af3d21234aa89dac653c0bd76648188cdeb683e Locale: nl_NL
and bad in
Version: 4.4.0.0.alpha2 Build ID: 24f0a5815f581dd9a7f09d30213a379edee6e9ac
Bibisecting, however, is not possible on 32-bit Ubuntu, so I rely on someone else to do that.
(In reply to Cor Nouws from comment #8)
> I can see that it opens OK in Version: 4.4.0.0.beta1
Ignore! Wrong issue for that comment!
Hello all, I can confirm it with LO Version: 5.0.0.3 Build-ID: f79b5ba13f5e6cbad23f8038060e556217e66632 Locale: de-DE (de_DE.UTF-8) (installed in parallel, following the instructions from https://wiki.documentfoundation.org/Installing_in_parallel/Linux, with the German langpack and helppack installed) under Debian Testing AMD64 ... :( But I also found #89100 ... Could it be that this bug is a duplicate? HTH Thomas.
Hi Thomas,
(In reply to thackert from comment #10)
> ... :( But I found also #89100 ... Could it be, that this bug is a duplicate?
It could be, but there may be several problems that result in the same error message for the user, so it must be checked by the developers. Thanks for mentioning the issue!
bisect result (using the "bibisect-50max" repository):
8b400e2c6b64ea88b911187a21de7090ee49f305 is the first bad commit
commit 8b400e2c6b64ea88b911187a21de7090ee49f305
Author: Matthew Francis <mjay.francis@gmail.com>
Date: Wed May 27 18:16:49 2015 +0800
    source-hash-ebf767eeb2a169ba533e1b2ffccf16f41d95df35
    commit ebf767eeb2a169ba533e1b2ffccf16f41d95df35
    Author: Michael Stahl <mstahl@redhat.com>
    AuthorDate: Thu Jan 22 12:50:07 2015 +0100
    Commit: Michael Stahl <mstahl@redhat.com>
    CommitDate: Thu Jan 22 13:58:10 2015 +0100
        writerfilter: DOCX import: better error handling than "catch (...) {}"
        If there is a SAXParseException, OOXMLDocumentImpl::resolve() should not
        ignore it, because if it occurs in a substream some end tag handlers
        may not have been run and the DomainMapper may be in an inconsistent
        state, so continuing to parse the outer document is probably not a
        good idea.
        Also add some exception mangling so sfx2 can present a useful error
        dialog.
        Change-Id: I169ba6db25f2ae264af08a64edf76a6bf6757f85
:040000 040000 304d902d5bb07301189acae3bc2d1840d5ef1663 e47dc06f6dbb1f265fc8835f7eb9ac016f2afcc1 M opt
---
$ git bisect log
# bad: [dda106fd616b7c0b8dc2370f6f1184501b01a49e] source-hash-0db96caf0fcce09b87621c11b584a6d81cc7df86
# good: [5b9dd620df316345477f0b6e6c9ed8ada7b6c091] source-hash-2851ce5afd0f37764cbbc2c2a9a63c7adc844311
git bisect start 'master' 'oldest'
# bad: [0c30a2c797b249d0cd804cb71554946e2276b557] source-hash-45aaec8206182c16025cbcb20651ddbdf558b95d
git bisect bad 0c30a2c797b249d0cd804cb71554946e2276b557
# good: [770ff0d1a74d2450c2decb349b62c5087e12c46b] source-hash-549b7fad48bb9ddcba7dfa92daea6ce917853a03
git bisect good 770ff0d1a74d2450c2decb349b62c5087e12c46b
# bad: [259e888083cf7697956bb7e5f2691e8153eadb4c] source-hash-1884c0bbd40f0ded41d7a1656cb64fb1f6368c36
git bisect bad 259e888083cf7697956bb7e5f2691e8153eadb4c
# good: [ee7c82541a2e99f76af570d3faa897504149913a] source-hash-54defd1bd3359c95e45891c7294847d0cebca753
git bisect good ee7c82541a2e99f76af570d3faa897504149913a
# bad: [504f60cf9ee84da75d4c15a62dedb18976129c14] source-hash-c8af68bc5adf093f9df803f6fe0147ac9d116169
git bisect bad 504f60cf9ee84da75d4c15a62dedb18976129c14
# good: [00c3cacafec11fdfbdf7f0c8c279503cd109d8a0] source-hash-f21114332bf670ab7f8e9b0a7f4d83d436d8fd9e
git bisect good 00c3cacafec11fdfbdf7f0c8c279503cd109d8a0
# bad: [5e1da738abc9f023f0c7bafcffc10d899b57a95b] source-hash-ef296e87b8afa1afdc08a23675658e0252dd2b86
git bisect bad 5e1da738abc9f023f0c7bafcffc10d899b57a95b
# good: [bdf9a49d5f818c69487628f49c13bed9bb2bc947] source-hash-df8c7d1c4e9d878797398fa5fd94477b04c2cc00
git bisect good bdf9a49d5f818c69487628f49c13bed9bb2bc947
# good: [7e264ef7d7c3096e9b779e5160c59419b53b138d] source-hash-f0d6e0e1e21afd0adf5bd01d771b2d83d8f13a48
git bisect good 7e264ef7d7c3096e9b779e5160c59419b53b138d
# bad: [3aa029ed7028303a0d5ebc84c697840c54c8df41] source-hash-134b523c425613848a2068f917c20a7a67fa0577
git bisect bad 3aa029ed7028303a0d5ebc84c697840c54c8df41
# bad: [4e106cf62e8a97370022bde02efcf044e1ed2c30] source-hash-c0c1b01a32b91984d61f2d0b9146719fcaed7e09
git bisect bad 4e106cf62e8a97370022bde02efcf044e1ed2c30
# good: [4e454b281b3cea9be43fceaa4c201f36a6a3d1be] source-hash-825e4995220209362c13ed5f07c98e43a5f456de
git bisect good 4e454b281b3cea9be43fceaa4c201f36a6a3d1be
# bad: [8b400e2c6b64ea88b911187a21de7090ee49f305] source-hash-ebf767eeb2a169ba533e1b2ffccf16f41d95df35
git bisect bad 8b400e2c6b64ea88b911187a21de7090ee49f305
# first bad commit: [8b400e2c6b64ea88b911187a21de7090ee49f305] source-hash-ebf767eeb2a169ba533e1b2ffccf16f41d95df35
On master, the behaviour is a bit different. When I open the file with the latest version in the "lo-linux-dbgutil-daily" bibisect repository (source-hash-2d9db406d301d722649ca539cacad823b89191ca), LibreOffice closes with the following assertion error when trying to open the file: soffice.bin: /home/vmiklos/git/libreoffice/master/sw/source/core/bastyp/index.cxx:226: virtual SwIndexReg::~SwIndexReg(): Assertion `!m_pFirst && !m_pLast && "There are still indices registered"' failed.
Looks like a duplicate of the annoying bug 89100. Pity about the 2 bibisects.
*** This bug has been marked as a duplicate of bug 89100 ***
Created attachment 119870: patch to ignore zero size for a graphic in docx
Bug 89100 is about the uncovering of previously hidden errors, so there are likely several errors that are now visible. I've investigated the problem for this document: it has a graphic with size 0, 0. (It contains the XML <a:graphic...<a:xfrm><a:off x="0" y="0"/><a:ext cx="0" cy="0"/></a:xfrm>). This fails the test in SwFormatFrmSize::PutValue (MID_FRMSIZE_SIZE), which makes SfxItemPropertySet::setPropertyValue throw an IllegalArgumentException, aborting the parser. The attached patch skips setting the graphic size if the size is 0, 0. With this patch applied I can open the document.
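To check whether another failing .docx contains the same kind of zero-sized graphic, a small script can scan the package for <a:ext cx="0" cy="0"/> elements. This is a rough sketch, not part of the attached patch: the function name `find_zero_size_extents` is made up for illustration, and it assumes the "a:" prefix is bound to the usual DrawingML main namespace.

```python
import zipfile
import xml.etree.ElementTree as ET

# DrawingML main namespace, which the "a:" prefix is normally bound to.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def find_zero_size_extents(docx_path):
    """Return (part_name, count) pairs for XML parts with <a:ext cx="0" cy="0"/>."""
    hits = []
    with zipfile.ZipFile(docx_path) as package:
        for name in package.namelist():
            if not name.endswith(".xml"):
                continue
            try:
                root = ET.fromstring(package.read(name))
            except ET.ParseError:
                continue  # skip parts that are not well-formed XML
            # Count extents where both dimensions are the literal string "0".
            count = sum(
                1
                for ext in root.iter(f"{{{A_NS}}}ext")
                if ext.get("cx") == "0" and ext.get("cy") == "0"
            )
            if count:
                hits.append((name, count))
    return hits
```

Calling `find_zero_size_extents("test.docx")` on a local copy of the attachment ("test.docx" is a placeholder name) would list the parts that contain such elements, if any.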
(In reply to libreoffice from comment #15)
> ...
> The attached patch skips setting the graphic size if the size is 0, 0. With
> this patch applied I can open the document.
Perhaps you may be interested in contributing directly to LO? (see https://wiki.documentfoundation.org/Development/gerrit)
Migrating Whiteboard tags to Keywords: (bibisected, filter:docx) [NinjaEdit]
This problem is not fixed IMO. In 5.1.0rc1 and a recent 5.2.0 daily, it either opens in Draw or (if the filter is set explicitly to MS Word 2007-2013 XML) gives a general I/O error.
(In reply to libreoffice from comment #15)
> The attached patch skips setting the graphic size if the size is 0, 0. With
> this patch applied I can open the document.
Hi arbruin,
It is more common to send patches directly to gerrit. See the link in comment #16.
Thanks for looking into this!
Cor
*** Bug 96905 has been marked as a duplicate of this bug. ***
(In reply to libreoffice from comment #15)
> I've investigated the problem for this document, and it's that the document
> has a graphic with size 0, 0. (It contains the xml
> <a:graphic...<a:xfrm><a:off x="0" y="0"/><a:ext cx="0" cy="0"/></a:xfrm>).
> This fails the test in SwFormatFrmSize::PutValue (MID_FRMSIZE_SIZE) which
> makes SfxItemPropertySet::setPropertyValue throw an
> IllegalArgumentException, aborting the parser.
>
> The attached patch skips setting the graphic size if the size is 0, 0. With
> this patch applied I can open the document.
The fix for bug 95775 already relaxed the above-mentioned test in SwFormatFrmSize::PutValue (it now only checks that at least one of the two values, x or y, is not 0). That relaxation is apparently not enough for this particular issue, but I believe that the proper fix here is not to avoid setting the size, as in the patch attached to comment 15, but to remove the check from SwFormatFrmSize::PutValue completely. libreoffice@arbruijn.dds.nl, please go ahead and post an improved patch to gerrit. If you cannot do it, I'll prepare a patch around Jan 15th, with proper credit to you. Thank you for your work!
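To illustrate why the relaxed test still rejects this document, here is a toy model of the three variants of the size check. It is not the LibreOffice C++ code: the function names are invented, and the "strict" variant is only an assumption about how the test behaved before the fix for bug 95775.

```python
def strict_check(width, height):
    # Assumed behaviour before the bug 95775 fix: both dimensions must be non-zero.
    return width != 0 and height != 0


def relaxed_check(width, height):
    # Behaviour described in the comment above: at least one dimension is non-zero.
    return width != 0 or height != 0


def no_check(width, height):
    # Proposed fix: the check is removed, so every size is accepted.
    return True


# Compare the three variants on a normal size, a one-dimensional size and 0 x 0.
for size in [(100, 50), (0, 50), (0, 0)]:
    print(size, strict_check(*size), relaxed_check(*size), no_check(*size))
```

In this model only `no_check` accepts the 0 x 0 size found in the attached document, which is the case the relaxation leaves uncovered.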
By the way: why would a government want to insert a zero-sized image into a document? Of course, it's possible that it's intended to be shown on a programmatic event (say, using VBA), but it could also be used for tracking purposes...
(In reply to Mike Kaganski from comment #22)
> By the way: why would a gvnmt want to insert a zero-sized image into a
> document?
> Of course, it's possible that it's intended to be shown on a programmatic
> event (say, using vba), but it also could be used for tracking purposes...
I am aware of several departments of our government using document generating software that allows users to fill in paragraphs of text and metadata such as addresses in a web application, which then generates a document that is almost, but not quite, completely unlike a proper OOXML document. That way, users cannot accidentally modify the government's chosen styling. This may be one of those cases. It is possible to create a document that is well-formed XML, opens fine in Word, and is invalid or nonsensical OOXML, all at the same time.
That is, the cause may be incompetence rather than malice, although it would be quite interesting to actually find a government created document that does phone home with tracking information!
Duplicate bug 96905 contains an attached document that fails in (probably) the same way. It may be interesting to examine that one as well.
Posted a patch to gerrit: https://gerrit.libreoffice.org/21287
Mike Kaganski committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=654f6ff28d7a148950b48ed8905d8f13a015a5b5 tdf#92157: allow both dimensions of a graphic to be 0 It will be available in 5.2.0. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
Whilst I've committed the patch (thanks to whoever posted this, libreoffice@arbruijn.dds.nl, and Mike of course), there are still some unresolved questions, especially around potential tracking! I've posted on the dev mailing list asking for feedback on whether there is anything further we should be doing.
Hi Jeroen,
(In reply to Jeroen Hoek from comment #23)
> [...] which then generates a
> document that is almost, but not quite, completely unlike a proper OOXML
> document. [...] This may be one of those cases.
Do I read that right: "completely unlike a proper..."?
> That is, the cause may be incompetence rather than malice, although it would
> be quite interesting to actually find a government created document that
> does phone home with tracking information!
But I guess the 0x0 graphic must be seen as unrelated?
In 5.2 this opens now. In 5.0.x and 5.1.x it does not. @Mike, * Can 654f6ff28d7a148950b48ed8905d8f13a015a5b5 be backported please?
Mike Kaganski committed a patch related to this issue. It has been pushed to "libreoffice-5-1": http://cgit.freedesktop.org/libreoffice/core/commit/?id=88dc41490189a6ccc218c633c6385d4e99af0216&h=libreoffice-5-1 tdf#92157: allow both dimensions of a graphic to be 0 It will be available in 5.1.6. The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.