Bug 53175 - FILEOPEN error reading “Content (TOC)” from a DOCX
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): unspecified
Hardware: Other All
Importance: medium critical
Assignee: Not Assigned
URL:
Whiteboard: target:3.7.0 target:3.6.1 target:3.5.7
Keywords:
Depends on:
Blocks: mab3.5
Reported: 2012-08-06 17:19 UTC by ape
Modified: 2017-08-24 23:40 UTC

See Also:
Crash report or crash signature:


Attachments
Bad DOCX file (50.53 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2012-08-06 17:19 UTC, ape
Figure as example (20.40 KB, image/png)
2012-08-06 20:50 UTC, ape
bt on 3.5 (16.87 KB, text/plain)
2012-08-11 13:45 UTC, Julien Nabet

Description ape 2012-08-06 17:19:48 UTC
Created attachment 65189 [details]
Bad DOCX file

LibreOffice:
Version 3.7.0.0.alpha0+ (Build ID: 76ccb5f)
Version 3.6.1.0+ (Build ID: eb42588)
Version 3.5.7.0+ - I think
---
Description:
a) Open the file "DOC-export footnote Error.odt" (see attachment 57017 [details]) and save it as an Office Open XML text document (Microsoft Office 2007/2010), "DOC-export footnote Error.docx".
b) Open the DOCX file in Microsoft Word: everything is correct.
c) Open "DOC-export footnote Error.docx" in LibO Writer and then close it. LibO shows the message: The document "..." has been modified. Do you want to save your changes? > Press the Save button. (The resulting file, named "DOCX_import_TOC_export.docx", is in the attachment.)
d) Open the saved file in Microsoft Office. WinWord says: "Cannot open file due to errors of its contents. Details: Unspecified error. Location: Part:/word/document.xml, Line: 2, Column: 6136."
--
@Cedric:
It seems to me that this is the symmetric error, in reading the “Content (TOC)” from a DOCX file, which remained after the fix for bug 52610.
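
The location WinWord reports in step d) can be inspected by hand, since a DOCX file is a ZIP archive. The following is only a minimal sketch: the function name `context_at` and its `radius` parameter are made up for illustration, and it assumes the reported line and column are 1-based character positions in the part, which may not match how WinWord actually counts.

```python
import zipfile


def context_at(docx, part="word/document.xml", line=2, column=6136, radius=200):
    """Return the text surrounding a (line, column) position in a DOCX part.

    `docx` may be a file path or a binary file-like object.
    Assumption: line and column are 1-based and counted in characters.
    """
    with zipfile.ZipFile(docx) as archive:
        text = archive.read(part).decode("utf-8")
    target = text.split("\n")[line - 1]
    index = column - 1
    start = max(0, index - radius)
    return target[start:index + radius]
```

For the file in this report one might call `context_at("DOCX_import_TOC_export.docx")` and look at how the `w:hyperlink` and `w:fldChar` elements are nested around that position.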
Comment 1 ape 2012-08-06 20:50:58 UTC
Created attachment 65209 [details]
Figure as example

Fragments of the “Part:/word/document.xml” files are shown in the figure (see attachment).
On the left: Part = DOC-export footnote Error.docx (import ODT) [OK]
On the right: Part = DOCX_import_TOC_export.docx (export DOCX) [Error]
It seems to me that the import filter tries to save the TOC with hyperlinks (so that the Content would work), indicating the section (heading) numbers instead of page numbers. This causes an error. (The “Hand” cursor is shown on the right in the figure.)
Now (after fixing bug 52610) the export filter “ODT_to_OOXML” creates a TOC which is not working in WinWord, but retains the numbering. That is right, in my opinion. (The "Arrow" cursor is shown on the left in the picture.)
--
P.S. I don't know of a program that today will be able to solve this TOC problem.
Comment 2 Petr Mladek 2012-08-07 08:58:31 UTC
LO-3.5.6 even crashes when it tries to read the document after the second save.

Steps to reproduce (build from libreoffice-3-5 branch):

1. Open "DOC-export footnote Error.odt" (attachment 57017 [details]) in LO
2. Save as 1.docx
3. Open 1.docx in LO
4. Save as 2.docx
5. Open 2.docx in LO

Result: crash
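
The steps above might be scripted roughly as follows. This is a sketch under assumptions, not a verified reproducer: it assumes a `soffice` binary on the PATH that supports `--headless --convert-to`, and a headless conversion may not take exactly the same code path as opening and saving in the GUI. The helper names and the `passN` directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def roundtrip_commands(source, workdir, cycles=3):
    """Build one soffice command per conversion pass.

    Each pass loads the output of the previous one and writes a DOCX
    into its own directory, so the input file is never overwritten.
    """
    commands = []
    current = Path(source)
    for i in range(1, cycles + 1):
        outdir = Path(workdir) / f"pass{i}"
        commands.append([
            "soffice", "--headless",
            "--convert-to", "docx",
            "--outdir", str(outdir),
            str(current),
        ])
        # soffice names the output after the input file's stem.
        current = outdir / (Path(source).stem + ".docx")
    return commands


def run_roundtrip(source, workdir, cycles=3):
    """Run the passes in order; raises CalledProcessError on a non-zero exit."""
    for command in roundtrip_commands(source, workdir, cycles):
        subprocess.run(command, check=True)
```

With the default of three passes, the third command loads the second DOCX, which corresponds to step 5.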
Comment 3 ape 2012-08-07 16:04:39 UTC
An interesting feature of each new "Save - Open DOCX" cycle:
a) One more “Tab” space is added before the first word of each footnote.
b) A new empty paragraph (a hyperlink) appears in the TOC. It seems to me that this causes the read error.
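
Observation b) could be checked by counting hyperlinks without any visible text in word/document.xml after each cycle. A minimal sketch, assuming an "empty" hyperlink is a `w:hyperlink` element whose `w:t` runs contain only whitespace; the function name is made up for illustration:

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_empty_hyperlinks(document_xml):
    """Count w:hyperlink elements that carry no visible text."""
    root = ET.fromstring(document_xml)
    empty = 0
    for link in root.iter(f"{{{W}}}hyperlink"):
        text = "".join(t.text or "" for t in link.iter(f"{{{W}}}t"))
        if not text.strip():
            empty += 1
    return empty
```

If the count grows by one with every "Save - Open DOCX" cycle, that would be consistent with the behaviour described above.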
Comment 4 ape 2012-08-08 03:05:07 UTC
(In reply to comment #1)
> Now (after fixing bug 52610) export filter “ODT_to_OOXML” creates TOC, which is
> not working in WinWord...

Sorry, that was wrong:
 The 1st DOCX file has a Content (TOC) that works in WinWord-2003sp3 (with File Format Converter SP3).
 --
P.S. I'll check WinWord-XP (2002) and WinWord-2007 later.
Comment 5 ape 2012-08-08 05:16:48 UTC
(In reply to comment #4)
> (In reply to comment #1)
> > Now (after fixing bug 52610) export filter “ODT_to_OOXML” creates TOC, which is not working in WinWord...
> 
> Sorry, it's wrong:
>  1st DOCX file has the content that runs in WinWord-2003sp3 (with File Format
> Converter SP3).
>  --
> P.S. I'll check WinWord-XP (2002) and WinWord-2007 later.

The 1st DOCX file's Content (TOC) works in WinWord-2007sp3.
Comment 6 Julien Nabet 2012-08-11 13:45:21 UTC
Created attachment 65428 [details]
bt on 3.5

(In reply to comment #2)
On a Debian x86-64 PC, with 3.5 sources updated 2 days ago, I reproduced the crash by following your steps.
I attached the bt
I noticed too that first page of docx1 was wrong (schema above the TOC) ; haven't checked other pages.
Comment 7 Not Assigned 2012-08-20 03:42:31 UTC
Cedric Bosdonnat committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=82c46f1877c65042ac976312267c14bf0e5847f4

fdo#53175: docx export, close hyperlinks before fields
Comment 8 Cédric Bosdonnat 2012-08-20 05:24:46 UTC
(In reply to comment #0)
> d) Open the saved file by Microsoft Office. WinWord says: "Cannot open file due
> to errors of its contents. Details: Unspecified error. Location:
> Part:/word/document.xml, Line: 2, Column: 6136.

Fixed in master now: the export forgot to end the hyperlink before the end of the fields.
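
One way to look for this kind of output in an exported file is sketched below. It is a heuristic and an assumption about what the broken markup looks like, not a description of the actual patch: it flags any `w:hyperlink` that contains a field "end" mark without the matching "begin", which is what one would expect if the hyperlink were still open when the field was closed. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_unbalanced_hyperlinks(document_xml):
    """Count w:hyperlink elements whose field begin/end marks do not pair up."""
    root = ET.fromstring(document_xml)
    unbalanced = 0
    for link in root.iter(f"{{{W}}}hyperlink"):
        depth = 0
        for mark in link.iter(f"{{{W}}}fldChar"):
            kind = mark.get(f"{{{W}}}fldCharType")
            if kind == "begin":
                depth += 1
            elif kind == "end":
                depth -= 1
                if depth < 0:
                    # A field ends here that was started outside the link.
                    break
        if depth != 0:
            unbalanced += 1
    return unbalanced
```

A result of zero does not prove that WinWord will accept the file; it only rules out this one pattern.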

(In reply to comment #2)
> LO-3.5.6 even crashes when it tries to read the document after the second save.
> 
> Steps to reproduce (build from libreoffice-3-5 branch):
> 
> 1. Open "DOC-export footnote Error.odt" (attachment 57017 [details]) in LO
> 2. Save as 1.docx
> 3. Open 1.docx in LO
> 4. Save as 2.docx
> 5. Open 2.docx in LO
> 
> Result: crash

That crash is not related to the fix I produced: it can be reproduced with the steps you indicated only with 3.5 (not including the fix). I would consider that as another bug.

(In reply to comment #6)
> I attached the bt
> I noticed too that first page of docx1 was wrong (schema above the TOC) ;
> haven't checked other pages.

Thanks for the backtrace. However the schema problem is fixed in master and is likely to be pretty annoying to backport
Comment 9 ape 2012-08-20 06:55:01 UTC
(In reply to comment #7)
> Cedric Bosdonnat committed a patch related to this issue.
> It has been pushed to "master":
> 
> http://cgit.freedesktop.org/libreoffice/core/commit/?id=82c46f1877c65042ac976312267c14bf0e5847f4
> 
> fdo#53175: docx export, close hyperlinks before fields

Cedric, thank you.
 LibO-3.6.1.1 needs your patch too. That release contains the previous TOC fix (bug 52610). I hope the problem will be closed in LibO-3.6.1rc2.
--
ape
Comment 10 Cédric Bosdonnat 2012-08-20 09:35:37 UTC
I had to revert the fix as it introduces some other problems... will need to find a better fix.
Comment 11 Not Assigned 2012-08-20 09:37:00 UTC
Cedric Bosdonnat committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=b95d203bc17c83ec0fe5139f519d53ed1d842d3a

fdo#53175: Don't load the default values of the styles in writerfilter
Comment 12 Not Assigned 2012-08-20 09:37:20 UTC
Cedric Bosdonnat committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=516c4f2a7f0af06f7e6301128df9885599128291

Revert "fdo#53175: docx export, close hyperlinks before fields"
Comment 13 Not Assigned 2012-08-21 05:14:11 UTC
Cedric Bosdonnat committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=c1c2688912e769dfd7654e11e87dae380a8ce1eb

fdo#53175: Fixed the end of hyperlinks
Comment 14 Cédric Bosdonnat 2012-08-21 05:20:09 UTC
This time it is OK. Note that the import of the Title style has also been fixed.
Comment 15 Not Assigned 2012-08-21 07:32:49 UTC
Cedric Bosdonnat committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=f672ba611a8cef83acad3233d27dc6414b463a42&g=libreoffice-3-6

fdo#53175: Fixed the end of hyperlinks


It will be available in LibreOffice 3.6.2.
Comment 16 Not Assigned 2012-08-21 13:37:54 UTC
Cedric Bosdonnat committed a patch related to this issue.
It has been pushed to "libreoffice-3-6-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=636e8f07d17f33e3b67137be805f8a0fdf3e0299&g=libreoffice-3-6-1

fdo#53175: Fixed the end of hyperlinks


It will be available already in LibreOffice 3.6.1.