Bug 149996 - A hyperlink with an anchored-to-character shape with text results in corrupt DOCX
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 4.3.0.4 release
Hardware: All All
Importance: medium normal
Assignee: Tünde Tóth
URL:
Whiteboard: target:7.6.0 target:7.5.3
Keywords: bibisected, bisected, filter:docx, regression
Duplicates: 150191
Depends on:
Blocks: DOCX-SAXParse DOCX-Corrupted DOCX-Anchor-and-Text-Wrap DOCX-Hyperlink
Reported: 2022-07-14 17:18 UTC by Mike Kaganski
Modified: 2023-03-20 07:39 UTC (History)


Attachments
A hyperlink with an anchored shape with text (10.20 KB, application/vnd.oasis.opendocument.text)
2022-07-14 17:18 UTC, Mike Kaganski

Description Mike Kaganski 2022-07-14 17:18:44 UTC
Created attachment 181266
A hyperlink with an anchored shape with text

Saving the attached document as DOCX produces a corrupt file. Opening that file fails with the error "SAXException: [word/document.xml line 2]: Opening and ending tag mismatch: p line 2 and hyperlink".

This is a regression since 4.3; version 4.2 produced a normal file.
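
One way to confirm the corruption without opening the file in Writer is to run the exported package's word/document.xml through a strict XML parser. The following is a minimal Python sketch, not part of the original report: the function name `find_xml_error` and the file name in the usage note are hypothetical, and the parser's wording will differ from the LibreOffice message quoted above.

```python
import xml.etree.ElementTree as ET
import zipfile


def find_xml_error(docx, part="word/document.xml"):
    """Return None if `part` is well-formed XML, otherwise a short error string.

    `docx` may be a file path or a binary file-like object; a DOCX file is a
    ZIP package, so the part is read with zipfile and parsed strictly.
    """
    with zipfile.ZipFile(docx) as package:
        data = package.read(part)
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        # The exception text carries the reason plus line and column.
        return f"{part}: {exc}"
    return None
```

Calling `find_xml_error("exported.docx")` on a file saved by an affected build would be expected to return an error string, and `None` for a file saved by a fixed build.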
Comment 1 Rafael Lima 2022-07-14 18:00:41 UTC
Repro with

Version: 7.3.4.2 / LibreOffice Community
Build ID: 30(Build:2)
CPU threads: 12; OS: Linux 5.15; UI render: default; VCL: kf5 (cairo+xcb)
Locale: pt-BR (pt_BR.UTF-8); UI: en-US
Ubuntu package version: 1:7.3.4-0ubuntu0.22.04.1
Calc: threaded

Converting to DOCX and opening the file will cause the following error:

File format error found at 
SAXParseException: '[word/document.xml line 2]: Opening and ending tag mismatch: p line 2 and hyperlink
 ./sax/source/fastparser/fastparser.cxx:619', Stream 'word/document.xml', Line 2, Column 2321 ./writerfilter/source/filter/WriterFilter.cxx:213(row,col).
Comment 2 Rafael Lima 2022-07-14 18:02:25 UTC
Also repro with

Version: 7.5.0.0.alpha0+ / LibreOffice Community
Build ID: 61f5c991a97de8990badfed6ef840941b5aa8c7f
CPU threads: 12; OS: Linux 5.15; UI render: default; VCL: kf5 (cairo+xcb)
Locale: pt-BR (pt_BR.UTF-8); UI: en-US
Calc: threaded
Comment 3 raal 2022-07-14 18:19:58 UTC Comment hidden (obsolete)
Comment 4 Mike Kaganski 2022-07-15 04:38:54 UTC
(In reply to raal from comment #3)

Oh, you likely bisected to the point where the error message started to appear on load. But the real problem appeared long before that: in 4.3, somewhere in early 2014 (or even in late 2013), when the saved document started to get corrupted.

Before the commit that you identified, the corruption was silently ignored, and only the part of the file before the XML error was imported. So in 4.2 the whole file is imported after export, whereas in 4.3 the second part of the hyperlink gets cut off. This is what should be checked in bibisection.

Thank you, and sorry for not clarifying that initially.
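
The difference between "error on load" and "content silently cut off" can be illustrated with a small Python sketch. This is not LibreOffice's importer, only an assumed stand-in: it reads events from an incremental parser and keeps whatever text arrived before the first XML error. The function name `text_before_first_error` and the hand-written fragment in the usage note are hypothetical and do not reproduce the exporter's actual output.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def text_before_first_error(xml_text):
    """Return (texts, error): w:t contents read before the first parse error.

    `error` is None for well-formed input, otherwise the parser's message.
    """
    parser = ET.XMLPullParser(events=("end",))
    parser.feed(xml_text)
    texts, error = [], None
    try:
        # Events queued before the error are yielded first; the error itself
        # is raised only when iteration reaches it.
        for _event, elem in parser.read_events():
            if elem.tag == f"{{{W_NS}}}t" and elem.text:
                texts.append(elem.text)
    except ET.ParseError as exc:
        error = str(exc)
    return texts, error
```

For a fragment such as `<w:p xmlns:w="..."><w:hyperlink><w:r><w:t>first</w:t></w:r></w:p>...`, where the paragraph is closed while the hyperlink is still open, only the text before the mismatched tag is collected. A lenient reader of this kind shows a truncated hyperlink, while a strict one reports the mismatch.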
Comment 5 Mike Kaganski 2022-07-19 09:12:49 UTC
Regression after commit 20a3792502120d67b1a9fdea641e15ea504359d3
  Author Pallavi Jadhav <pallavi.jadhav@synerzip.com>
  Date   Wed Mar 19 16:29:42 2014 +0530
    fdo#76316 : File gets corrupt after Roundtrip
Comment 6 Mike Kaganski 2022-07-29 07:01:58 UTC
*** Bug 150191 has been marked as a duplicate of this bug. ***
Comment 7 Commit Notification 2023-03-10 11:47:54 UTC
Tünde Tóth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/05df784c6febd1d77e95db8ef3dfdc03347a48a3

tdf#149996 DOCX export: fix hyperlinks in nested paragraphs

It will be available in 7.6.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 8 Commit Notification 2023-03-13 10:24:39 UTC
Tünde Tóth committed a patch related to this issue.
It has been pushed to "libreoffice-7-5":

https://git.libreoffice.org/core/commit/619132f022b7a71938069d7c282aaf8b578287e5

tdf#149996 DOCX export: fix hyperlinks in nested paragraphs

It will be available in 7.5.3.

Comment 9 NISZ LibreOffice Team 2023-03-20 07:39:39 UTC
VERIFIED IN:
Version: 7.6.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: b5c3a7502f7ff6ccf0f829c1f3a2ba50b8584c41
CPU threads: 8; OS: Windows 10.0 Build 19044; UI render: Skia/Vulkan; VCL: win
Locale: hu-HU (hu_HU); UI: hu-HU
Calc: CL threaded