Bug 47669 - Bad processing hyperlink with anchor in DOCX
Summary: Bad processing hyperlink with anchor in DOCX
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 3.5.1 release
Hardware: x86-64 (AMD64) Windows (All)
Importance: high major
Assignee: Korrawit Pruegsanusak
URL:
Whiteboard: bibisected35 bibisected35older target...
Keywords: regression
Duplicates: 54214
Depends on:
Blocks: DOCX Hyperlink
Reported: 2012-03-21 11:07 UTC by slavista
Modified: 2017-05-14 04:28 UTC (History)

Attachments
utf8 html text for testing, steps to reproduce included (1.02 KB, text/html)
2012-03-21 11:07 UTC, slavista
File created when following the steps to reproduce. (3.79 KB, application/vnd.ms-word.document.12)
2012-03-23 05:01 UTC, s-joyemusequna

Description slavista 2012-03-21 11:07:31 UTC
Created attachment 58827
utf8 html text for testing, steps to reproduce included

Steps to reproduce:
- open attached file in browser (utf8 html/text)
- copy text and paste it into new document in Writer
- save file as DOCX
- close file
- open saved file
- everything after the text "půjčk" is now missing, although it is still present in the saved DOCX

Tested on:
LibreOffice 3.5.1.2 / Build ID: dc9775d-05ecbee-0851ad3-1586698-727bf66 - WRONG
LibreOffice 3.4.5 / OOO340m1 (Build:502) - OK

I don't have more versions for testing.

Windows Vista 64bit SP2
Comment 1 s-joyemusequna 2012-03-23 04:53:01 UTC
Confirmed with LibO 3.5.2 RC1.

Everything after the text "půjčk" is missing when the file is opened with LibO. Word 2007 cannot open the file at all. Its error message, in my translation: "End tag of the element doesn't match the start tag.
Position / component: word/document.xml, line: 2, column: 919"

The text is still present in the file when I open it with 7-Zip.

Works fine with LibO 3.3.4 => REGRESSION
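
Word's message above describes an end tag that does not match the start tag. As a rough illustration only, the check below shows how such a position could be located in the text of word/document.xml once it has been extracted from the archive. `firstMismatchedEndTag` is a hypothetical helper, not part of LibreOffice or Word, and it is deliberately simplistic: it assumes no comments, no CDATA sections, and no '>' inside attribute values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the offset of the first end tag that does not match the innermost
// open element, or std::string::npos if no such end tag is found.
std::size_t firstMismatchedEndTag(const std::string& xml) {
    std::vector<std::string> open;   // names of currently open elements
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        std::size_t close = xml.find('>', pos);
        if (close == std::string::npos) break;          // truncated tag: stop
        assert(close > pos);
        std::string tag = xml.substr(pos + 1, close - pos - 1);
        if (tag.empty() || tag[0] == '?' || tag[0] == '!') {
            // XML declaration, processing instruction or doctype: ignore
        } else if (tag[0] == '/') {
            // End tag: must match the innermost open element.
            if (open.empty() || open.back() != tag.substr(1)) return pos;
            open.pop_back();
        } else if (tag.back() != '/') {
            // Start tag (not self-closing): keep the name without attributes.
            open.push_back(tag.substr(0, tag.find_first_of(" \t\r\n")));
        }
        pos = close + 1;
    }
    return std::string::npos;
}
```

Called on a fragment such as `<w:p><w:r>x</w:r></w:hyperlink></w:p>`, it returns the offset of the stray `</w:hyperlink>`.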
Comment 2 s-joyemusequna 2012-03-23 05:01:04 UTC
Created attachment 58918
File created when following the steps to reproduce.

This file can't be opened with Word 2007 and is cut off when opened with LibreOffice (see Comment 1)
Comment 3 Korrawit Pruegsanusak 2012-05-13 01:53:57 UTC
The regression already appears in the oldest version in bibisect-3.5.tar.lzma, so it must have been introduced earlier.
Comment 4 Ben Konrath 2012-06-26 06:32:11 UTC
I have this problem with LibreOffice Writer from Ubuntu 12.04 as well. 

Version from Help -> About:

LibreOffice 3.5.3.2 
Build ID: 350m1(Build:2)

Version from Ubuntu package:

libreoffice-writer 1:3.5.3-0ubuntu1
Comment 5 Korrawit Pruegsanusak 2012-10-05 14:22:23 UTC
I have started to look into this, but if I can't complete it before Monday (UTC+7), I'll re-assign it back to the list.

The problem is that LibO creates an extra closing element </w:hyperlink> that is unbalanced (i.e., it has no opening tag). A file created in Word 2007 does not have this element.

The code mapped to this element is XML_hyperlink, which appears in only three places in sw/source/filter/ww8/docxattributeoutput.cxx; the only one invoked in this case is in function DocxAttributeOutput::EndRun(), lines 536-540:

  if ( m_closeHyperlinkInPreviousRun )
  {
      m_pSerializer->endElementNS( XML_w, XML_hyperlink );
      m_closeHyperlinkInPreviousRun = false;
  }

So we write only an end tag, with no matching start tag.

Next, m_closeHyperlinkInPreviousRun is true the third time EndRun() is invoked (it is false the first and second times).
Comment 6 Korrawit Pruegsanusak 2012-10-05 14:33:43 UTC
(In reply to comment #5) 
> Next, m_closeHyperlinkInPreviousRun is true in the third time it is invoked

which is called by:

  AttrOutput().EndRun();

from

(gdb) p m_closeHyperlinkInPreviousRun
$3 = true
(gdb) frame 1
#1  0x00002aaacda115ee in MSWordExportBase::OutputTextNode (this=
    0x7fffffff3ed0, rNode=...)
    at /home/korrawit/core/sw/source/filter/ww8/wrtw8nds.cxx:2044


Searching OpenGrok for m_closeHyperlinkInPreviousRun <http://opengrok.libreoffice.org/search?q=m_closeHyperlinkInPreviousRun&project=core&defs=&refs=&path=&hist=>, I found only one place that sets m_closeHyperlinkInPreviousRun to true: sw/source/filter/ww8/docxattributeoutput.cxx#1062, in function DocxAttributeOutput::RunText(), where it depends on m_closeHyperlinkInThisRun.

  if( m_closeHyperlinkInThisRun )
  {
      m_closeHyperlinkInPreviousRun = true;
      m_closeHyperlinkInThisRun = false;
  }
Comment 7 Korrawit Pruegsanusak 2012-10-05 15:00:03 UTC
Likewise, m_closeHyperlinkInThisRun is set to true only at docxattributeoutput.cxx line 1253, in function DocxAttributeOutput::EndURL().

EndURL() is called from:

(gdb) frame 1
#1  0x00002aaacd2d2cb2 in SwWW8AttrIter::OutAttrWithRange (this=
    0x7fffffff3630, nPos=86)
    at /home/korrawit/core/sw/source/filter/ww8/wrtw8nds.cxx:1155
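
The flag handoff traced in comments 5 to 7 can be modeled in isolation. This is only a minimal sketch, not the LibreOffice code: `ToyExporter`, its string output, the `depth` counter and the `writeStartTag` parameter are hypothetical stand-ins. Only the two flag names and the roles of EndURL(), RunText() and EndRun() come from the analysis above. The sketch assumes that the start tag was never written, without claiming to know why.

```cpp
#include <cassert>
#include <string>

// Toy model of the flag handoff; NOT the real DocxAttributeOutput.
struct ToyExporter {
    std::string out;   // stands in for the XML serializer
    int depth = 0;     // <w:hyperlink> start tags minus end tags written
    bool m_closeHyperlinkInThisRun = false;
    bool m_closeHyperlinkInPreviousRun = false;

    // writeStartTag == false simulates the assumed failure: no start tag.
    void StartURL(bool writeStartTag) {
        if (writeStartTag) { out += "<w:hyperlink>"; ++depth; }
    }
    // EndURL() only records that the hyperlink should be closed.
    void EndURL() { m_closeHyperlinkInThisRun = true; }
    // RunText() hands the close request on to the "previous run" flag.
    void RunText(const std::string& text) {
        out += text;
        if (m_closeHyperlinkInThisRun) {
            m_closeHyperlinkInPreviousRun = true;
            m_closeHyperlinkInThisRun = false;
        }
        assert(!m_closeHyperlinkInThisRun);  // the request was handed on
    }
    // EndRun() writes the end tag whenever the flag is set, as in comment 5.
    void EndRun() {
        if (m_closeHyperlinkInPreviousRun) {
            out += "</w:hyperlink>";
            --depth;
            m_closeHyperlinkInPreviousRun = false;
        }
    }
};

// Drives one link through the model and returns the final exporter state.
ToyExporter exportOneLink(bool writeStartTag) {
    ToyExporter e;
    e.StartURL(writeStartTag);
    e.EndURL();
    e.RunText("link");
    e.EndRun();
    return e;
}
```

With `writeStartTag` set to false the model finishes at depth -1, that is, an end tag with no start tag, which is the shape of the error Word reported in comment 1.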
Comment 8 Korrawit Pruegsanusak 2012-10-07 10:30:25 UTC
Proposed a fix at https://gerrit.libreoffice.org/831
Comment 9 Not Assigned 2012-10-08 07:58:19 UTC
Korrawit Pruegsanusak committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=3b042335208cb2c995f4860bf8ba3bd1e2f2e859

Fix fdo#47669: also check if we started the tag before ending it



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
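
The commit summary above ("also check if we started the tag before ending it") suggests a guard of the following kind. This is a hedged sketch of the idea, not the committed patch: `GuardedExporter`, `startedHyperlink`, `StartHyperlink()` and `RequestClose()` are hypothetical names chosen for illustration, and only m_closeHyperlinkInPreviousRun is taken from the thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of "check if we started the tag before ending it".
struct GuardedExporter {
    std::string out;                            // stands in for the serializer
    bool m_closeHyperlinkInPreviousRun = false;
    bool startedHyperlink = false;              // hypothetical bookkeeping flag

    void StartHyperlink() { out += "<w:hyperlink>"; startedHyperlink = true; }
    void RequestClose() { m_closeHyperlinkInPreviousRun = true; }
    void EndRun() {
        if (m_closeHyperlinkInPreviousRun) {
            if (startedHyperlink) {             // only close what was opened
                out += "</w:hyperlink>";
                startedHyperlink = false;
            }
            m_closeHyperlinkInPreviousRun = false;
        }
        assert(!m_closeHyperlinkInPreviousRun); // request is always consumed
    }
};

// Exports one run of text, optionally opening a hyperlink first.
std::string exportRun(bool openFirst) {
    GuardedExporter e;
    if (openFirst) e.StartHyperlink();
    e.out += "text";
    e.RequestClose();
    e.EndRun();
    return e.out;
}
```

In this model a close request without a preceding start tag is consumed silently, so the output stays balanced in both cases.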
Comment 10 Korrawit Pruegsanusak 2012-10-10 12:26:54 UTC
Fixed in master. I will propose a fix for -3-6 branch later.
Comment 11 Korrawit Pruegsanusak 2012-10-13 06:39:01 UTC
*** Bug 54214 has been marked as a duplicate of this bug. ***
Comment 12 Korrawit Pruegsanusak 2012-10-17 11:04:25 UTC
Proposed for -3-6 and -3-6-3 at http://lists.freedesktop.org/archives/libreoffice/2012-October/039770.html
Comment 13 Not Assigned 2012-10-19 15:13:32 UTC
Korrawit Pruegsanusak committed a patch related to this issue.
It has been pushed to "libreoffice-3-6":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=d401dcc51cec798c8c19febcc25c4325ddafa178&g=libreoffice-3-6

Fix fdo#47669: also check if we started the tag before ending it


It will be available in LibreOffice 3.6.4.

Comment 14 Not Assigned 2012-10-22 13:50:19 UTC
Korrawit Pruegsanusak committed a patch related to this issue.
It has been pushed to "libreoffice-3-6-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=d4287fb59bc2bc6cf54fe0ea2786b48ad3afa982&g=libreoffice-3-6-3

Fix fdo#47669: also check if we started the tag before ending it


It will already be available in LibreOffice 3.6.3.
