Created attachment 58827
UTF-8 HTML text for testing, steps to reproduce included

Steps to reproduce:
- open the attached file in a browser (UTF-8 HTML/text)
- copy the text and paste it into a new document in Writer
- save the file as DOCX
- close the file
- open the saved file
- now everything after the text "půjčk" is missing, although it is still present in the saved DOCX

Tested on:
LibreOffice 3.5.1.2 / Build ID: dc9775d-05ecbee-0851ad3-1586698-727bf66 - WRONG
LibreOffice 3.4.5 / OOO340m1 (Build:502) - OK

I don't have more versions for testing.
Windows Vista 64bit SP2
Confirmed with LibO 3.5.2 RC1. Everything after the text "půjčk" is missing when the file is opened with LibO. Word 2007 can't open the file at all; the error message, in my translation, is: "End tag of the element doesn't match the start tag. Position / component: word/document.xml, line: 2, column: 919". The text is still present in the file when I open it with 7zip. Works fine with LibO 3.3.4 => REGRESSION
Created attachment 58918
File created when following the steps to reproduce. This file can't be opened with Word 2007 and is cut off when opened with LibreOffice (see Comment 1)
The regression already appears in the oldest version in bibisect-3.5.tar.lzma, so it must have been introduced earlier.
I have this problem with LibreOffice Writer from Ubuntu 12.04 as well. Version from Help -> About: LibreOffice 3.5.3.2 Build ID: 350m1(Build:2) Version from Ubuntu package: libreoffice-writer 1:3.5.3-0ubuntu1
I started to look into this, but if I can't complete it before Monday (UTC+7), I'll reassign it back to the list.

The problem is that LibO creates an extra closing element </w:hyperlink> that isn't balanced (i.e., it has no opening tag). A file created in Word 2007 doesn't have this element.

The code mapped to this element is XML_hyperlink, which appears in only 3 places, all in sw/source/filter/ww8/docxattributeoutput.cxx. The only one invoked in this case is in function DocxAttributeOutput::EndRun(), lines 536-540:

    if ( m_closeHyperlinkInPreviousRun )
    {
        m_pSerializer->endElementNS( XML_w, XML_hyperlink );
        m_closeHyperlinkInPreviousRun = false;
    }

So we write only an unbalanced end tag.

Next, m_closeHyperlinkInPreviousRun is true the third time EndRun() is invoked (it is false the first and second times).
(In reply to comment #5)
> Next, m_closeHyperlinkInPreviousRun is true in the third time it is invoked

That invocation is the call AttrOutput().EndRun(); made from:

(gdb) p m_closeHyperlinkInPreviousRun
$3 = true
(gdb) frame 1
#1  0x00002aaacda115ee in MSWordExportBase::OutputTextNode (this=0x7fffffff3ed0, rNode=...)
    at /home/korrawit/core/sw/source/filter/ww8/wrtw8nds.cxx:2044

Searching OpenGrok for m_closeHyperlinkInPreviousRun <http://opengrok.libreoffice.org/search?q=m_closeHyperlinkInPreviousRun&project=core&defs=&refs=&path=&hist=>, I found only one place that changes m_closeHyperlinkInPreviousRun to true: sw/source/filter/ww8/docxattributeoutput.cxx#1062, in function DocxAttributeOutput::RunText(), which in turn depends on m_closeHyperlinkInThisRun:

    if( m_closeHyperlinkInThisRun )
    {
        m_closeHyperlinkInPreviousRun = true;
        m_closeHyperlinkInThisRun = false;
    }
Likewise, m_closeHyperlinkInThisRun is set to true in only one place: docxattributeoutput.cxx line 1253, in function DocxAttributeOutput::EndURL().

EndURL() is called from:

(gdb) frame 1
#1  0x00002aaacd2d2cb2 in SwWW8AttrIter::OutAttrWithRange (this=0x7fffffff3630, nPos=86)
    at /home/korrawit/core/sw/source/filter/ww8/wrtw8nds.cxx:1155
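
To make the flag chain traced above easier to follow, here is a minimal stand-alone model of it. This is only a sketch, not the LibreOffice code: the struct HyperlinkFlagModel, the std::string that stands in for m_pSerializer, and the helper replayUnbalancedClose() are hypothetical, and the call order is an assumption based on the gdb sessions above (EndRun() twice with nothing pending, then EndURL(), RunText(), EndRun()).

```cpp
#include <string>

// Toy stand-in for DocxAttributeOutput (hypothetical). Only the two flags
// quoted in the comments above are modelled, without the m_ prefix.
struct HyperlinkFlagModel
{
    bool closeHyperlinkInThisRun = false;
    bool closeHyperlinkInPreviousRun = false;
    std::string xml; // replaces the real XML serializer

    // EndURL(): requests that the hyperlink be closed.
    void EndURL() { closeHyperlinkInThisRun = true; }

    // RunText(): hands the close request over to the next EndRun().
    void RunText()
    {
        if (closeHyperlinkInThisRun)
        {
            closeHyperlinkInPreviousRun = true;
            closeHyperlinkInThisRun = false;
        }
    }

    // EndRun(): writes the end tag without checking for a start tag.
    void EndRun()
    {
        if (closeHyperlinkInPreviousRun)
        {
            xml += "</w:hyperlink>";
            closeHyperlinkInPreviousRun = false;
        }
    }
};

// Replays the assumed call order; no start tag is ever written.
std::string replayUnbalancedClose()
{
    HyperlinkFlagModel model;
    model.EndRun();  // first call: nothing pending
    model.EndRun();  // second call: nothing pending
    model.EndURL();  // sets closeHyperlinkInThisRun
    model.RunText(); // moves the request to closeHyperlinkInPreviousRun
    model.EndRun();  // third call: emits the end tag
    return model.xml;
}
```

In this model the only thing ever written is a lone end tag, which matches the symptom Word 2007 reports for word/document.xml.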
Proposed a fix at https://gerrit.libreoffice.org/831
Korrawit Pruegsanusak committed a patch related to this issue. It has been pushed to "master": http://cgit.freedesktop.org/libreoffice/core/commit/?id=3b042335208cb2c995f4860bf8ba3bd1e2f2e859 Fix fdo#47669: also check if we started the tag before ending it The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds Affected users are encouraged to test the fix and report feedback.
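
For readers who don't want to open the commit, the idea in the commit message ("also check if we started the tag before ending it") can be sketched as below. This is an illustration under assumptions, not the committed patch: the struct GuardedHyperlinkModel, the guard flag startedHyperlink, the simplified StartURL() that writes the start tag directly, and the helper exportRun() are all hypothetical names.

```cpp
#include <string>

// Toy model of the guarded close (hypothetical, not the real exporter).
struct GuardedHyperlinkModel
{
    bool startedHyperlink = false; // hypothetical guard: start tag was written
    bool closeHyperlinkInThisRun = false;
    bool closeHyperlinkInPreviousRun = false;
    std::string xml; // replaces the real XML serializer

    // Simplified: writes the start tag immediately and remembers that it did.
    void StartURL()
    {
        xml += "<w:hyperlink>";
        startedHyperlink = true;
    }

    void EndURL() { closeHyperlinkInThisRun = true; }

    void RunText()
    {
        if (closeHyperlinkInThisRun)
        {
            closeHyperlinkInPreviousRun = true;
            closeHyperlinkInThisRun = false;
        }
    }

    // Writes the end tag only if a start tag was written earlier.
    void EndRun()
    {
        if (closeHyperlinkInPreviousRun)
        {
            if (startedHyperlink)
            {
                xml += "</w:hyperlink>";
                startedHyperlink = false;
            }
            closeHyperlinkInPreviousRun = false;
        }
    }
};

// Runs one EndURL/RunText/EndRun cycle, with or without a start tag.
std::string exportRun(bool withStartTag)
{
    GuardedHyperlinkModel model;
    if (withStartTag)
        model.StartURL();
    model.EndURL();
    model.RunText();
    model.EndRun();
    return model.xml;
}
```

With the guard in place, exportRun(false) writes nothing, while exportRun(true) writes a balanced pair of tags.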
Fixed in master. I will propose a fix for -3-6 branch later.
*** Bug 54214 has been marked as a duplicate of this bug. ***
Proposed for -3-6 and -3-6-3 at http://lists.freedesktop.org/archives/libreoffice/2012-October/039770.html
Korrawit Pruegsanusak committed a patch related to this issue. It has been pushed to "libreoffice-3-6": http://cgit.freedesktop.org/libreoffice/core/commit/?id=d401dcc51cec798c8c19febcc25c4325ddafa178&g=libreoffice-3-6 Fix fdo#47669: also check if we started the tag before ending it It will be available in LibreOffice 3.6.4.
Korrawit Pruegsanusak committed a patch related to this issue. It has been pushed to "libreoffice-3-6-3": http://cgit.freedesktop.org/libreoffice/core/commit/?id=d4287fb59bc2bc6cf54fe0ea2786b48ad3afa982&g=libreoffice-3-6-3 Fix fdo#47669: also check if we started the tag before ending it It will be available already in LibreOffice 3.6.3.