Description:
When I open .docx files created by the CAT tool "SDL Trados Studio", I get the error messages quoted below. These .docx files contain only bilingual tables. The problem occurs only with .docx files, never with .doc files. Unfortunately, SDL Trados does not export .doc files. Since I'd like to avoid MS Office entirely, I'd be glad if there were a way to open these .docx files correctly without first converting them to .doc or .odt.

Steps to Reproduce:
1. Export a bilingual file (docx) in SDL Trados Studio.
2. Open the exported docx with LibreOffice Writer.

Actual Results:
An error occurred during opening the file. This may be caused by incorrect file contents.
The error details are:
SAXException: [word/document.xml line 1]: unknown error /build/libreoffice-fresh/src/libreoffice-7.0.1.2/sax/source/fastparser/fastparser.cxx:588
Proceed with import may cause data loss or corruption, and application may become unstable or crash.
Do you want to ignore the error and attempt to continue loading the file?

If yes: The file opens with less than half of its content showing.
If no:
File format error found at
C++ code threw N4o3tl14divide_by_zeroE: divide by zero /build/libreoffice-fresh/src/libreoffice-7.0.1.2/bridges/source/cpp_uno/gcc3_linux_x86-64/uno2cpp.cxx:243
SAXParseException: '[word/document.xml line 1]: unknown error /build/libreoffice-fresh/src/libreoffice-7.0.1.2/sax/source/fastparser/fastparser.cxx:588', Stream 'word/document.xml', Line 1, Column 28832 /build/libreoffice-fresh/src/libreoffice-7.0.1.2/writerfilter/source/filter/WriterFilter.cxx:213(row,col).

Expected Results:
The docx file opens and shows its full content correctly.

Reproducible: Always

User Profile Reset: No

Additional Info:
Version: 7.0.1.2
Build ID: 00(Build:2)
CPU threads: 4; OS: Linux 5.8; UI render: default; VCL: gtk3
Locale: de-DE (de_DE.UTF-8); UI: en-US
=7.0.1-1
Calc: threaded
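For anyone who wants to look at what the parser was reading when it gave up: the error names Stream 'word/document.xml', Line 1, Column 28832. A rough Python sketch for printing the surrounding markup follows. It assumes the column is 1-based and counts characters of a UTF-8 document, which may not match exactly how LibreOffice's parser counts; the function names are made up for this example.

```python
import zipfile


def read_document_xml(docx_path):
    """Return the raw bytes of word/document.xml from a .docx (ZIP) file."""
    with zipfile.ZipFile(docx_path) as zf:
        return zf.read("word/document.xml")


def context_at_column(xml_bytes, column, radius=60):
    """Return the text around a 1-based column on the first line of the XML.

    Assumption: the reported column counts characters, not bytes.
    """
    first_line = xml_bytes.decode("utf-8", errors="replace").split("\n", 1)[0]
    start = max(0, column - 1 - radius)
    return first_line[start:column - 1 + radius]


# Hypothetical usage; "export.docx" stands in for the Trados export:
# print(context_at_column(read_document_xml("export.docx"), 28832))
```

The output is only a hint about which element was being processed at that point, not proof of what is wrong with it.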
We cannot follow those steps; you need to attach the .docx file. Note that there are known problems with opening generated files, as you can see in other bug reports, so this may be a bug, a duplicate, or not a bug.
(In reply to Johannes Wülk from comment #0)
You need to attach the doc file.
Created attachment 166243 [details] Docx file
Sorry, I couldn't find the option to attach the file at first, but now I have. Thank you for your time. The file opens fine in other office programs.
I confirm the issue with the attached DOCX in LO 7.1+. There is no issue if the DOCX is first resaved in MSO, which suggests it's probably not a proper DOCX. This may still be NotOurBug or WontFix.
DOCX:
<?xml version="1.0" encoding="utf-8"?>
<w:document
  xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas"
  xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
  xmlns:o="urn:schemas-microsoft-com:office:office"
  xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
  xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
  xmlns:v="urn:schemas-microsoft-com:vml"
  xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing"
  xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
  xmlns:w10="urn:schemas-microsoft-com:office:word"
  xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
  xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml"
  xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup"
  xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk"
  xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"
  xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape"
  mc:Ignorable="w14 wp14">
<w:body>

Resaved in MSO:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document
  xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas"
  xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex"
  xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex"
  xmlns:cx2="http://schemas.microsoft.com/office/drawing/2015/10/21/chartex"
  xmlns:cx3="http://schemas.microsoft.com/office/drawing/2016/5/9/chartex"
  xmlns:cx4="http://schemas.microsoft.com/office/drawing/2016/5/10/chartex"
  xmlns:cx5="http://schemas.microsoft.com/office/drawing/2016/5/11/chartex"
  xmlns:cx6="http://schemas.microsoft.com/office/drawing/2016/5/12/chartex"
  xmlns:cx7="http://schemas.microsoft.com/office/drawing/2016/5/13/chartex"
  xmlns:cx8="http://schemas.microsoft.com/office/drawing/2016/5/14/chartex"
  xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
  xmlns:aink="http://schemas.microsoft.com/office/drawing/2016/ink"
  xmlns:am3d="http://schemas.microsoft.com/office/drawing/2017/model3d"
  xmlns:o="urn:schemas-microsoft-com:office:office"
  xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
  xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
  xmlns:v="urn:schemas-microsoft-com:vml"
  xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing"
  xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
  xmlns:w10="urn:schemas-microsoft-com:office:word"
  xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
  xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml"
  xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml"
  xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid"
  xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex"
  xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup"
  xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk"
  xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"
  xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape"
  mc:Ignorable="w14 w15 w16se w16cid wp14">
<w:body>
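The two headers above were compared by eye. A small Python sketch like the following could list which namespace prefixes one file declares and the other does not. The function names are invented for this example, and a difference in declared prefixes does not by itself explain the import failure.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_document_xml(docx_path):
    """Return the raw bytes of word/document.xml from a .docx (ZIP) file."""
    with zipfile.ZipFile(docx_path) as zf:
        return zf.read("word/document.xml")


def namespaces_in_xml(xml_bytes):
    """Return {prefix: uri} for every namespace declared in the XML."""
    namespaces = {}
    # "start-ns" events yield (prefix, uri) pairs as declarations are parsed.
    for _event, (prefix, uri) in ET.iterparse(io.BytesIO(xml_bytes),
                                              events=("start-ns",)):
        namespaces.setdefault(prefix, uri)
    return namespaces


def prefix_diff(xml_a, xml_b):
    """Return (prefixes only in A, prefixes only in B), each sorted."""
    a, b = namespaces_in_xml(xml_a), namespaces_in_xml(xml_b)
    return sorted(a.keys() - b.keys()), sorted(b.keys() - a.keys())


# Hypothetical usage; both file names are placeholders:
# only_orig, only_resaved = prefix_diff(read_document_xml("trados.docx"),
#                                       read_document_xml("resaved.docx"))
```

For the two headers quoted here, the second list would be expected to contain the cx*, aink, am3d, w15, w16cid and w16se prefixes that MSO adds on resave.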
I confirm that re-saving the docx in MSO fixes the issue for now.
There is no longer a SAXException since:

https://cgit.freedesktop.org/libreoffice/core/commit/?id=67d41607ad3b97abbb939a989e491af932e985a7
author Aron Budea <aron.budea@collabora.com> 2021-02-28 22:04:24 +0100
committer Caolán McNamara <caolanm@redhat.com> 2021-03-01 10:18:06 +0100

tdf#140137 Don't throw exception when w:gridCol is missing "w" attr

However, the result is still bad: upon opening, every paragraph gets a tracked formatting change that is not present at all in Word. Let's refocus this bug on that.
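The commit subject points at w:gridCol elements without a "w" (width) attribute. To check whether a given file has such columns, a sketch along these lines could be used. It only inspects word/document.xml, and the function names are made up for illustration; it is not how the import filter itself does the check.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, as declared in the document header.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(docx_path):
    """Return the raw bytes of word/document.xml from a .docx (ZIP) file."""
    with zipfile.ZipFile(docx_path) as zf:
        return zf.read("word/document.xml")


def gridcols_missing_width(xml_bytes):
    """Return (missing, total): w:gridCol elements lacking w:w, and all of them."""
    root = ET.fromstring(xml_bytes)
    cols = list(root.iter(f"{{{W_NS}}}gridCol"))
    # ElementTree stores namespaced attributes under "{uri}name" keys.
    missing = [c for c in cols if f"{{{W_NS}}}w" not in c.attrib]
    return len(missing), len(cols)


# Hypothetical usage; "export.docx" is a placeholder file name:
# print(gridcols_missing_width(read_document_xml("export.docx")))
```

A non-zero first number would be consistent with the case the commit handles.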
Created attachment 173743 [details]
The document in Word and current Writer

Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 0cda081c9aa3b3dcb363f97bac60c845ce9a13e0
CPU threads: 4; OS: Windows 10.0 Build 18363; UI render: Skia/Raster; VCL: win
Locale: hu-HU (hu_HU); UI: en-US
Calc: CL
Caolán McNamara committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/2be207ed8969a96da8bdc0ffd7f2a2215233ee4a

crashtesting: crash on re-export of tdf137357-1.docx to docx

It will be available in 7.3.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Created attachment 186278 [details]
The example file in Word 2016 and Writer

How it looks in 7.5. It turns out there is change-tracking information in the original that spans multiple rows of the table. In Writer each cell gets its own change, so the number of entries grows from 14 in Word to 86*3 in Writer, which is quite annoying to approve/reject :).
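To see what tracked-change markup the original file actually carries, independent of either UI, the revision elements in word/document.xml can be counted. The sketch below uses a non-exhaustive set of WordprocessingML revision element names and invented function names. The raw element count is not expected to equal the number of entries shown by Word or Writer, since each application groups changes in its own way.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Non-exhaustive selection of revision-tracking elements (an assumption
# made for this sketch; the schema defines more).
TRACKED = {"ins", "del", "moveFrom", "moveTo", "rPrChange", "pPrChange",
           "tblPrChange", "trPrChange", "tcPrChange", "sectPrChange"}


def read_document_xml(docx_path):
    """Return the raw bytes of word/document.xml from a .docx (ZIP) file."""
    with zipfile.ZipFile(docx_path) as zf:
        return zf.read("word/document.xml")


def count_tracked_changes(xml_bytes):
    """Count revision elements in the w: namespace, keyed by local name."""
    prefix = "{" + W_NS + "}"
    counts = Counter()
    for el in ET.fromstring(xml_bytes).iter():
        if el.tag.startswith(prefix) and el.tag[len(prefix):] in TRACKED:
            counts[el.tag[len(prefix):]] += 1
    return counts


# Hypothetical usage; "export.docx" is a placeholder file name:
# print(count_tracked_changes(read_document_xml("export.docx")))
```

Running it on the original and on a copy saved from Writer might show whether the extra entries come from additional revision elements or only from how they are displayed.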