Description:
The attached .docx files give an error when I try to open them. The files are common Word forms that can be downloaded from a government website.

Steps to Reproduce:
1. Access the file from the link provided
2. Click to open
3. Error appears

Actual Results:
An error window appears: "General Error. General input/output error."

Expected Results:
The file opens properly for editing.

Reproducible: Always

User Profile Reset: Yes

Additional Info:
Here is the link to the file: https://www.landgate.wa.gov.au/siteassets/forms/strata-titles-forms/approved/application-to-register-strata-titles-scheme-2020-43934-1.docx
Created attachment 191017 [details] DOCX file that triggers the error
I got the error message about a corrupted file too:

SAXException: [word/footnotes.xml line 2]: unknown error at C:/cygwin/home/tdf/jenkins/daily_workspace/tb/src_master/sax/source/fastparser/fastparser.cxx:604

But I could open the file after pressing YES in the dialog in

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 9602f8a9318dd4d3409856e2ae06abe96e72b51b
CPU threads: 8; OS: Windows 10.0 Build 19043; UI render: Skia/Vulkan; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: threaded

but

Version: 7.5.0.0.alpha1+ (X86_64) / LibreOffice Community
Build ID: 5b18eebc2c95321ce7e6edf10f4df81557382a48
CPU threads: 8; OS: Windows 10.0 Build 19043; UI render: Skia/Raster; VCL: win
Locale: ru-RU (ru_RU); UI: ru-RU
Calc: threaded

opens the file just fine without any error => regression
Created attachment 191028 [details]
gdb bt

On a Debian x86-64 PC with master sources updated today, I could reproduce this.
Let's increase the importance since:
- it's a regression
- it's on several platforms
- it prevents the user from opening the file
Seems to have started in:

https://git.libreoffice.org/core/+/5082d50d24c3fec4487c724a15eb0d54a82ecd0d
author Jaume Pujantell <jaume.pujantell@collabora.com> Wed Sep 13 08:58:21 2023 +0200
committer Jaume Pujantell <jaume.pujantell@collabora.com> Wed Oct 11 15:19:58 2023 +0200

writerfilter: use content controls for text in block SDTs

Adding CC to: Jaume Pujantell
It seems there are two issues here, because LibreOffice hangs in

Version: 7.6.5.0.0+ (X86_64) / LibreOffice Community
Build ID: ebc730f8657af18412c47fc671c3c459072da0ea
CPU threads: 8; OS: Linux 6.1; UI render: default; VCL: gtk3
Locale: es-ES (es_ES.UTF-8); UI: en-US
Calc: threaded
(In reply to Roman Kuznetsov from comment #2)
> SAXException: [word/footnotes.xml line 2]: unknown error at
> C:/cygwin/home/tdf/jenkins/daily_workspace/tb/src_master/sax/source/
> fastparser/fastparser.cxx:604

To get a more accurate location in the file in a case like this, one can:
- unzip the OOXML file,
- on Linux/Cygwin with xmllint installed, run the following in the directory holding the extracted content:
  find . -name "*.xml" -type f -exec xmllint --output '{}' --format '{}' \;
- rezip the extracted files, give the archive the correct OOXML extension (e.g. .docx), and open it in LO again.

Now the error says this:

"File format error found at
SAXParseException: '[word/footnotes.xml line 183]: unknown error', Stream 'word/footnotes.xml', Line 183, Column 1
SAXParseException: '[word/document.xml line 1088]: unknown error', Stream 'word/document.xml', Line 1088, Column 17(row,col)."

word/footnotes.xml line 183 is the end of the file, which isn't particularly helpful.
word/document.xml line 1088, column 17 points to the beginning of this element: <w:color w:val="839E25"/>

Neither of those made sense to me at first sight, but perhaps they will to whoever eventually looks at this bug.
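For anyone without xmllint at hand, the unzip / reformat / rezip steps above could be sketched in Python roughly as follows. This is only a minimal sketch under some assumptions: the function name pretty_print_ooxml is made up for illustration, it relies only on the standard zipfile and xml.dom.minidom modules, and minidom's indentation is not identical to what xmllint --format produces, so the reported line numbers may differ from the ones quoted above. The rewritten file is meant for locating the parse error only, not for further editing.

```python
import zipfile
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def pretty_print_ooxml(src_path, dst_path):
    """Copy an OOXML package, re-indenting every *.xml part.

    Hypothetical helper (not part of LibreOffice): with one element per
    line, a parser error such as "line 2" becomes a usable line number.
    """
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.endswith(".xml"):
                try:
                    # toprettyxml returns bytes when an encoding is given
                    data = minidom.parseString(data).toprettyxml(
                        indent="  ", encoding="UTF-8")
                except ExpatError:
                    # Not well-formed: keep the part byte-for-byte
                    pass
            # Reusing the ZipInfo keeps the original name and timestamps
            dst.writestr(info, data)
```

Usage would be something like pretty_print_ooxml("form.docx", "form-pretty.docx") (both file names are placeholders), then opening the second file in LO and reading the line number from the error dialog.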
Opening of a single file is not Major but Normal.

But the bisect does not look good. Before 5082d50d24c3fec4487c724a15eb0d54a82ecd0d the file already does not open for me.

Not opening started with:

commit c303981cfd95ce1c3881366023d5495ae2edce97
author Michael Stahl <michael.stahl@allotropia.de> Wed Aug 23 15:50:59 2023 +0200
committer Michael Stahl <michael.stahl@allotropia.de> Thu Aug 24 12:43:25 2023 +0200

tdf#156724 sw: layout: fix tables not splitting due to footnotes differently

Revert commit 610c6f02b11b4b4c555a78b0feb2a1eb35159e39 and 61a78a523a6131ff98b5d846368e5626fe58d99c
instead do the opposite: never calc content frames in FormatLayout().

But before that, the file would open and then crash. Crashing started with:

commit 6fd755fb36472938757b2581cbe99f5e5fe1ae40
author Noel Grandin <noel.grandin@collabora.co.uk> Tue Aug 22 14:44:38 2023 +0200
committer Noel Grandin <noel.grandin@collabora.co.uk> Wed Aug 23 18:25:04 2023 +0200

tdf#100894 lots of Conditional formatting freeze calc Styles sidebar

use bulk_insert_for_each to reduce time spent adding to treebox

Takes it from "who knows, I gave up", to about 10 seconds on my machine

Before that, when the file would open, I could write anywhere instead of only in the allowed parts, and could not tick the checkboxes.
So this does not really sound like a regression, but rather like a file that has never opened correctly.