Bug 163295 - LibreOffice crashes when processing XML files containing the string "pwi".
Summary: LibreOffice crashes when processing XML files containing the string "pwi".
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: sdk
Version: 7.4.7.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Mike Kaganski
URL:
Whiteboard: target:25.2.0 target:24.8.3
Keywords:
Depends on:
Blocks:
 
Reported: 2024-10-04 09:57 UTC by marcel.hoedl
Modified: 2024-10-11 16:50 UTC
CC List: 1 user

See Also:
Crash report or crash signature:


Attachments
text.xml (8 bytes, text/xml)
2024-10-04 10:24 UTC, marcel.hoedl
test.xml (56 bytes, text/xml)
2024-10-04 16:02 UTC, marcel.hoedl

Description marcel.hoedl 2024-10-04 09:57:59 UTC
Description:
Whenever I try to convert an XML file to PDF using LibreOffice, the application crashes if the string "pwi" appears anywhere in the XML file. This happens even if the string is not part of the XML declaration but embedded somewhere else in the document. The issue occurs consistently with any valid XML file containing this specific string. I suspect there is an issue with how LibreOffice parses this combination of characters in XML documents. Removing "pwi" prevents the crash.

Steps to Reproduce:
1. Create an XML file with the following content: "<?xmlpwi"
2. Call the command "soffice --nodefault --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf:writer_pdf_Export --outdir . <<CREATED_FILE>>.xml"
3. The file is not created and no error is thrown.

4. Create an XML file with the following content: "<?xmlpwa"
5. Call the command "soffice --nodefault --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf:writer_pdf_Export --outdir . <<CREATED_FILE>>.xml"
6. The file is created.
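
The steps above can be scripted roughly as follows. This is only a minimal sketch, assuming a POSIX shell; the file names pwi.xml and pwa.xml are illustrative choices and not part of the original report. The soffice options are copied from the steps above, and the conversion is skipped if soffice is not on the PATH.

```shell
#!/bin/sh
# Reproduction sketch for the steps above.
# Assumptions: POSIX shell; the file names pwi.xml / pwa.xml are illustrative.

# printf '%s' writes the string as-is, with no trailing newline.
printf '%s' '<?xmlpwi' > pwi.xml   # reported to produce no PDF
printf '%s' '<?xmlpwa' > pwa.xml   # reported to convert

if command -v soffice >/dev/null 2>&1; then
    for f in pwi.xml pwa.xml; do
        rm -f "${f%.xml}.pdf"   # drop stale output from earlier runs
        # Options copied verbatim from the report.
        soffice --nodefault --nofirststartwizard --nolockcheck --nologo --norestore \
            --convert-to pdf:writer_pdf_Export --outdir . "$f"
        if [ -f "${f%.xml}.pdf" ]; then
            echo "$f: PDF created"
        else
            echo "$f: no PDF created"
        fi
    done
else
    echo "soffice not found on PATH; sample files written to the current directory"
fi
```

On an affected version, the report suggests the script would print "no PDF created" for pwi.xml and "PDF created" for pwa.xml.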

Actual Results:
LibreOffice crashes every time when processing the XML file with the string "pwi".

Expected Results:
LibreOffice should process the XML file correctly and generate a PDF without crashing.


Reproducible: Always


User Profile Reset: Yes

Additional Info:
Removing or replacing the string "pwi" resolves the issue, but this is not a suitable workaround, as I cannot control the content of all XML files. This issue may be linked to how the LibreOffice XML parser handles certain character sequences.
Comment 1 Julien Nabet 2024-10-04 10:21:15 UTC
Could you provide an example XML file so it'll be quicker to reproduce?

Also, could you give a recent LO version like 24.2.6 a try?
Comment 2 marcel.hoedl 2024-10-04 10:24:31 UTC
Created attachment 196887 [details]
text.xml
Comment 3 marcel.hoedl 2024-10-04 10:25:39 UTC
7.6.7 is just the earliest version we know of that has this bug; we are already on 24.2.6 and we get the same behaviour.
Comment 4 Mike Kaganski 2024-10-04 10:43:14 UTC
(In reply to marcel.hoedl from comment #2)
> Created attachment 196887 [details]

That attachment is *not* an XML file at all. While it may be reasonable to expect LibreOffice to open it as plain text, still your report talks about *XML*, and so, it is also a valid behavior to just ignore invalid XML. Please provide a valid XML showing this problem.
Comment 5 marcel.hoedl 2024-10-04 11:30:13 UTC
If you replace "pwi" with "pwa", LibreOffice converts the file.
I know that's not valid XML; I just tried to give you a minimal example of the issue.
Comment 6 Mike Kaganski 2024-10-04 11:55:15 UTC
(In reply to marcel.hoedl from comment #5)

GIGO principle doesn't require that different samples of garbage produce the same result. But well, if you decide that a request for a sample doesn't deserve attention, so be it.
Comment 7 marcel.hoedl 2024-10-04 16:02:44 UTC
Created attachment 196897 [details]
test.xml

I was answering the last comment from my smartphone.
I did not want to create an XML file on this device.

Here it is.
Comment 8 Mike Kaganski 2024-10-07 13:43:36 UTC
https://gerrit.libreoffice.org/c/core/+/174607
Comment 9 Commit Notification 2024-10-07 14:56:15 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/8f25697591ecfd615a3142528ca13ee4d0d2c562

tdf#163295: XMLFilterDetect: make sure to only detect own types

It will be available in 25.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 10 marcel.hoedl 2024-10-08 07:19:45 UTC
The latest dev version worked for us.
Do you already have an ETA?
Comment 11 Mike Kaganski 2024-10-08 07:49:11 UTC
(In reply to marcel.hoedl from comment #10)

https://wiki.documentfoundation.org/ReleasePlan
Comment 12 Commit Notification 2024-10-11 16:50:38 UTC
Mike Kaganski committed a patch related to this issue.
It has been pushed to "libreoffice-24-8":

https://git.libreoffice.org/core/commit/9fcc9fe94f5073632c4fc50b153767acfa8f87ff

tdf#163295: XMLFilterDetect: make sure to only detect own types

It will be available in 24.8.3.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.