Bug 155903 - Writer: Error opening a .docx file containing audio inserted files
Summary: Writer: Error opening a .docx file containing audio inserted files
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 7.3.3.2 release
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Tünde Tóth
URL:
Whiteboard: target:24.2.0 target:7.6.0.0.beta2
Keywords: bibisected, bisected, dataLoss, filter:docx, regression
Depends on:
Blocks: DOCX-Corrupted Media
Reported: 2023-06-17 20:54 UTC by FLS
Modified: 2023-07-05 16:27 UTC
CC List: 4 users

See Also:
Crash report or crash signature:


Attachments
Sample of a corrupted file (98.52 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-06-17 20:59 UTC, FLS
Screenshot error with word. (18.99 KB, image/png)
2023-06-17 22:20 UTC, m_a_riosv
A sample file odt format with audio .mp3 embedded (106.12 KB, application/vnd.oasis.opendocument.text)
2023-06-19 13:08 UTC, FLS
Same sample file saved as .docx file (Word 2007-365) (98.62 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2023-06-19 13:11 UTC, FLS

Description FLS 2023-06-17 20:54:08 UTC
Description:
When a document is created and saved in the .docx format, no problem is reported while saving.

But after closing the file normally, an error occurs when attempting to open it again. The following error message appears:

Error details:
SAXException: [word/document.xml line 2]: Namespace prefix p14 on media is not defined

If you continue with the import, data loss or corruption may occur, and the application may become unstable or close unexpectedly.

If the import is continued, only the content before the first embedded audio file is imported. The rest is lost.
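
A quick way to confirm this kind of failure outside of Writer is to read word/document.xml from the .docx package (a ZIP archive) and run it through a namespace-aware XML parser. The following is a minimal sketch under stated assumptions, not LibreOffice's import code: the helper names check_document_xml and make_package and the urn:example namespaces are hypothetical, and Python's parser words the error differently from the SAXException quoted above.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_document_xml(docx):
    """Return None if word/document.xml is well-formed, else the parser's message.

    `docx` may be a file path or a file-like object holding the .docx package.
    """
    with zipfile.ZipFile(docx) as package:
        data = package.read("word/document.xml")
    try:
        # ElementTree is namespace-aware, so an undeclared prefix is a parse error.
        ET.fromstring(data)
    except ET.ParseError as exc:
        return str(exc)
    return None


def make_package(document_xml):
    """Build a tiny in-memory ZIP holding only word/document.xml (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document_xml)
    buffer.seek(0)
    return buffer
```

Calling check_document_xml with the path of a saved .docx file should return a message when a prefix such as p14 is used without a matching xmlns:p14 declaration, and None when the XML is well-formed.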

Steps to Reproduce:
1. Open the file submitted in the attachment
2. Observe the error report

Actual Results:
The same error always occurs when opening the file.

Expected Results:
To be able to save documents with embedded audio files in the .docx format.


Reproducible: Always


User Profile Reset: No

Additional Info:
[Information automatically included from LibreOffice]
Locale: es
Module: TextDocument
[Information guessed from browser]
OS: Windows (All)
OS is 64bit: no
Comment 1 FLS 2023-06-17 20:59:49 UTC
Created attachment 187966
Sample of a corrupted file
Comment 2 FLS 2023-06-17 21:46:53 UTC
The same file saved in the native ODF format is OK. It has NO PROBLEM AT ALL.
That means the problem arises when converting the file to the .docx format.
The bug is inside the converter.
Comment 3 m_a_riosv 2023-06-17 22:20:02 UTC
Created attachment 187967
Screenshot error with word.

Opening the file with Word shows an error window.
Comment 4 Stéphane Guillou (stragu) 2023-06-19 11:54:57 UTC
Can you please share a simple ODT file that has several audio files embedded, so we can test saving it as DOCX?

https://bugs.documentfoundation.org/attachment.cgi?bugid=155903&action=enter
Comment 5 FLS 2023-06-19 13:08:07 UTC
Created attachment 187992
A sample file odt format with audio .mp3 embedded

In response to your request, please find the sample file attached.
Comment 6 FLS 2023-06-19 13:11:12 UTC
Created attachment 187993
Same sample file saved as .docx file (Word 2007-365)

This file has been produced by saving the original .odt file as a .docx file.

Best wishes in the debugging process!
Comment 7 Stéphane Guillou (stragu) 2023-06-19 17:31:19 UTC
Thank you!
Reproduced with attachment 187992 saved as DOCX, with:

Version: 7.5.4.2 (X86_64) / LibreOffice Community
Build ID: 36ccfdc35048b057fd9854c757a8b67ec53977b6
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: fr-FR (en_AU.UTF-8); UI: en-US
Calc: threaded
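
For repeated testing, saving the ODT sample as DOCX can also be scripted with LibreOffice's headless conversion instead of the Save As dialog. This is only a sketch: it assumes the soffice binary is on PATH, the function names are hypothetical, and it has not been checked that the command-line conversion goes through exactly the same export path as the UI.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, outdir, soffice="soffice"):
    """Return the argument list for a headless ODT -> DOCX conversion."""
    return [
        soffice,
        "--headless",
        "--convert-to", "docx",
        "--outdir", str(outdir),
        str(source),
    ]


def convert_to_docx(source, outdir, soffice="soffice"):
    """Run the conversion and return the expected path of the .docx output."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice} not found on PATH")
    subprocess.run(build_convert_command(source, outdir, soffice), check=True)
    return Path(outdir) / (Path(source).stem + ".docx")
```

For example, convert_to_docx("sample.odt", "out") would be expected to write out/sample.docx, where sample.odt stands for a local copy of the attached ODT file.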

Before 7.3, no error message appeared on opening the file, but the audio was lost. Office.com can open the file.

Since 7.3, error message appears:

"An error occurred during opening the file. This may be caused by incorrect file contents.
The error details are:
SAXException: [word/document.xml line 2]: Namespace prefix p14 on media is not defined at /home/buildslave/source/libo-core/sax/source/fastparser/fastparser.cxx:615
Proceeding with import may cause data loss or corruption, and application may become unstable or crash."

- If "No" is selected when asked to ignore the issue, another error message appears before closing.
- If "Yes" is selected, more than the audio is lost (audio, some text and one comment).

Office.com can't open the file.
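
Because a parser stops at the first undeclared prefix, it can help to list every prefix that is used but never declared in word/document.xml. The sketch below is a rough regex heuristic rather than a real XML parser: it ignores namespace scoping, comments and CDATA, and the function name undeclared_prefixes is hypothetical.

```python
import re

# Namespace declarations such as xmlns:w="..."
DECLARED = re.compile(r'\bxmlns:([A-Za-z_][\w.-]*)\s*=')
# Prefixed element names in start tags such as <p14:media
USED_ELEMENT = re.compile(r'<([A-Za-z_][\w.-]*):[A-Za-z_][\w.-]*')
# Prefixed attribute names such as r:embed="..."
USED_ATTR = re.compile(r'\s([A-Za-z_][\w.-]*):[A-Za-z_][\w.-]*\s*=')


def undeclared_prefixes(xml_text):
    """Return a sorted list of prefixes that are used but have no xmlns: declaration."""
    # "xml" and "xmlns" are predefined and never need a declaration.
    declared = set(DECLARED.findall(xml_text)) | {"xml", "xmlns"}
    used = set(USED_ELEMENT.findall(xml_text)) | set(USED_ATTR.findall(xml_text))
    return sorted(used - declared)
```

Run on the text of word/document.xml from the broken export, a result containing "p14" would match the error message quoted above.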

Bibisected with linux-64-7.3 repository to first bad commit 46b3c646a981467de20bede453aebb9b846824c0 which points to b64c55169d72bfde6aee00673a56d5c25acfd4d4 which is a cherrypick of:

commit bc72514f90d90e1ab3fed8167663e835edf03508
author	Tünde Tóth <toth.tunde@nisz.hu>	Thu Mar 24 16:54:01 2022 +0100
committer	László Németh <nemeth@numbertext.org>	Tue Mar 29 12:59:55 2022 +0200
tdf#53970 PPTX: fix export of embedded media files
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/132095

Tünde can you please have a look?
Comment 8 FLS 2023-06-21 02:33:51 UTC
Today I tried the PDF converter on this file, and it works perfectly. So solving the bug in the .docx conversion is no longer a high priority.
Thanks to all.
Comment 9 Commit Notification 2023-06-28 11:39:08 UTC
Tünde Tóth committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/863a32171ed8efdf1aaee59918e49613e7ccd7a9

tdf155903 DOCX export: fix corrupt file with embedded media

It will be available in 24.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 10 Commit Notification 2023-06-30 11:34:45 UTC
Tünde Tóth committed a patch related to this issue.
It has been pushed to "libreoffice-7-6":

https://git.libreoffice.org/core/commit/4b323aebda7fcafbcb3e37101f0b0dc882fb824b

tdf155903 DOCX export: fix corrupt file with embedded media

It will be available in 7.6.0.0.beta2.
Comment 11 Stéphane Guillou (stragu) 2023-07-05 16:27:37 UTC
Tünde, I tested the fix but the exported DOCX does not have the audio anymore. That's an expected DOCX limitation, right?

Version: 24.2.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: d74344f6cae0cf1c12f08249c8f49be1374fb98f
CPU threads: 8; OS: Linux 5.15; UI render: default; VCL: gtk3
Locale: en-AU (en_AU.UTF-8); UI: en-US
Calc: threaded