Bug 119087 - FILESAVE PPTX: PowerPoint wants to repair the file after roundtrip and cannot open it
Status: VERIFIED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Impress
Version (earliest affected): 6.1.0.0.alpha0+
Hardware: All All
Importance: medium normal
Assignee: Samuel Mehrbrodt (allotropia)
URL:
Whiteboard: target:7.0.0 target:6.4.4
Keywords: bibisected, bisected, regression
Depends on:
Blocks: PPTX-Corrupted
Reported: 2018-08-03 21:45 UTC by Regina Henschel
Modified: 2020-04-29 08:27 UTC

Attachments
File produced by PowerPoint (14.68 KB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2018-08-03 21:45 UTC, Regina Henschel
File saved by LibreOffice 6102 (22.22 KB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2018-08-03 21:46 UTC, Regina Henschel
File saved by LibreOffice 6042 (20.38 KB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2018-08-03 21:47 UTC, Regina Henschel

Description Regina Henschel 2018-08-03 21:45:13 UTC
Created attachment 143960 [details]
File produced by PowerPoint

The attached document contains a simple WordArt without transformation or special effects like glow or shadow.
Open the file in LibreOffice and save it under a new name.
Open the saved file in PowerPoint. PowerPoint offers to repair it, but the repair fails and the file cannot be opened.

The file was produced by PowerPoint 365, Version 1807.

Opening and saving worked without problems in Version: 6.0.4.2 (x64)
Build ID: 9b0d9b32d5dcda91d2f1a96dc04c645c450872bf
CPU threads: 8; OS: Windows 10.0; UI render: default; 
Locale: de-DE (en_US); Calc: CL

It fails with Version: 6.1.0.2 (x64)
Build ID: b3972dcf1284967612d5ee04fea9d15bcf0cc106
CPU threads: 8; OS: Windows 10.0; UI render: default; 
Locale: de-DE (en_US); Calc: CL
Comment 1 Regina Henschel 2018-08-03 21:46:17 UTC
Created attachment 143961 [details]
File saved by LibreOffice 6102
Comment 2 Regina Henschel 2018-08-03 21:47:20 UTC
Created attachment 143962 [details]
File saved by LibreOffice 6042
Comment 3 Regina Henschel 2018-08-03 22:10:47 UTC
If you unpack the corrupted file and open the part presentation.xml, you will see an end-tag </p:presentation> with content after it. At the end of the part presentation.xml you see an end-tag </p:defaultTextStyle> which has no start-tag. The file is indeed corrupt.
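
The check described above can be scripted instead of done by hand. A minimal Python sketch using only the standard library; the function name `part_parse_error` is made up for illustration and is not part of any LibreOffice tooling:

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def part_parse_error(pptx_bytes, part="ppt/presentation.xml"):
    """Return None if `part` is well-formed XML, else the parser's message.

    `pptx_bytes` is the raw content of a .pptx file, which is a zip container.
    """
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as package:
        data = package.read(part)
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # Content after the root end-tag, or an end-tag without a
        # start-tag, ends up here.
        return str(err)
    return None
```

Called on the bytes of one of the attached files, it should return None for a well-formed part and an error message for a part with trailing content after </p:presentation>.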
Comment 4 Regina Henschel 2018-08-05 15:15:47 UTC
The error only occurs if the file was saved as OOXML strict by PowerPoint.

The faulty "presentation.xml" looks as if the new exported elements are written on top of the original file, so that there is a leftover from the original file starting around column 667 in line 2.
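
To illustrate the symptom only (this is not LibreOffice's code, and the real cause may differ in detail): writing shorter content over an existing stream without truncating it leaves the tail of the old content in place. A small Python sketch with hypothetical helper names:

```python
import io


def overwrite_without_truncate(original: bytes, new: bytes) -> bytes:
    """Write `new` at offset 0 of a stream holding `original`, no truncate."""
    stream = io.BytesIO(original)
    stream.write(new)  # bytes beyond len(new) are left untouched
    return stream.getvalue()


def overwrite_with_truncate(original: bytes, new: bytes) -> bytes:
    """Same write, but cut the stream off right after the new content."""
    stream = io.BytesIO(original)
    stream.write(new)
    stream.truncate()  # drops everything after the current position
    return stream.getvalue()
```

When `new` is shorter than `original`, the first variant returns the new content followed by the remainder of the old one, which matches the shape of the faulty part: a complete document followed by a leftover ending in an unmatched end-tag.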
Comment 5 Xisco Faulí 2018-08-06 10:31:42 UTC
Regression introduced by:

https://cgit.freedesktop.org/libreoffice/core/commit/?id=8f79f22a8d4b1c2d209c55cd618c24428960088f

author	Ashod Nakashian <ashod.nakashian@collabora.co.uk>	2018-03-06 22:43:34 -0500
committer	Jan Holesovsky <kendy@collabora.com>	2018-03-08 12:40:19 +0100
commit 8f79f22a8d4b1c2d209c55cd618c24428960088f (patch)
tree 0447e43cdca126688699ffd76665c130247b7384
parent 4de1c0223ceb76556ff1c20000b4ea95bfc1d2a0 (diff)
oox: preserve the ContentType of custom files
Generic logic to preserve custom files with
their correct ContentType. Standard default
file extensions with respective ContentType
preserved in [Content_Types].xml.

Bisected with: bibisect-linux64-6.1

Adding Cc: to Ashod Nakashian
Comment 6 Ashod Nakashian 2018-08-08 02:05:36 UTC
(In reply to Xisco Faulí from comment #5)
> Regression introduced by:
> 
> https://cgit.freedesktop.org/libreoffice/core/commit/
> ?id=8f79f22a8d4b1c2d209c55cd618c24428960088f
> 

Thanks Xisco for chasing this.

Can you provide more info please? I'm not sure what 'strict mode' is or how to set it (in fact I don't have MSO). Also, can you provide an example of the output you describe in the previous comment regarding the leftover please? Perhaps a screenshot between expected and actual?

Also, how common is this issue? 

Thanks
Comment 7 Regina Henschel 2018-08-08 09:07:41 UTC
(In reply to Ashod Nakashian from comment #6)
 
> Can you provide more info please? I'm not sure what 'strict mode' is or how
> to set it (in fact I don't have MSO).

The pptx format is specified in ISO/IEC 29500. Microsoft may add extensions to that format (e.g. for compatibility with the old ppt format) that are not specified and are not handled by the extension mechanism the specification provides. In MSO you can choose whether to use these extensions or to save in 'strict' format. The specification describes the conformance classes in §2.1 of Part 1. In the file presentation.xml, the element <p:presentation> carries the attribute 'conformance="strict"'. The other attribute value is 'transitional', which is also the default when the attribute is missing.
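
For readers without MSO, the conformance class of a file can be read straight from presentation.xml. A minimal Python sketch; the function name `conformance_class` is hypothetical, and the default follows the rule stated above:

```python
import xml.etree.ElementTree as ET


def conformance_class(presentation_xml):
    """Return the value of the root element's 'conformance' attribute.

    Falls back to 'transitional' when the attribute is missing.
    """
    root = ET.fromstring(presentation_xml)
    return root.get("conformance", "transitional")
```

The argument is the content of ppt/presentation.xml after unpacking the pptx file.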

> Also, can you provide an example of
> the output you describe in the previous comment regarding the leftover
> please? Perhaps a screenshot between expected and actual?

It is not about the screen but about the files. The pptx format is a simple zip container. So unpack the three attached files, then open the file 'presentation.xml' from the subfolder 'ppt' of each in a text editor and compare them.

> 
> Also, how common is this issue?

I don't know. A private user will usually not use the 'strict' format, because it is not the default format. I don't know whether there are enterprises or authorities that enforce the use of the 'strict' format.
Comment 8 Aron Budea 2018-11-05 05:25:57 UTC
The source of this bug is that when the code is checking for custom XMLs, it checks the content of "xmlns" attributes, and if they're not "http://schemas.openxmlformats.org", they're treated as a custom XML, and the content is supposed to be preserved. This doesn't go well when the XML is also part of the file format.

In this PPTX file those attributes use the OOXML strict namespaces instead, for example "http://purl.oclc.org/ooxml/drawingml/main" (purl.oclc.org is a persistent-URL service).

I assume XMLs that have override entries in "[Content_Types].xml" should never be treated as possible custom XMLs.
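
The check described in this comment can be sketched as follows. This is a hypothetical Python illustration, not the actual LibreOffice code (which is C++); the constant and function names are assumptions, and the strict prefix is taken from the namespace quoted above:

```python
TRANSITIONAL_PREFIX = "http://schemas.openxmlformats.org"
STRICT_PREFIX = "http://purl.oclc.org/ooxml"  # assumed common prefix of strict namespaces


def is_custom_namespace(uri, known_prefixes):
    """Treat a namespace URI as custom unless it starts with a known prefix."""
    return not uri.startswith(tuple(known_prefixes))
```

With only `TRANSITIONAL_PREFIX` in `known_prefixes`, the strict DrawingML namespace is classified as custom; adding `STRICT_PREFIX` classifies it as part of the file format.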
Comment 9 QA Administrators 2019-11-06 03:31:48 UTC Comment hidden (obsolete)
Comment 10 Regina Henschel 2019-11-06 12:13:10 UTC
The problem still exists in Version: 6.4.0.0.alpha1+ (x64)
Build ID: 7c6226bee72805db7f0e567ca9f06c786a7d0da2
CPU threads: 8; OS: Windows 10.0 Build 18362; UI render: default; VCL: win; 
Locale: de-DE (en_US); UI-Language: en-US
Calc: threaded
Comment 11 Commit Notification 2020-04-29 05:07:01 UTC
Samuel Mehrbrodt committed a patch related to this issue.
It has been pushed to "master":

https://git.libreoffice.org/core/commit/9be543a27ab18427a1c4e66a70cc49b0332b6aa1

tdf#119087 Don't treat OOXML strict namespace as custom XML

It will be available in 7.0.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
https://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 12 Commit Notification 2020-04-29 07:54:16 UTC
Samuel Mehrbrodt committed a patch related to this issue.
It has been pushed to "libreoffice-6-4":

https://git.libreoffice.org/core/commit/68f75fe0701fcf9b92c5f1b5fd5eeb9268297494

tdf#119087 Don't treat OOXML strict namespace as custom XML

It will be available in 6.4.4.

Comment 13 Xisco Faulí 2020-04-29 08:27:28 UTC
Verified in

Version: 7.0.0.0.alpha0+
Build ID: cf36fe5eb41910c26d58fb25e54ccf2e0ee01365
CPU threads: 4; OS: Linux 4.19; UI render: default; VCL: gtk3; 
Locale: en-US (en_US.UTF-8); UI-Language: en-US
Calc: threaded

@Samuel, thanks for fixing this issue!