Bug 86020 - OOXML: not importing un-documented PPT / binary XML records for characters spacing
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Impress
Version: Inherited From OOo (earliest affected)
Hardware: Other All
Importance: medium enhancement
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-07 20:59 UTC by Andras Timar
Modified: 2014-11-07 21:07 UTC

See Also:
Crash report or crash signature:


Attachments
bugdoc (PPT) (122.50 KB, application/powerpoint)
2014-11-07 20:59 UTC, Andras Timar

Description Andras Timar 2014-11-07 20:59:37 UTC
Created attachment 109117
bugdoc (PPT)

Open the attached .ppt file. On page 3, all of the character spacing information is missing. The problem is that we don't import the expanded / condensed state.

Running our ppt dumper tool on two variants of a minimal sample document, with and without character spacing applied in PowerPoint 2013, shows that the character spacing is implemented through embedded OOXML in the .ppt. This is a feature of newer .ppt variants (since MSO 2007) and something that, as far as I can see, neither OpenOffice.org nor LibreOffice supports.
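For illustration only, here is a minimal Python sketch of how such an embedded package could be unpacked once the record payload is in hand. It assumes the payload contains a plain ZIP archive that runs to the end of the data, possibly preceded by a few record-specific bytes; that layout is my assumption, not something taken from documentation. The helper names and the sample payload are hypothetical, and the sample is synthetic rather than bytes from the bugdoc.

```python
import io
import zipfile

ZIP_LOCAL_HEADER = b"PK\x03\x04"  # signature that starts every ZIP local file header


def list_embedded_package(payload: bytes) -> dict:
    """Return {member name: content} for a ZIP package found inside payload.

    Assumption: the package is a plain ZIP archive that runs to the end of
    payload, possibly preceded by a few bytes of record-specific data.
    """
    start = payload.find(ZIP_LOCAL_HEADER)
    if start < 0:
        raise ValueError("no ZIP local file header found in payload")
    with zipfile.ZipFile(io.BytesIO(payload[start:])) as package:
        return {name: package.read(name) for name in package.namelist()}


def make_sample_payload() -> bytes:
    """Build a stand-in payload; synthetic data, not bytes from the bugdoc."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("drs/shapexml.xml", "<sp/>")
    # Four made-up leading bytes stand in for whatever precedes the archive.
    return b"\x00\x00\x00\x00" + buffer.getvalue()


if __name__ == "__main__":
    for name in list_embedded_package(make_sample_payload()):
        print(name)
```

An importer would then hand drs/shapexml.xml to the existing OOXML shape import, if that turns out to be feasible.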

The ppt dumper displays for a trivial document with a "very tight" character spacing:

     ====================================================================
     [DFF_msofbtUDefProp]
     (type: F122h (61730) inst: 0001h (1), vers: 0003h, start: 86, size: 1850)
     ====================================================================

     F122h: -------------------------------------------------------------
     F122h: Zipped content:
     F122h: 
     F122h: [Content_Types].xml:
     F122h: -------------------------------------------------------------
     F122h: <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
     F122h: <Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
     F122h:  <Default Extension="rels"
     F122h:           ContentType="application/vnd.openxmlformats-package.relationships+xml">
     F122h:  <Default Extension="xml" ContentType="application/xml">
     F122h:  <Override PartName="/drs/shapexml.xml"
     F122h:            ContentType="application/vnd.ms-office.DrsShape+xml">
     F122h:  <Override PartName="/drs/downrev.xml"
     F122h:            ContentType="application/vnd.ms-office.DrsDownRev+xml">
     F122h: </Types>
     F122h: -------------------------------------------------------------
     F122h: 

[...]

     F122h: drs/shapexml.xml:
     F122h: -------------------------------------------------------------
     F122h: <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
     F122h: <p xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main">
     F122h:  <p>
     F122h:   <p id="4" name="TextBox 3">
     F122h:   <p txBox="1">
     F122h:   <p>
     F122h:  </p>

[...]

     F122h:    <a>
     F122h:     <a kumimoji="0" lang="fi-FI" sz="1800" b="0" i="0" u="none"
     F122h:        strike="noStrike" kern="1200" cap="none" spc="-300" normalizeH="0"
     F122h:        baseline="0" noProof="0" dirty="0" smtClean="0">
     F122h:      <a>
     F122h:       <a>
     F122h:      </a>

[...]

where the spc="-300" presumably corresponds to the "very tight" character spacing. In an otherwise identical document with no character spacing applied, this whole embedded OOXML-ish record is not present.
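In DrawingML the spc attribute is expressed in hundredths of a point, so -300 would mean the text is condensed by 3 pt. A small Python sketch of that conversion follows. The element name a:rPr is an assumption on my part (the element names are not readable in the dump above), and spacing_in_points is a hypothetical helper, not existing import code.

```python
import xml.etree.ElementTree as ET

DRAWINGML_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def spacing_in_points(run_properties_xml: str) -> float:
    """Return the character spacing of a run-properties element in points.

    Returns 0.0 when no spc attribute is present.
    """
    element = ET.fromstring(run_properties_xml)
    # spc is stored in hundredths of a point; negative values mean condensed.
    return int(element.get("spc", "0")) / 100.0


# Assumed element name (a:rPr); attribute values copied from the dump.
sample = f'<a:rPr xmlns:a="{DRAWINGML_NS}" lang="fi-FI" sz="1800" spc="-300"/>'
print(spacing_in_points(sample))  # prints -3.0
```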

Possibly the MS symbolic name for 0xf122 is OfficeArtTertiaryFOPT, defined in http://msdn.microsoft.com/en-us/library/dd950206(v=office.12).aspx , part of something called "Office Drawing Binary Format", which can also be included in PPT, as stated on http://msdn.microsoft.com/en-us/library/dd910075(v=office.12).aspx . But I haven't been able to find any documentation for this embedded OOXML stuff.
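For reference, the type / inst / vers / size values printed by the dumper correspond to the 8-byte header that precedes each Office Drawing record: a 16-bit field holding the version in its low 4 bits and the instance in its high 12 bits, a 16-bit record type, and a 32-bit length, all little-endian. A minimal Python sketch of decoding it is below. The sample header is rebuilt from the dump values rather than read from the bugdoc, it assumes the dumper's "size" is the header's length field, and the names are hypothetical.

```python
import struct
from typing import NamedTuple


class RecordHeader(NamedTuple):
    version: int
    instance: int
    record_type: int
    length: int


def parse_record_header(data: bytes) -> RecordHeader:
    """Decode the 8-byte little-endian header at the start of a record."""
    ver_inst, record_type, length = struct.unpack_from("<HHI", data, 0)
    # Low 4 bits carry the version, the remaining 12 bits the instance.
    return RecordHeader(ver_inst & 0x000F, ver_inst >> 4, record_type, length)


# Header rebuilt from the dump values: vers 3, inst 1, type F122h, size 1850.
sample_header = struct.pack("<HHI", (0x0001 << 4) | 0x0003, 0xF122, 1850)
header = parse_record_header(sample_header)
print(hex(header.record_type), header.instance, header.version, header.length)
# prints: 0xf122 1 3 1850
```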
Comment 1 Joel Madero 2014-11-07 21:07:01 UTC
@Andras - just confirming this as you clearly know what you're talking about :)