A friend of mine who uses Windows recently asked me to convert an Apple Pages document to Word format, because a friend of theirs who uses Apple software had sent it in Apple Pages format, which nothing outside of Apple/Google can view. Luckily the format includes an embedded PDF file, which worked for this use case, but it would be nice if LibreOffice could natively open the Apple Pages format and save it in other formats.

There is a blog post about the format here: http://xorglog.blogspot.com/2009/05/how-to-edit-mac-os-pages-documents-in.html

Basically, this is the list of files inside:

buildVersionHistory.plist
index.xml (document)
QuickLook
QuickLook/Thumbnail.jpg (JPEG rendering of document)
QuickLook/Preview.pdf (PDF rendering of document)
thumbs
thumbs/PageCapThumbV2-1.tiff (weird TIFF image, not document)

There are some samples hidden behind a login (use bugmenot.com) here: https://www.stocklayouts.com/Templates/Free-Templates/Free-Sample-Apple-iWork-Pages-Template-Design.aspx

There are probably some more samples here: http://www.brighthub.com/computing/mac-platform/articles/109380.aspx
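For anyone who only needs the PDF workaround mentioned above, here is a minimal Python sketch. It assumes the .pages file is an ordinary zip archive containing the QuickLook/Preview.pdf member from the file list; the function name `extract_preview` is made up for illustration, and documents saved without a preview will simply return False.

```python
import zipfile
from pathlib import Path

# Member path taken from the file list in this report.
PREVIEW_MEMBER = "QuickLook/Preview.pdf"


def extract_preview(pages_path, out_path):
    """Copy the embedded PDF preview out of a .pages file.

    Assumption: the .pages file is a zip archive. Returns True if the
    preview was found and written, False if the archive has no preview.
    """
    with zipfile.ZipFile(pages_path) as archive:
        if PREVIEW_MEMBER not in archive.namelist():
            return False
        Path(out_path).write_bytes(archive.read(PREVIEW_MEMBER))
        return True
```

Usage would be something like `extract_preview("report.pages", "report.pdf")`. This only recovers a rendering of the document, not editable text, so it is no substitute for a real import filter.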
Yes, this is really needed, as there is almost no program available that can read or convert this format on Windows or Linux. There is already rudimentary support for reading Keynote documents in the libetonyek library, so it might make sense (assuming the formats have a similar structure) to either extend this library or make a new one using libetonyek as a base. As we generally need to reverse engineer the format, it would be helpful to attach simple example documents made with iWork Pages for those who want to look at the format and don't have access to OSX (for example, a simple document with one or two paragraphs, a table and a picture would be a good start).
libetonyek master has already got BIPU support for Pages (and Numbers too, but it just detects the format). The filter is on the GSoC ideas list.
I don't have access to OSX so I can't attach any documents but you can find many samples on the Internet, I linked to some in the initial report. The format is pretty simple and XML so I wouldn't say reverse engineering would be needed. There is a project to convert Apple Pages to epub here: https://github.com/immateriel/pages2epub/ There is a project to modify Keynote files here: https://github.com/undees/snippetize
(In reply to comment #3)
> The format is pretty simple

You only think that because you have never seen a document with complex layout, charts, tables and other objects in it.

> and XML so I wouldn't say reverse engineering
> would be needed.

You would be wrong. That the format is XML does not mean it is immediately obvious what the elements/attributes and their values mean. And how they work together. E.g., the parameters for parametrized shapes (stars, arrows, quote bubbles, etc.) are saved as opaque numeric values. Or take tables. Most of the table-related elements and attributes are 1-3 letter abbreviations. What is even "better": only non-empty cells are actually saved. The position of the next filled cell (i.e., row and column) is saved, inventively, using a single attribute. (Of course, first you need to discover that this attribute--which is called sf:ct--has anything to do with the placement of cells in the grid...)
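To get a feel for how many of these abbreviated names a given document uses, a rough survey script can help as a first reverse-engineering step. The sketch below is only that: it tallies element and attribute names in an XML file such as index.xml, with namespace prefixes stripped. It does not decode anything, in particular not the sf:ct encoding, and the function names `local_name` and `tally_names` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(qualified):
    """Strip the '{namespace-uri}' prefix that ElementTree puts on names."""
    return qualified.rsplit("}", 1)[-1]


def tally_names(xml_source):
    """Count element and attribute names in an XML file or file-like object.

    Returns two Counters: (element names, attribute names).
    """
    elements, attributes = Counter(), Counter()
    # Attributes are already available on "start" events, so the whole
    # tree never has to be inspected after parsing.
    for _, node in ET.iterparse(xml_source, events=("start",)):
        elements[local_name(node.tag)] += 1
        for name in node.attrib:
            attributes[local_name(name)] += 1
    return elements, attributes
```

Calling `tally_names("index.xml")` and printing `attributes.most_common(20)` gives a quick list of the names that are worth investigating first. Comparing the tallies of two documents that differ in a single table cell is one possible way to narrow down which attributes carry cell placement.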
Is this a duplicate of Bug 35361? Could anybody double-check?
(In reply to comment #5)
> is this a duplicate of Bug 35361?

No, it is not.
Initial support for Pages <= 4 has been added.