Bug 40067 - Calc does not preserve whitespace in strings in *.xlsx FILESAVE
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.4.2 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Robin Kumar
URL:
Whiteboard: target:4.4.0
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-13 22:34 UTC by puhv
Modified: 2014-06-18 03:43 UTC

See Also:
Crash report or crash signature:


Attachments
ODF-spreadsheet with leading/trailing whitespace; initially created *.xlsx; manually corrected *.xlsx (22.83 KB, application/x-zip-compressed)
2011-08-13 22:34 UTC, puhv

Description puhv 2011-08-13 22:34:46 UTC
Created attachment 50188
ODF-spreadsheet with leading/trailing whitespace; initially created *.xlsx; manually corrected *.xlsx

When saving a spreadsheet as an *.xlsx file, LibreOffice 3.4.2 Calc does not apply the necessary attributes to strings that begin or end with whitespace. As a result, the leading or trailing space characters are trimmed away when the *.xlsx file is opened.

The following are steps to reproduce this problem.

First, extract the contents of the attached "whitespace.zip" archive.

Then in Calc (in new empty LibO Spreadsheet document), use menu File -> Open (LibO dialog) - file type "Spreadsheets" -> select the extracted "normal.ods" document -> double-click.

The first row in the spreadsheet "Sheet1" contains six cells (A1 .. F1) with example strings. Note that cell A1 contains the text "aaa" followed by 25 space characters, cell C1 contains 25 spaces followed by "ccc", and cell E1 contains just 25 space characters.

Now save the document: use menu File -> Save As... (LibO dialog) - choose file type either "Microsoft Excel 2007 XML (*.xlsx)" or "Office Open XML Spreadsheet (*.xlsx)" -> type some filename (e.g. "initial.xlsx") -> click Save, click "Keep Current Format" (if asked). Close Calc.

Depending on which of the file types was chosen, the resulting file will be essentially the same as either "initial-2007.xlsx" or "initial-ooxml.xlsx" (see files extracted from the attached archive).

Now open Calc again and use menu File -> Open (LibO dialog) - file type "All files" -> select the saved "initial.xlsx" document -> double-click.

Check the contents of the cells A1 and C1 now. All the whitespace got trimmed away. The new strings are just "aaa" and "ccc" respectively. The cell E1, which previously contained space characters only, is now blank (no text at all).

To temporarily work around this problem, i.e. to prevent losing this whitespace after saving "normal.ods" as *.xlsx, I manually edited the part named "sharedStrings.xml" in the folder named "xl" in the initially saved files and added the attribute xml:space="preserve" to the t-element of each of the three affected strings (so that the tag <t> becomes <t xml:space="preserve">). See the extracted files "manually-corrected-2007.xlsx" and "manually-corrected-ooxml.xlsx" from the attached archive.
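
The manual edit described above could also be scripted. The following is only a rough sketch of the same workaround, not how Calc itself should be fixed. The function names add_space_preserve and patch_xlsx are made up for illustration, and the sketch assumes that the strings sit in plain <t> elements of "xl/sharedStrings.xml" in the standard SpreadsheetML namespace.

```python
import xml.etree.ElementTree as ET
import zipfile

# Assumption: the shared strings part uses the standard SpreadsheetML namespace.
SSML_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
# ElementTree's spelling of the xml:space attribute.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def add_space_preserve(shared_strings_xml: bytes) -> bytes:
    """Hypothetical helper: mark every <t> with edge whitespace as xml:space="preserve"."""
    # Keep SpreadsheetML as the default namespace when serializing again.
    ET.register_namespace("", SSML_NS)
    root = ET.fromstring(shared_strings_xml)
    for t in root.iter(f"{{{SSML_NS}}}t"):
        text = t.text or ""
        # Only strings that begin or end with whitespace need the attribute.
        if text != text.strip():
            t.set(XML_SPACE, "preserve")
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def patch_xlsx(src_path, dst_path):
    """Hypothetical helper: copy an *.xlsx archive, patching only xl/sharedStrings.xml."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "xl/sharedStrings.xml":
                data = add_space_preserve(data)
            dst.writestr(item, data)
```

For example, patch_xlsx("initial.xlsx", "corrected.xlsx") would write a copy in which the affected t-elements carry the attribute. The output file name here is arbitrary.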

Use menu File -> Open (LibO dialog) - file type "All files" -> select the extracted "manually-corrected-2007.xlsx" document -> double-click to open such a corrected file.

Note that the contents of the first row are now exactly the same as they were in the extracted "normal.ods" document (all whitespace is preserved). Calc has respected the xml:space attribute when opening the file.

If we save the corrected *.xlsx document as an *.ods (i.e. in the Open Document Format for Spreadsheets) the space characters will remain preserved. If, however, we instead re-save it in one of the *.xlsx formats then the xml:space attribute will not be retained while saving, so these spaces will be stripped again after opening.
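
To check whether a given *.xlsx file is affected without opening it in Calc, something like the following sketch might help. It lists the shared strings that begin or end with whitespace but lack xml:space="preserve". The function name unprotected_strings is hypothetical, and the sketch again assumes the strings are stored in <t> elements of "xl/sharedStrings.xml".

```python
import xml.etree.ElementTree as ET
import zipfile

# Assumption: the shared strings part uses the standard SpreadsheetML namespace.
SSML_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
# ElementTree's spelling of the xml:space attribute.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def unprotected_strings(xlsx_file):
    """Hypothetical helper: return strings with edge whitespace and no xml:space="preserve"."""
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    found = []
    for t in root.iter(f"{{{SSML_NS}}}t"):
        text = t.text or ""
        # A string is at risk if it has leading/trailing whitespace
        # and its <t> element does not ask for that whitespace to be kept.
        if text != text.strip() and t.get(XML_SPACE) != "preserve":
            found.append(text)
    return found
```

An empty result for a file saved by Calc would suggest that the attribute is being written for every string that needs it.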
Comment 1 sasha.libreoffice 2012-04-11 05:06:54 UTC
Reproduced in 3.5.2 on Fedora 64 bit. Calc saves to xlsx trimmed.
(opens xlsx with spaces correctly)
PS: msOffice 2007 saves to ods preserving spaces
Comment 2 Andreas Jansson 2013-05-07 09:57:37 UTC
Reproduced in LibreOffice version 4.0.2.2.
Comment 3 Commit Notification 2014-06-17 13:26:53 UTC
Robin Kumar committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=750269ee6ecc7456b005c5fafd096c47c4ecd02e

fdo#40067: Fix for importing white space in strings (XLSX).



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.