Bug 149477 - XLSX parser/generator ignores SpreadsheetML string encoding
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 3.3.0 release (earliest affected)
Hardware: All All
Importance: medium normal
Assignee: Not Assigned
Whiteboard: QA:needsComment
Keywords: filter:xlsx
Depends on:
Blocks: XLSX
Reported: 2022-06-07 12:00 UTC by Daniel Rentz
Modified: 2022-08-26 19:14 UTC
CC List: 3 users

See Also:
Crash report or crash signature:
Regression By:

test file for import (40.05 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2022-06-07 12:01 UTC, Daniel Rentz

Description Daniel Rentz 2022-06-07 12:00:10 UTC
Follow-up for bug 118470

SpreadsheetML parts encode most ASCII control characters (x00-x08 and x0a-x1f) as "_xhhhh_" hex sequences (e.g. "\x1f" => "_x001F_"). Additionally, if such a hex sequence appears literally in a string, it is escaped by encoding its leading underscore character, e.g. "_x001f_" => "_x005F_x001f_". This applies to almost all strings that can appear in an XLSX file (workbook.xml, sheet.xml, sharedStrings.xml, table.xml, etc.).

LO Calc needs to decode all these strings when loading XLSX, and furthermore *needs to encode* all strings when writing XLSX.
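
For illustration only, here is a minimal Python sketch of the encoding and decoding described above. It is not LibreOffice code: the function names `encode_sml_string` and `decode_sml_string` are hypothetical, and it assumes that the hex digits are matched case-insensitively while the "x" must be lowercase. The rules are taken from this description alone, so the exact behaviour of Excel may differ in edge cases.

```python
import re

# Matches either the leading underscore of a literal "_xhhhh_" look-alike
# (the lookahead does not consume the rest) or one of the control
# characters named in this report (x00-x08 and x0a-x1f).
_NEEDS_ENCODING = re.compile(r"_(?=x[0-9A-Fa-f]{4}_)|[\x00-\x08\x0a-\x1f]")

# Matches one complete "_xhhhh_" sequence and captures the four hex digits.
_ENCODED = re.compile(r"_x([0-9A-Fa-f]{4})_")


def encode_sml_string(text: str) -> str:
    """Encode a string for writing (hypothetical helper).

    Every matched character, control character or leading underscore,
    becomes "_xHHHH_" with its own code point, so "_" turns into "_x005F_".
    """
    return _NEEDS_ENCODING.sub(lambda m: "_x%04X_" % ord(m.group(0)), text)


def decode_sml_string(text: str) -> str:
    """Decode a string after loading (hypothetical helper).

    The substitution runs once, left to right, and never rescans its own
    output, so "_x005F_x001f_" yields the literal text "_x001f_".
    """
    return _ENCODED.sub(lambda m: chr(int(m.group(1), 16)), text)
```

With these helpers, `encode_sml_string("\x1f")` gives "_x001F_" and `encode_sml_string("_x001f_")` gives "_x005F_x001f_", matching the two examples above, and `decode_sml_string` reverses both. Decoding in a single pass matters: decoding twice would turn the escaped literal into a control character.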

The following contents are affected (among others):
- cell content string (xl/sharedStrings.xml)
- cell formula (xl/worksheets/sheetN.xml)
- sheet name (xl/workbook.xml)
- cell style name, font name, number format code (xl/styles.xml)
- cell hyperlink ("#location" part)
- named ranges (name + formula)
- tables (xl/tables/table1.xml)
- auto-filter/table filter: filter entries
- data validation: formulas, string list, error title/text, prompt title/text
- conditional formatting: formulas (comparison, color steps, databar min/max, iconset steps), text rules
- cell notes (xl/commentsN.xml)
- threaded comments (xl/threadedComments/threadedCommentN.xml)
Comment 1 Daniel Rentz 2022-06-07 12:01:19 UTC
Created attachment 180623
test file for import