Bug 82644 - FILESAVE Writes invalid UTF-16 CSV file, uses single byte tabs
Summary: FILESAVE Writes invalid UTF-16 CSV file, uses single byte tabs
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
Version: 4.3.0.4 release (earliest affected)
Hardware: All All
Importance: medium major
Assignee: David Tardon
URL:
Whiteboard: target:4.4.0 target:4.3.1
Keywords:
Depends on:
Blocks:
 
Reported: 2014-08-15 03:15 UTC by Dan Weiss
Modified: 2014-08-15 19:38 UTC

Attachments
Tiny example file which demonstrates the bug if opened and resaved. (42 bytes, text/plain)
2014-08-15 03:15 UTC, Dan Weiss

Description Dan Weiss 2014-08-15 03:15:45 UTC
Created attachment 104648
Tiny example file which demonstrates the bug if opened and resaved.

LibreOffice Calc writes an invalid UTF-16 CSV file that looks like garbage in any text editor. The cause is that tabs are written as single '09' bytes instead of 16-bit '09 00' characters.
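
To illustrate the byte layout described above, here is a minimal Python sketch. It assumes little-endian UTF-16 with a BOM, and the sample string "a<TAB>b" is hypothetical, not the content of the attached file. It only imitates the symptom and says nothing about how Calc produces it internally.

```python
import codecs

text = "a\tb"  # hypothetical sample row, not the attached file

# Valid UTF-16LE with BOM: every code unit is two bytes, so the tab is 09 00.
good = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Imitation of the reported symptom: the tab is a single 09 byte.
bad = (codecs.BOM_UTF16_LE
       + "a".encode("utf-16-le")
       + b"\x09"
       + "b".encode("utf-16-le"))

print(good.hex(" "))  # ff fe 61 00 09 00 62 00
print(bad.hex(" "))   # ff fe 61 00 09 62 00

# After the single-byte tab, every following code unit is shifted by one
# byte, so a decoder pairs the wrong bytes and the text no longer matches.
decoded = bad.decode("utf-16", errors="replace")
print(repr(decoded))
```

The odd byte count after the BOM is what pushes the rest of the file out of alignment, which is why editors show garbage rather than just a missing tab.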

Steps to reproduce:

1. Create a tab-separated text file encoded in UTF-16 with BOM.
2. Open it in Calc.
3. Edit some text.
4. Save the file.

Result: the file is now corrupt. Tabs are written as single 09 bytes instead of 16-bit characters, even though the rest of the text is saved properly. The file can still be repaired with a hex editor by replacing each 09 byte with 09 00.
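
The hex-editor repair can also be scripted. The following is a heuristic Python sketch, not a tool from this report: the function name repair_utf16le is hypothetical, and it assumes little-endian UTF-16. It walks the data one code unit at a time and widens a control byte (09, 0A, or 0D) whenever the byte after it is not 00. It will misfire on legitimate characters whose low byte is one of those values (for example U+6209, stored as 09 62), and it cannot detect a stray byte that happens to be followed by 00, so it is only suitable for mostly-ASCII content.

```python
def repair_utf16le(data: bytes) -> bytes:
    """Heuristically widen single-byte TAB/LF/CR in UTF-16LE data."""
    control = {0x09, 0x0A, 0x0D}
    out = bytearray()
    i = 0
    # Copy the little-endian BOM unchanged, if present.
    if data[:2] == b"\xff\xfe":
        out += data[:2]
        i = 2
    while i < len(data):
        b = data[i]
        nxt = data[i + 1] if i + 1 < len(data) else None
        if b in control and nxt != 0x00:
            # Stray single-byte control character: emit it as 16 bits.
            out += bytes((b, 0x00))
            i += 1
        else:
            # Ordinary code unit: copy both bytes as they are.
            out += data[i:i + 2]
            i += 2
    return bytes(out)


# Hypothetical corrupt input: "a<TAB>b<CR><LF>" with single-byte controls.
broken = b"\xff\xfe" + b"a\x00" + b"\x09" + b"b\x00" + b"\x0d\x0a"
fixed = repair_utf16le(broken)
print(repr(fixed.decode("utf-16")))  # 'a\tb\r\n'
```

Working on whole code units rather than doing a blind search-and-replace of 09 with 09 00 avoids doubling tabs that were already written correctly.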
Comment 1 Dan Weiss 2014-08-15 04:49:55 UTC
One more detail: tabs are not the only control characters written incorrectly this way. Sometimes a carriage return or line feed is also written as a single byte.
Comment 2 Commit Notification 2014-08-15 18:30:09 UTC
David Tardon committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=92240691d1c11a003474a322596fcd1ac3513eb5

fdo#82644 write sal_Unicode chars as Unicode



The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.
Comment 3 Commit Notification 2014-08-15 18:35:03 UTC
David Tardon committed a patch related to this issue.
It has been pushed to "libreoffice-4-3":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=694766652971c7fa26becd495925dd7ba0ddee7d&h=libreoffice-4-3

fdo#82644 write sal_Unicode chars as Unicode


It will be available in LibreOffice 4.3.2.

Comment 4 Commit Notification 2014-08-15 19:38:39 UTC
David Tardon committed a patch related to this issue.
It has been pushed to "libreoffice-4-3-1":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=cef219d07557bfeff256e78c583a20912d3d209f&h=libreoffice-4-3-1

fdo#82644 write sal_Unicode chars as Unicode


It will already be available in LibreOffice 4.3.1.
