Bug 128234 - Hyphen replaced by invalid characters in Calc
Summary: Hyphen replaced by invalid characters in Calc
Status: CLOSED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: unspecified (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
Reported: 2019-10-18 16:04 UTC by mike.jeays@rogers.com
Modified: 2019-10-18 17:25 UTC (History)
0 users

Description mike.jeays@rogers.com 2019-10-18 16:04:16 UTC
I have a test Calc sheet with a date (2019-10-18) in column A and a text field (Test field - with hyphen) in column B. When I export the sheet as a CSV file, the hyphen in the text field (but not the ones in the date) is replaced by three characters: b, nul, and dc3. All three hyphens were entered with the normal hyphen key, but the one in the text field looks different.

europa 2004 ~ $ od -a Test.csv
0000000   2   0   1   9   -   1   0   -   1   8   ,   T   e   s   t  sp
0000020   f   i   e   l   d  sp   b nul dc3  sp   w   i   t   h  sp   h
0000040   y   p   h   e   n  nl
0000046
europa 2005 ~

europa 2006 ~ $ cat Test.csv
2019-10-18,Test field – with hyphen
europa 2007 ~ $
Comment 1 Eike Rathke 2019-10-18 17:25:01 UTC
That is not a U+002D HYPHEN-MINUS but a U+2013 EN DASH, exported in UTF-8 text encoding. Not a bug.

You may have had an autocorrect replacement active when entering the data.

Using od -a is a poor choice here because it ignores the high-order bit of each byte. Better would be od -t x1c Test.csv, which shows that the actual bytes written are 0xe2 0x80 0x93, the UTF-8 encoding of EN DASH.
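
For illustration, a minimal Python sketch of why the od -a output in the description shows b, nul, and dc3. It encodes U+2013 as UTF-8 and then clears the high-order bit of each byte, which is the behaviour described above. The helper od_a_view and the small CONTROL_NAMES table are hypothetical names used only for this example; they cover just the bytes involved here and are not part of od or LibreOffice.

```python
# U+2013 EN DASH, the character that was actually stored in the cell.
en_dash = "\u2013"

# UTF-8 encodes it as three bytes, each with the high-order bit set.
utf8_bytes = en_dash.encode("utf-8")
hex_view = " ".join(f"0x{b:02x}" for b in utf8_bytes)

# Names printed for the two control characters that appear in this case
# (hypothetical lookup table, deliberately incomplete).
CONTROL_NAMES = {0x00: "nul", 0x13: "dc3"}


def od_a_view(data):
    """Approximate od -a for these bytes: drop the high bit, then name the result."""
    out = []
    for b in data:
        low7 = b & 0x7F  # ignore the high-order bit
        out.append(CONTROL_NAMES.get(low7, chr(low7)))
    return out


masked_view = od_a_view(utf8_bytes)

print(hex_view)     # 0xe2 0x80 0x93
print(masked_view)  # ['b', 'nul', 'dc3']
```

By contrast, the hyphens in the date are U+002D, which encodes to the single byte 0x2d and is unaffected by clearing the high bit, so they display as expected in both views.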