Bug 87836 - In Calc, Opening a {Tab} separated fields file (.tsv or .csv) brings it in as Oriental text of some type
Status: RESOLVED NOTABUG
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: LibreOffice
Version: 4.3.5.2 release (earliest affected)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-29 17:52 UTC by Henry
Modified: 2021-01-19 13:20 UTC (History)

See Also:
Crash report or crash signature:


Description Henry 2014-12-29 17:52:36 UTC
A spreadsheet file with tab separated variables/fields, e.g.
$ od -a tt.csv 
0000000   1  ht   2  nl   3  ht   4  nl
when opened in Calc 4.3.5.2, shows up as just one cell, A1, displayed in the Lohit Hindi font.

Calc Tools -> Options -> Language Settings -> Languages
shows everything as Default - English (USA)

This language shift problem doesn't occur if I use the .ods file type.
Comment 1 Henry 2014-12-29 18:06:57 UTC
Arrgh! 

The import encoding was set to Unicode (UTF-16), apparently as the installation default.

I changed it to UTF-8 and now all is well.
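
For anyone who lands here with the same symptom, here is a minimal sketch of why the wrong encoding produces foreign-looking text. It decodes the eight bytes from the `od -a` dump in the description three ways. The helper name `code_points` is only for illustration, and the link to the Lohit Hindi font is an assumption: read as little-endian UTF-16, the bytes land in the Devanagari and Gurmukhi blocks (U+0900 to U+097F and U+0A00 to U+0A7F), which would plausibly make Calc fall back to an Indic font. Read as big-endian, they land in CJK-related blocks instead.

```python
# The eight bytes shown by `od -a` in the description: 1 TAB 2 NL 3 TAB 4 NL
data = b"1\t2\n3\t4\n"

# Correct reading: UTF-8 (identical to ASCII for these bytes)
as_utf8 = data.decode("utf-8")

# Wrong reading: each pair of bytes is fused into one 16-bit code unit,
# so the tab and newline separators disappear into the characters.
as_utf16_le = data.decode("utf-16-le")
as_utf16_be = data.decode("utf-16-be")


def code_points(text):
    """Illustrative helper: list the code points of a string as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(as_utf8))      # eight ASCII code points
print(code_points(as_utf16_le))  # four code points in Indic blocks
print(code_points(as_utf16_be))  # four code points in CJK-related blocks
```

Because the separators are swallowed, the whole file ends up as one short string with no field or row breaks, which matches everything arriving in a single cell.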
Comment 2 A (Andy) 2014-12-29 19:39:54 UTC
Can we therefore close this issue?
Comment 3 Henry 2014-12-29 19:59:48 UTC
Yes. I think it's settled - but people should be aware that installation can be set for UTF-16 (unusual in my experience) and they need to change to UTF-8.
Comment 4 David Tardon 2015-01-09 09:21:18 UTC
UTF-16 is not an "installation default". In fact, in 4.3.5 the encoding defaults to UTF-8 even if system encoding is UTF-16 (Windows). The only way to make an encoding stick is to select it explicitly. The easiest way is not to touch the encoding selection at all and rely on detection--that works fine for UTF-16/UTF-8 anyway.
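
To illustrate what encoding detection between UTF-8 and UTF-16 can look like, here is a rough sketch. It is not LibreOffice's actual detection code, and the function name `guess_encoding` is hypothetical. It assumes the simplest approach: trust a byte order mark (BOM) if one is present, otherwise accept the file as UTF-8 if it decodes cleanly.

```python
import codecs
from typing import Optional


def guess_encoding(raw: bytes) -> Optional[str]:
    """Hypothetical helper: guess a Python codec name for raw file bytes."""
    # A UTF-8 BOM is unambiguous; "utf-8-sig" strips it while decoding.
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # A UTF-16 BOM tells the "utf-16" codec which byte order to use.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # No BOM: accept the data as UTF-8 only if it decodes without error.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return None  # undecided; a real importer would ask the user


print(guess_encoding(b"1\t2\n3\t4\n"))
print(guess_encoding("1\t2\n3\t4\n".encode("utf-16")))
```

A check like this cannot recognise UTF-16 text that has no BOM, since such bytes are often valid UTF-8 as well, so an explicitly selected encoding still has to take priority over any guess.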
Comment 5 Jonny Grant 2021-01-19 13:20:02 UTC
I still see this occasionally with v6.4.6.2.
I need to manually change back to UTF-8 when I see the Chinese text in the preview.


There's clearly a bug in the way it determines the file format.