Bug 164659 - ASCII-only characters in CSV cause detection of codepage 865/Nordic
Status: UNCONFIRMED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: filters and storage
Version (earliest affected): 25.8.0.0 alpha0+
Hardware: All All
Importance: medium minor
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: CSV
Reported: 2025-01-10 15:23 UTC by Eyal Rozenberg
Modified: 2025-01-11 22:57 UTC

See Also:
Crash report or crash signature:


Attachments
CSV which exhibits the problem when being loaded (27 bytes, text/csv)
2025-01-10 15:23 UTC, Eyal Rozenberg
Screenshot import options (37.42 KB, image/png)
2025-01-11 22:57 UTC, m_a_riosv
Description Eyal Rozenberg 2025-01-10 15:23:39 UTC
Created attachment 198470
CSV which exhibits the problem when being loaded

Consider the following CSV file:

cola,colb,colc
1,2,3
4,5,6

(also attached). When I try to open it in Calc, the import dialog that comes up suggests the code page "Western Europe (DOS/OS2-865/Nordic)".

I would expect one of "System", "Unicode (UTF-8)", or "Western Europe (ISO 8859-1)". Build:


Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 2305fe302e12c4256e452589e2533772d4213e59
CPU threads: 4; OS: Linux 6.6; UI render: default; VCL: gtk3
Locale: en-IL (en_IL); UI: en-US
Calc: threaded
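
For illustration only: the CSV content quoted in the description consists purely of ASCII bytes, so it decodes to the same text under every character set mentioned in this report, and no detector can tell them apart from the bytes alone. The sketch below assumes LF line endings (which gives the 27 bytes listed for the attachment) and uses Python's codec names `cp865`, `utf-8`, and `iso8859-1` as stand-ins for the entries in Calc's dialog. It says nothing about how LibreOffice itself performs detection.

```python
# Assumption: LF line endings, which yields 27 bytes, matching the
# size listed for the attachment in this report.
data = b"cola,colb,colc\n1,2,3\n4,5,6\n"

# Python codec names standing in for the character sets named in the report.
codecs_to_try = ["cp865", "utf-8", "iso8859-1"]


def decode_all(raw: bytes) -> dict[str, str]:
    """Decode the same bytes with every candidate codec."""
    return {name: raw.decode(name) for name in codecs_to_try}


decoded = decode_all(data)

# True when every byte is below 0x80, i.e. the file is pure ASCII.
is_ascii = all(b < 0x80 for b in data)

# True when all candidate codecs produce the identical string.
all_equal = len(set(decoded.values())) == 1

print(len(data), is_ascii, all_equal)  # prints: 27 True True
```

Because all three decodings agree, the choice among them for this file is a matter of default policy rather than of evidence in the data.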
Comment 1 BogdanB 2025-01-10 15:42:29 UTC
I get: Western Europe (ISO 8859-1)
Version: 25.2.0.1 (X86_64) / LibreOffice Community
Build ID: ddb2a7ea3a8857aae619555f1a8743e430e146c9
CPU threads: 16; OS: Linux 6.8; UI render: default; VCL: gtk3
Locale: ro-RO (ro_RO.UTF-8); UI: en-US
Calc: threaded


The same in
Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 8b2f6b4fba6c466ed399f4f4b80e9631f13a6232
CPU threads: 16; OS: Linux 6.8; UI render: default; VCL: gtk3
Locale: ro-RO (ro_RO.UTF-8); UI: en-US
Calc: threaded
Comment 2 m_a_riosv 2025-01-11 01:01:24 UTC
I think Calc does not select the code page, but uses the last one used.

Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: 89c00618b9cee6e786fd11a7fdbf7aaf24e4fbb7
CPU threads: 16; OS: Windows 11 X86_64 (build 26100); UI render: Skia/Vulkan; VCL: win
Locale: en-US (es_ES); UI: en-US
Calc: CL threaded
Comment 3 Eyal Rozenberg 2025-01-11 09:08:25 UTC
(In reply to m_a_riosv from comment #2)
> I think Calc does not select the code page, but uses the last one used.

No, it does seem to apply some detection logic. If I put some UTF-8-encoded Hebrew characters into the file, Calc picks up UTF-8.

Also, even if that were the case - I've never opened any CSV files with Nordic text, so it was wrong the first time it detected that...
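
As a rough sketch of the distinction described in this comment: UTF-8-encoded Hebrew text contains bytes at or above 0x80 that form valid UTF-8 sequences, which gives a detector something to work with, while a pure-ASCII file gives it nothing. The `classify` function below is a hypothetical helper written for this illustration; it is not LibreOffice's detection code, and the sample strings are made up.

```python
def classify(raw: bytes) -> str:
    """Hypothetical, minimal classifier: 'ascii', 'utf-8', or 'unknown'."""
    if all(b < 0x80 for b in raw):
        # Pure ASCII: the bytes carry no evidence for any particular code page.
        return "ascii"
    try:
        raw.decode("utf-8")  # strict decoding rejects invalid sequences
    except UnicodeDecodeError:
        return "unknown"
    return "utf-8"


# "\u05e9\u05dc\u05d5\u05dd" is the Hebrew word "shalom", written with escapes.
hebrew_csv = "\u05e9\u05dc\u05d5\u05dd,1\n".encode("utf-8")
ascii_csv = b"cola,colb,colc\n1,2,3\n4,5,6\n"

print(classify(hebrew_csv))  # prints: utf-8
print(classify(ascii_csv))   # prints: ascii
```

Under this simplified model, an ASCII-only file would fall through to whatever default the application chooses, which is the step this report is asking about.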
Comment 4 m_a_riosv 2025-01-11 22:57:30 UTC
Created attachment 198497
Screenshot import options

Reloading in safe mode does not show what you get.
Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: c9ae567c791bcffdc3fff9e3fb11b46275a13d2b
CPU threads: 16; OS: Windows 11 X86_64 (build 26100); UI render: default; VCL: win
Locale: es-ES (es_ES); UI: es-ES
Calc: threaded