Description:
When you run "Text to Columns" on text that contains full-width spaces, the delimiter is not detected. If you select multiple rows and run it, the preview at the bottom of the dialog is garbled. (This can also happen with a single cell.) Once the preview is garbled, selecting "Separated by" under the separator options and specifying a full-width space as the separator does not change the preview, and pressing the OK button simply replaces the contents of the first cell with the garbled characters.

Steps to Reproduce:
1. Enter text that includes full-width spaces
2. Select any rows
3. Data - Text to Columns...

Actual Results:
The preview field is garbled

Expected Results:
No garbled characters

Reproducible: Sometimes

User Profile Reset: No

Additional Info:
It works fine if you change the delimiter from a full-width space to a half-width space or a comma.

Version: 25.2.2.2 (X86_64) / LibreOffice Community
Build ID: 7370d4be9e3cf6031a51beef54ff3bda878e3fac
CPU threads: 4; OS: Linux 6.8; UI render: default; VCL: gtk3
Locale: ja-JP (ja_JP.UTF-8); UI: en-US
Calc: threaded

Version: 25.2.2.2 (X86_64) / LibreOffice Community
Build ID: 7370d4be9e3cf6031a51beef54ff3bda878e3fac
CPU threads: 4; OS: Windows 10 X86_64 (10.0 build 19045); UI render: Skia/Raster; VCL: win
Locale: ja-JP (ja_JP); UI: ja-JP
Calc: CL threaded

Version: 25.8.0.0.alpha0+ (X86_64) / LibreOffice Community
Build ID: eb4977cb6d81b1c15d025435adf25b19e88d3132
CPU threads: 4; OS: Linux 6.8; UI render: default; VCL: gtk3
Locale: ja-JP (ja_JP.UTF-8); UI: ja-JP
Calc: threaded

Works fine in:
Version: 24.8.6.2 (X86_64) / LibreOffice Community
Build ID: 6d98ba145e9a8a39fc57bcc76981d1fb1316c60c
CPU threads: 4; OS: Linux 6.8; UI render: default; VCL: gtk3
Locale: ja-JP (ja_JP.UTF-8); UI: ja-JP
Calc: threaded
Created attachment 200463 [details] text2column-sample-ja I've attached a sample for verification.
Bisected to:

author Gabriel Masei
commit 565b619d57a3b98b0826c4b49dee6606f9ae70e0

tdf#160582 Preserve settings saving in csv import dialog

Also, improve detection algorithm by replacing the limit of 20 lines with a time limit of 500ms.

Change-Id: Iac519b6ebe675b91ce84b900646d9d320ea9ddc1
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165905
Created attachment 200478 [details]
Text to column on A1 on Linux and A2 on Windows

What do you mean by garbled ? Can you show us a screenshot?

For me, on Linux and Windows, when I run Text to Columns on A1, the preview shows Chinese characters. Is this bug report about that?

On Windows, the Japanese characters are rendered at what looks like low resolution. I guess that's a separate issue.

Version: 25.2.2.2 (X86_64) / LibreOffice Community
Build ID: 7370d4be9e3cf6031a51beef54ff3bda878e3fac
CPU threads: 8; OS: Linux 6.11; UI render: default; VCL: gtk3
Locale: en-US (en_US.UTF-8); UI: en-US
Flatpak
Calc: threaded
(In reply to opp from comment #3)
> What do you mean by garbled ? Can you show us a screenshot?

Same as your screenshot. Isn't it fair to call it garbled when the A1 cell 'あいう えおか' is displayed as Chinese-like characters?

I rely on machine translation, so my wording may not be appropriate.
Created attachment 200488 [details]
screenshots

I think there is a bug, because when I select two rows and run it, one row turns into unreadable text.
(In reply to Saburo from comment #4)
> Isn't it garbled to say that the situation where the A1 cell 'あいう えおか' is
> displayed in Chinese-like characters is garbled.

I don't know.

I confirm the bug.
The supposedly garbled string appears to have had the high and low bytes of each 16-bit (UTF-16) code unit swapped. Using Attachment 200478 [details] as an example:

- あいう えおか -> U+3042 U+3044 U+3046 U+3000 U+3048 U+304A U+304B
- 䈰䐰䘰0䠰䨰䬰 -> U+4230 U+4430 U+4630 U+0030 U+4830 U+4A30 U+4B30

Every code unit has its two bytes reversed, which suggests an endianness issue in how the text is handled internally. Note that the full-width space U+3000 turns into U+0030 ('0'), which may be why specifying a full-width space as the separator has no effect once the preview is garbled.
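
For anyone who wants to check the byte-swap observation outside of LibreOffice, here is a minimal Python sketch. It only reproduces the symptom by encoding the text as little-endian UTF-16 and decoding the same bytes as big-endian; it says nothing about where in the import code the swap actually happens. The helper name `swap_utf16_bytes` is made up for this illustration.

```python
def swap_utf16_bytes(text: str) -> str:
    """Reinterpret the text's UTF-16 code units with the opposite byte order.

    Hypothetical helper for illustration only; this is not LibreOffice code.
    """
    # Encode as little-endian UTF-16, then decode the same bytes as big-endian,
    # which swaps the high and low byte of every 16-bit code unit.
    return text.encode("utf-16-le").decode("utf-16-be")


def code_points(text: str) -> str:
    """Format each character as U+XXXX for easy comparison."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Cell content from the attachment, with the full-width space written as U+3000.
original = "あいう\u3000えおか"
garbled = swap_utf16_bytes(original)

print(code_points(original))
print(code_points(garbled))
print(garbled)
```

Applying the same swap a second time gives back the original string, so the swapped text can be converted back for inspection as long as every swapped code unit is still valid UTF-16.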