Bug 104927 - Text Import - fixed width mode not adjusting csvtablebox for multi-byte fonts
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Calc
(earliest affected) release
Hardware: x86-64 (AMD64) Windows (All)
Importance: medium normal
Assignee: Not Assigned
Whiteboard: target:7.2.0
Depends on:
Blocks: CJK CSV-Import
Reported: 2016-12-26 06:57 UTC by Dragon Chuang
Modified: 2021-01-22 09:26 UTC (History)

See Also:
Crash report or crash signature:

Text File, data width 58 bytes each line (40.43 KB, image/png)
2016-12-26 06:57 UTC, Dragon Chuang
Calc Text Import wrong data width (54.59 KB, image/png)
2016-12-26 06:58 UTC, Dragon Chuang
Import file (fixed width CSV) (1.52 KB, text/plain)
2016-12-27 04:03 UTC, Dragon Chuang
Excel Text Import (26.95 KB, image/png)
2016-12-27 04:15 UTC, Dragon Chuang

Description Dragon Chuang 2016-12-26 06:57:48 UTC
Created attachment 129941 [details]
Text File, data width 58 bytes each line

Text Import uses an incorrect data width for non-ASCII character sets.

The width should be counted in bytes, not characters.
In DBCS encodings (Chinese, Japanese), one character occupies two bytes.

Thank you.

Text File, data width 58 bytes each line.
Calc Text Import, data width 33 bytes(?) each line.
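
To illustrate the gap between the two counts, here is a minimal sketch in Python. The sample string and the helper name `widths` are made up for illustration and are not taken from the attached file; the only assumption is that the text is encoded as Big5, where each CJK character takes two bytes and each ASCII character takes one.

```python
def widths(text, encoding="big5"):
    """Return (character count, byte count) for text in the given encoding."""
    return len(text), len(text.encode(encoding))


# Hypothetical sample line: two CJK characters, two spaces, three ASCII letters.
sample = "中文  ABC"
chars, nbytes = widths(sample)
print(f"characters: {chars}, Big5 bytes: {nbytes}")
```

A fixed-width layout defined in bytes will therefore drift whenever a tool counts the same line in characters.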
Comment 1 Dragon Chuang 2016-12-26 06:58:52 UTC
Created attachment 129942 [details]
Calc Text Import wrong data width
Comment 2 Xisco Faulí 2016-12-26 12:46:34 UTC
Hello Dragon,

Thank you for reporting the bug. Please attach a sample document, as this makes it easier for us to verify the bug. 
I have set the bug's status to 'NEEDINFO'. Please change it back to 'UNCONFIRMED' once the requested document is provided.
(Please note that the attachment will be public; remove any sensitive information before attaching it.
See https://wiki.documentfoundation.org/QA/FAQ#How_can_I_eliminate_confidential_data_from_a_sample_document.3F for help on how to do so.)
Comment 3 V Stuart Foote 2016-12-26 15:12:34 UTC
No issue with fixed column import of utf-8 encoded CJK fonts with
Version: (x64)
Build ID: 9b50003582f07ac674d6451e411e9b77cccd2b22
CPU Threads: 8; OS Version: Windows 6.19; UI Render: default; 
Locale: en-US (en_US); Calc: group

Please post your sample document with the Windows-950 encoding.

And personally, with that data set I would reduce the white space and use a "space" (or a replacement while editing) for delimited CSV import.
Comment 4 Dragon Chuang 2016-12-27 04:03:18 UTC
Created attachment 129955 [details]
Import file (fixed width CSV)

Text file for import.
Comment 5 Dragon Chuang 2016-12-27 04:15:30 UTC
Created attachment 129956 [details]
Excel Text Import

I cannot use a "space" as the delimiter for CSV import, because the source file is fixed width. Some "space" columns are empty data fields, not delimiters.

(Microsoft Excel imports this file correctly.)
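
A small Python sketch of why whitespace delimiting loses information on fixed-width data. The record layout (4 + 6 + 2 columns), the sample row, and the helper `split_fixed` are hypothetical and only serve to show the difference.

```python
def split_fixed(line, field_widths):
    """Cut a line into fields at fixed character offsets."""
    fields, pos = [], 0
    for width in field_widths:
        fields.append(line[pos:pos + width])
        pos += width
    return fields


# Hypothetical row: ID (4 columns), an empty name field (6 columns), quantity (2 columns).
row = "A001" + " " * 6 + "42"

fixed = [field.strip() for field in split_fixed(row, [4, 6, 2])]
delimited = row.split()

print(fixed)      # the empty middle field is preserved
print(delimited)  # the empty middle field disappears
```

With fixed offsets the blank field survives as an empty value; with space delimiting the following fields shift left by one column.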
Comment 6 V Stuart Foote 2016-12-27 15:15:22 UTC

The "fixed width" Big5 encoded Traditional Chinese sample document, or a similar one in UTF-8, is not correctly handled by the csvtablebox GUI.

The ruler and column selection do not adjust character width to handle multi-byte characters, so the column positions for the "fixed width" import are corrupt and cannot be set.

The actual import does honor the encoding and fielding as set, but the GUI (ruler and grid) has the wrong layout, so it is impossible to set column widths correctly.

Testing on Windows 10 Pro 64-bit en-US with
Version: (x64)
Build ID: 9b50003582f07ac674d6451e411e9b77cccd2b22
CPU Threads: 8; OS Version: Windows 6.19; UI Render: default; 
Locale: en-US (en_US); Calc: group

On opening in Calc, the document correctly triggers the Text Import dialog and is detected as Chinese Traditional and fixed width, but the encoding is initially identified as UTF-8. Changing that to Chinese Traditional (Big5) renders the glyphs correctly in the GUI.
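
For readers unfamiliar with the effect of picking the wrong encoding, a minimal Python sketch follows. It assumes a made-up two-character sample rather than the attached file, and it says nothing about how LibreOffice detects encodings internally.

```python
# Hypothetical sample: two Traditional Chinese characters stored as Big5 bytes.
raw = "中文".encode("big5")

# Interpreting Big5 bytes as UTF-8 yields replacement characters (U+FFFD).
as_utf8 = raw.decode("utf-8", errors="replace")

# Interpreting them with the right codec recovers the original text.
as_big5 = raw.decode("big5")

print(repr(as_utf8))
print(repr(as_big5))
```

A strict UTF-8 decode of the same bytes raises `UnicodeDecodeError` instead of producing replacement characters.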

Comment 7 Caolán McNamara 2017-02-01 21:24:39 UTC
I don't think it's anything to do with double-byte fonts or encodings, just that we're assuming that the CJK font width is the same as the western font width.
Comment 9 Dragon Chuang 2018-08-13 07:18:17 UTC
The bug is still present.
It can be reproduced in LibreOffice version (x64).
Build ID: efb621ed25068d70781dc026f7e9c5187a4decd1
Comment 10 Commit Notification 2021-01-09 04:14:22 UTC
Mark Hung committed a patch related to this issue.
It has been pushed to "master":


tdf#104927 consider character width for CSV import

It will be available in 7.2.0.

The patch should be included in the daily builds available at
https://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:

Affected users are encouraged to test the fix and report feedback.