Created attachment 78171
Sample UTF_16le Chinese text file

Problem description:
When opening a Unicode text file encoded as UTF-16LE, the character encoding indicated by the byte order mark (BOM) is not recognised correctly by LO on Linux. The same version on Windows recognises the encoding.

Steps to reproduce:
1. Open the attached sample UTF-16LE text file using Writer in LO 4.0.2 on Linux, with the default 'all files' filter. The file encoding is not correctly recognised (garbage characters are shown).
2. Open the attached sample text file using Writer, but preselect the 'Text Encoded' filter first. Select 'Unicode' as the encoding and 'CR+LF' as the line separator. The characters are shown correctly.
3. If the file is opened in Calc, the encoding is correctly detected as Unicode.
4. If the file is opened in Writer 4.0.2 on Windows, the encoding is correctly detected and the file displays correctly.

The first two bytes of the attached file are the UTF-16LE byte order mark <FF> <FE>, but LO 4.0.2 Writer on Linux does not seem to recognise it automatically. The 'Language Settings' in preferences don't seem to make any difference.

Operating System: Linux (Other)
Version: 4.0.2.2 release
I can confirm this using Version 4.0.2.2 (Build ID: 4c82dcdd6efcd48b1d8bba66bfe1989deee49c3) under both Windows 7 Home Premium and Ubuntu 10.04 x86_64. Behaviour is as described. Setting status to NEW.
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present on a currently supported version of LibreOffice (4.4.1.2 or later): https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the version of LibreOffice and your operating system, and any changes you see in the bug behavior.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a short comment that includes your version of LibreOffice and Operating System.

Please DO NOT:
- Update the version field
- Reply via email (please reply directly on the bug tracker)
- Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

Thank you for your help!

-- The LibreOffice QA Team

This NEW Message was generated on: 2015-03-03
The issue still exists in LO 4.4.1.2 (4.4.1.2 Arch Linux build-1) en_GB locale on Linux. The 'select encoding' dialog now allows a choice of UTF-7, UTF-8, UTF-16 (UTF-16 works, with a Chinese-capable font selected). LO 4.4.1.2 on Windows still opens the file correctly using the 'all files' file dialog. Chris
taking.
LibreOffice should be able to auto-recognise the character encoding by checking the BOM. Some information about the BOM is here: https://en.wikipedia.org/wiki/Byte_order_mark
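For illustration only, here is a minimal Python sketch of BOM-based detection. It is not LibreOffice's code and not the fix that was eventually committed; the helper names `detect_bom` and `decode_with_bom` are hypothetical, and the UTF-8 fallback is an assumption. The BOM constants come from Python's standard `codecs` module. Longer BOMs are checked first because the UTF-32LE mark (FF FE 00 00) begins with the UTF-16LE mark (FF FE).

```python
import codecs

# Longest BOMs first: the UTF-32LE BOM (FF FE 00 00) starts with the
# UTF-16LE BOM (FF FE), so checking UTF-16 first would misdetect UTF-32.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_with_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Decode using the BOM's encoding if present, else the assumed fallback."""
    name, skip = detect_bom(data)
    return data[skip:].decode(name or fallback)
```

For a file like the attached sample, the first two bytes FF FE would select `utf-16-le` and the BOM itself would be skipped before decoding. One known ambiguity of this ordering: a UTF-16LE file whose first character after the BOM is U+0000 would be reported as UTF-32LE.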
Additionally, if a TXT file does not have a BOM, LibreOffice should provide an interface that lets the user choose a suitable encoding for viewing the file. This interface should also include a preview pane.
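As a rough sketch of what such a preview could be built on (again hypothetical, not LibreOffice code): the function below tries a list of candidate encodings and returns a short decoded preview for each, or `None` where the bytes are invalid in that encoding. The name `preview_candidates`, the candidate list, and the 80-character limit are all assumptions made for the example.

```python
def preview_candidates(
    data: bytes,
    candidates=("utf-8", "utf-16-le", "utf-16-be", "gb18030"),
    limit: int = 80,
):
    """Map each candidate encoding to a short preview string, or None on failure."""
    previews = {}
    for enc in candidates:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding; nothing to preview.
            previews[enc] = None
        else:
            previews[enc] = text[:limit]
    return previews
```

A successful decode does not mean the encoding is the right one: several candidates may decode without error and still produce garbage, which is why the user would need to see the preview and choose.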
Maxim Monastirsky committed a patch related to this issue. It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=6ff57262f44843ccd1f320426984b5e074e3eaf1

tdf#63673 Never ignore detected BOM

It will be available in 5.4.0.

The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Hello Maxim Is this bug fixed? If so, could you please close it as RESOLVED FIXED?
*** Bug 112069 has been marked as a duplicate of this bug. ***
Yes, this is fixed, verified in 5.4.1. Version: 5.4.1.2 (x64) Build ID: ea7cb86e6eeb2bf3a5af73a8f7777ac570321527 CPU threads: 4; OS: Windows 6.19; UI render: default; Locale: zh-CN (zh_CN); Calc: group