I wrote some text in Writer in Cyrillic and saved it as "Text (.txt)", not "Text Encoded (.txt)". After opening the file again in Writer, it displayed only question marks. I opened the file in a hex editor, and indeed, in place of the Cyrillic letters there were only question marks in the ANSI charset encoding. It seems that Writer has totally disregarded the presence of non-ANSI characters. What should happen in that case is that the file is saved in an encoding that can represent them, such as UTF-8.
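To make the reported symptom concrete, here is a minimal Python sketch of the same kind of data loss. It is not Writer's export code; it only assumes that the "ANSI" code page is Windows-1252 (the reporter's actual system code page is not stated) and that unrepresentable characters are replaced rather than rejected.

```python
# Illustration only: simulates the symptom, not LibreOffice's actual filter.
text = "Привет, мир"

# Assumption: "ANSI" here means Windows-1252, which has no Cyrillic letters.
# errors="replace" substitutes "?" for every character the code page lacks.
ansi_bytes = text.encode("cp1252", errors="replace")
print(ansi_bytes)  # b'??????, ???'

# The original letters are gone for good; decoding cannot bring them back.
reopened = ansi_bytes.decode("cp1252")
print(reopened)  # ??????, ???

# UTF-8 can represent every character, so the round trip is lossless.
utf8_bytes = text.encode("utf-8")
restored = utf8_bytes.decode("utf-8")
print(restored == text)  # True
```

The hex editor observation matches the first case: each lost letter becomes the single byte 0x3F ("?").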
sogartary: as a test, could you give the brand new 4.0.3 a try?
The bug is still present in 4.0.3.3.
sogartary: could you rename your LO directory profile (see https://wiki.documentfoundation.org/UserProfile) and give it a new try? The goal is to be sure it's not due to customization or something.
I have tried what you said. I renamed the %appdata%\LibreOffice\4\user directory and retried. The bug still persists.
I also tried to reproduce the bug on Ubuntu 12.04, again with LibreOffice 4.0.3.3. It seems that everything is OK there.
sogartary: reading comments 2, 4 and 5 together, I don't understand. Is it OK or not with 4.0.3 and a brand new LO profile? If it's not, which case is OK?
Dear Bug Submitter, Please read the entire message before proceeding. This bug has been in NEEDINFO status with no change for at least 6 months. Please provide the requested information as soon as possible and mark the bug as UNCONFIRMED. Due to regular bug tracker maintenance, if the bug is still in NEEDINFO status with no change in 30 days the QA team will close the bug as INVALID due to lack of needed information. For more information about our NEEDINFO policy please read the wiki located here: https://wiki.documentfoundation.org/QA/FDO/NEEDINFO If you have already provided the requested information, please mark the bug as UNCONFIRMED so that the QA team knows that the bug is ready to be confirmed. Thank you for helping us make LibreOffice even better for everyone! Warm Regards, QA Team
Dear Bug Submitter, Please read this message in its entirety before proceeding. Your bug report is being closed as INVALID due to inactivity and a lack of information which is needed in order to accurately reproduce and confirm the problem. We encourage you to retest your bug against the latest release. If the issue is still present in the latest stable release, we need the following information (please ignore any that you've already provided): a) Provide details of your system including your operating system and the latest version of LibreOffice that you have confirmed the bug to be present b) Provide easy to reproduce steps – the simpler the better c) Provide any test case(s) which will help us confirm the problem d) Provide screenshots of the problem if you think it might help e) Read all comments and provide any requested information Once all of this is done, please set the bug back to UNCONFIRMED and we will attempt to reproduce the issue. Please do not: a) respond via email b) update the version field in the bug or any of the other details on the top section of FDO
Even Notepad warns about data loss when the file contains Unicode characters.
This is Windows-only. If you don't use the file format "Text - Choose Encoding", the file will have question marks in place of non-ASCII characters. Version: 7.2.0.0.alpha0+ (x64) / LibreOffice Community Build ID: 9df3aa7ea72d61462e430643f2a80906dce4e15b CPU threads: 2; OS: Windows 10.0 Build 19042; UI render: Skia/Raster; VCL: win Locale: fi-FI (fi_FI); UI: en-US Calc: threaded Jumbo
(In reply to Urmas from comment #9) > Even Notepad warns about data loss when the file contains Unicode characters. LibreOffice likewise warns whenever anyone saves anything to TXT, because there will inevitably be loss: of formatting, of graphics, of metadata, of information in headers/footers. There is no need for an additional warning that "also, not all characters present in the document *body* are representable in the selected encoding's charset". Of course this is not Windows-only; it would appear on any platform where the system encoding is non-Unicode. However, on other platforms it's *usual* to use UTF-8. But there exist Linux systems using e.g. KOI8-R, etc. This is NOTABUG.
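As a rough illustration of why the outcome depends on the system encoding rather than on the operating system, the following Python sketch checks which characters a given encoding cannot represent. The helper name `unrepresentable` is hypothetical, and the encodings listed are only examples of possible system encodings, not a statement about any particular installation.

```python
def unrepresentable(text, encoding):
    """Return the characters of `text` that `encoding` cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


sample = "Привет"

# Example encodings only: a Western code page, two Cyrillic ones, and UTF-8.
for enc in ("cp1252", "cp1251", "koi8_r", "utf-8"):
    lost = unrepresentable(sample, enc)
    print(enc, "loses", len(lost), "of", len(sample), "characters")
```

Under these assumptions, the same Cyrillic text survives on a system using Windows-1251 or KOI8-R but is lost entirely on one using Windows-1252.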
On the other hand, why not change our "simple text" export to use UTF-8 (with BOM) instead of the "system encoding" (unless there's existing encoding information from import - see tdf#120574, which is a different problem)? That would avoid this situation; UTF-8 is the universal standard now, with a much better chance of being read correctly than any other encoding. So should this become "Use UTF-8 instead of system encoding in text export filter by default"?
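For reference, this is roughly what "UTF-8 with BOM" output looks like at the byte level, sketched in Python rather than in LibreOffice's filter code. The function name `export_plain_text` and the temporary file path are hypothetical; `utf-8-sig` is Python's standard codec name for UTF-8 with a leading BOM.

```python
import codecs
import os
import tempfile


def export_plain_text(text, path, encoding="utf-8-sig"):
    """Write `text` to `path`; the default codec prepends a UTF-8 BOM."""
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(text)


# Hypothetical output location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "export.txt")
export_plain_text("Привет, мир", path)

with open(path, "rb") as f:
    raw = f.read()

# The file starts with the three BOM bytes EF BB BF.
has_bom = raw.startswith(codecs.BOM_UTF8)
print(has_bom)

# Reading with the same codec strips the BOM and returns the original text.
with open(path, encoding="utf-8-sig") as f:
    restored_text = f.read()
print(restored_text == "Привет, мир")
```

The BOM gives a reader that would otherwise assume the system code page a way to recognize the file as UTF-8.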