Bug 38902 - UTF-8 contents should be detected and this codepage should be suggested for FILESAVE as ".txt coded"
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 3.3.3 release
Hardware: x86 (IA32) All
Importance: high major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords: dataLoss
Duplicates: 93907
Depends on:
Blocks: Save-Text
 
Reported: 2011-07-02 01:27 UTC by Urmas
Modified: 2018-06-18 02:42 UTC
CC List: 6 users

See Also:
Crash report or crash signature:


Attachments
Original (41 bytes, text/plain)
2011-07-03 18:21 UTC, Urmas
Saved in LO (28 bytes, text/plain)
2011-07-03 18:22 UTC, Urmas
Corrupted file (31 bytes, text/plain)
2011-07-03 22:25 UTC, Urmas
Screenshot of opening dialog (66.52 KB, image/pjpeg)
2011-07-03 23:15 UTC, Urmas

Description Urmas 2011-07-02 01:27:26 UTC
1. Open a UTF-8 file in Writer. It is displayed correctly as UTF-8.
2. Change something and save.
3. Open it again.

Problem: The file was saved in the local codepage and all letters were replaced with ?'s. As a result, the document is completely fucked up.
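
For illustration only, the three steps can be mimicked outside LibreOffice with a short Python sketch. The sample text and the choice of cp1252 as the "local codepage" are assumptions, not the contents of the attached files, and nothing here is LibreOffice code. It only shows how writing text in a codepage that lacks the letters turns them into question marks.

```python
# Hypothetical illustration of the reported symptom; not LibreOffice code.
original = "Привет, мир"            # assumed sample text, not the attachment
utf8_bytes = original.encode("utf-8")

# Step 1: the file is read correctly as UTF-8.
text = utf8_bytes.decode("utf-8")

# Step 2: an edit is made and the file is written in a legacy codepage.
# cp1252 is assumed here as the local codepage; it has no Cyrillic letters,
# so every unrepresentable character is replaced with "?".
saved = (text + "123").encode("cp1252", errors="replace")

# Step 3: reopening the file shows question marks instead of the letters.
reopened = saved.decode("cp1252")
print(reopened)
```

Once the replacement has happened, the original letters cannot be recovered from the saved bytes, which is why this counts as data loss.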
Comment 1 Don't use this account, use tml@iki.fi 2011-07-02 02:09:56 UTC
And how critical is it to not be able to use Libre*Office* as a plain text editor? On Windows even? Sheesh.
Comment 2 Urmas 2011-07-02 03:02:46 UTC
Since when are bugs causing DATA LOSS minor? Is that your private shop?
Comment 3 Don't use this account, use tml@iki.fi 2011-07-02 03:34:10 UTC
shop? it's my private opinion.
Comment 4 Rainer Bielefeld Retired 2011-07-03 06:42:48 UTC
NOT reproducible with "LibreOffice 3.4.1RC1 - WIN7  Home Premium (64bit) German UI [OOO340m1 (Build:103)]". Might have been fixed in between "somehow"?

@reporter:
May I ask you to read the hints on <http://wiki.documentfoundation.org/BugReport> carefully?
Then please:
- Write a meaningful Summary
- Attach a test kit  with utf8samplefile.txt and 
  utf8samplefilesavedfromlibo.txt 
- Attach screenshots with comments (you can add information using LibO DRAW
  and then attach your screenshot with comments as PDF) if necessary
- Contribute a step by step instruction containing every key press and every 
  mouse click how to reproduce your problem
- add information 
  -- concerning your PC 
  -- concerning your OS
  -- concerning your LibO localization (UI language)
  -- LibO settings that might be related to your problems 
  -- how you launch LibO and how you opened the sample document
  -- everything else crossing your mind after you read the above-mentioned URL

Can you test with 3.4.1?
Comment 5 Urmas 2011-07-03 18:21:13 UTC
Created attachment 48720
Original
Comment 6 Urmas 2011-07-03 18:22:12 UTC
Created attachment 48721
Saved in LO
Comment 7 Urmas 2011-07-03 18:28:55 UTC
Whether I open it from Explorer or via File/Open, any edit leads to corruption.
It is perfectly reproducible on 3.4.1, in this case with Russian UI on Windows XP.
Comment 8 Rainer Bielefeld Retired 2011-07-03 22:07:30 UTC
Still not reproducible with reporter's sample and with "LibreOffice 3.4.1RC1 - WIN7  Home Premium (64bit) German UI [OOO340m1 (Build:103)]" 

@Urmas:
Please do not touch the Bugzilla pickers if you do not know what they are for.
With what version did you see the problem for the first time?
Your description is far from "every key press and every mouse click". Without that and a clear description of LibO's reactions, the problem can't be checked. 

You should discuss the problem on a user mailing list and then report the results here.
Comment 9 Urmas 2011-07-03 22:24:47 UTC
3.2.2? I don't remember, but OOO340m1 (Build:103) on 32-bit XP shows the same behaviour.

As for instruction:
1. File/Open, select the file.
2. Append "123" at the end.
3. Press Save button.
4. Confirm saving in same format.
Comment 10 Urmas 2011-07-03 22:25:33 UTC
Created attachment 48729
Corrupted file
Comment 11 Rainer Bielefeld Retired 2011-07-03 23:00:48 UTC
It might be that the document was not opened via the encoded text import, or something else; I have no idea, as no useful information is available. Closing as INVALID for now.

@Urmas:
Please excuse me, but a report where you show only the result, and not how you got it, is completely useless. It would help if you followed my advice (read the information CAREFULLY, discuss ...) instead of providing fragments of information.

Please feel free to reopen this bug when you can contribute the requested additional information as described at <http://wiki.documentfoundation.org/BugReport>. Possibly the best way would be (after discussion on a user mailing list) to create a presentation showing annotated screenshots taken after every mouse click and every key press.
Comment 12 Don't use this account, use tml@iki.fi 2011-07-03 23:14:32 UTC
You need to save as "Text Encoded" and choose the "Unicode (UTF-8)" character set.

Sure, LO could perhaps be clever enough to understand this when saving a document that contains characters not in the system codepage into a plain text file on Windows. Or should it? The expected encoding of "text" files on Windows isn't exactly well-defined.

Rainer, I am reopening this, this *is* a real problem.
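
A minimal sketch of the "clever enough" idea, in Python rather than LibreOffice's own code: before writing plain text, check whether the system codepage can represent every character, and suggest UTF-8 only when it cannot. The function name `suggest_save_encoding` and the use of cp1252 as the system codepage are hypothetical and chosen for illustration.

```python
def suggest_save_encoding(text: str, system_codepage: str) -> str:
    """Return system_codepage if it can represent text losslessly, else 'utf-8'.

    Hypothetical helper for illustration; not a LibreOffice API.
    """
    try:
        # Strict encoding raises as soon as one character has no mapping.
        text.encode(system_codepage)
    except UnicodeEncodeError:
        return "utf-8"
    return system_codepage


if __name__ == "__main__":
    print(suggest_save_encoding("plain ASCII text", "cp1252"))
    print(suggest_save_encoding("Привет", "cp1252"))
```

With this approach, documents that fit the system codepage keep the traditional Windows behaviour, and only documents that would otherwise lose characters trigger the UTF-8 suggestion.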
Comment 13 Urmas 2011-07-03 23:15:08 UTC
Created attachment 48731
Screenshot of opening dialog
Comment 14 Urmas 2011-07-03 23:17:29 UTC
> The expected encoding of "text" files on Windows
> isn't exactly well-defined.

If it has opened it as UTF-8, it should save it as UTF-8, there cannot be two opinions.
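
The requested behaviour can be sketched as follows, assuming a simple detection rule: a UTF-8 byte order mark, or bytes that decode strictly as UTF-8, mean the file is treated as UTF-8 and written back the same way. The helper names `detect_encoding` and `edit_and_save` and the cp1252 fallback are hypothetical; this is not how Writer's import filter is known to work.

```python
import codecs


def detect_encoding(raw: bytes, fallback: str) -> str:
    """Guess the encoding of raw bytes; hypothetical helper for illustration."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"          # UTF-8 with a byte order mark
    try:
        raw.decode("utf-8")         # strict decode fails on invalid sequences
    except UnicodeDecodeError:
        return fallback             # assume the legacy codepage
    return "utf-8"


def edit_and_save(raw: bytes, suffix: str, fallback: str = "cp1252") -> bytes:
    """Decode, append suffix, and re-encode with the encoding used to open."""
    encoding = detect_encoding(raw, fallback)
    text = raw.decode(encoding)
    return (text + suffix).encode(encoding)


if __name__ == "__main__":
    data = "Привет".encode("utf-8")
    print(edit_and_save(data, "123").decode("utf-8"))
```

One limitation of this rule: short legacy-codepage files that happen to be valid UTF-8 would be misdetected, so a real implementation would probably still let the user override the guess.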
Comment 15 Rainer Bielefeld Retired 2011-07-13 07:51:28 UTC
Related to "Bug 39124 - copy base table to CALC uses wrong codepage for paste"?
Comment 16 Rainer Bielefeld Retired 2012-07-24 17:51:54 UTC
[Reproducible] with reporter's sample and "LibreOffice 3.3.3  German UI/Locale [OOO330m19 (Build:301) tag libreoffice-3.3.3.1]" on German WIN7 Home Premium (64bit), might be inherited from OOo?
Comment 17 Zoltán Hegedüs 2013-10-24 17:36:42 UTC
4.0.6.2. Release:
Open a UTF-8 file with Writer. I tried with a file that has a UTF-8 header. The extension must be .txt.
Modify some characters.
Save the file. Writer saves the file in a non-Unicode codepage.
If I save the file with Save As - Encoded text, subsequent normal saves (plain Save, not Save As) are correct.
If I open the file as encoded text, there is no error. If the extension is unknown to Writer, I can open the file only as encoded text, because a normal open loads it in Calc.
This can cause DATA LOSS, so I changed the severity from normal to major.
Comment 18 QA Administrators 2015-04-01 14:40:08 UTC Comment hidden (obsolete)
Comment 19 Buovjaga 2015-04-19 15:38:23 UTC
Reproduced.

Win 7 Pro 64-bit Version: 5.0.0.0.alpha0+ (x64)
Build ID: 211c12b9c64facd1c12f637a5229bd6a6feb032a
TinderBox: Win-x86_64@42, Branch:master, Time: 2015-04-18_01:51:17
Locale: fi_FI
Comment 20 m.a.riosv 2015-12-26 18:12:32 UTC
*** Bug 96730 has been marked as a duplicate of this bug. ***
Comment 21 m.a.riosv 2015-12-26 18:17:57 UTC
*** Bug 93907 has been marked as a duplicate of this bug. ***
Comment 22 Justin L 2017-02-24 16:10:57 UTC
Unable to reproduce in Linux (went back to oldest50 alpha), so it might actually be windows only (as is already marked).  After making changes and then saving over top of an existing .txt file, or as a new Text(.txt) format, I never got ?'s when re-opening.

I was able to confirm the problem still exists in Windows with 5.1.6.
Comment 23 Buovjaga 2017-02-27 19:15:10 UTC
Still repro

Version: 5.4.0.0.alpha0+
Build ID: 54d5b1828ec73d0475e0ddb6e31394a7e1904a1b
CPU Threads: 4; OS Version: Windows 6.19; UI Render: default; 
TinderBox: Win-x86@42, Branch:master, Time: 2017-02-09_23:41:14
Locale: fi-FI (fi_FI); Calc: group
Comment 24 Urmas 2017-02-28 01:03:50 UTC
Same result on Linux with ISO-8859-1 locale.
Comment 25 QA Administrators 2018-06-18 02:42:19 UTC
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug