Bug 61703 - Writer does not ask the character set of a .txt file at opening
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 3.6.5.2 release
Hardware: Other All
Importance: medium major
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Duplicates: 106767
Depends on:
Blocks: Save-Text
Reported: 2013-03-02 18:11 UTC by Zoltán Hegedüs
Modified: 2018-10-14 03:52 UTC
CC List: 5 users

See Also:
Crash report or crash signature:


Attachments
Writer uses a huge amount of memory with this file. This is a little-endian Unicode file without a BOM. 3.5.5.3 asks for the code page; 3.6.5.2 does not, and opens the file incorrectly. (351.91 KB, text/plain)
2013-03-02 18:11 UTC, Zoltán Hegedüs
All 65536 Unicode codes in order, little-endian, without a BOM. 3.5.5.3 asks for the code page; 3.6.5.2 does not, and opens the file incorrectly. (128.00 KB, text/plain)
2013-03-02 18:14 UTC, Zoltán Hegedüs
All 256 codes in order. 3.5.5.3 opens this in Calc and asks for the code page; 3.6.5.2 opens it in Writer, incorrectly. (256 bytes, text/plain)
2013-03-02 18:17 UTC, Zoltán Hegedüs
Codes 27-255 in order. Both versions open this in Writer and do not ask for the code page. (229 bytes, text/plain)
2013-03-02 18:18 UTC, Zoltán Hegedüs
An ASCII table. 3.5.5.3 asks for the code page but opens this in Calc. 3.6.5.2 does not ask for the code page and opens this in Writer, but always in code page 1250. (697 bytes, text/plain)
2013-03-02 18:21 UTC, Zoltán Hegedüs

Description Zoltán Hegedüs 2013-03-02 18:11:39 UTC
Created attachment 75793
Writer uses a huge amount of memory with this file. This is a little-endian Unicode file without a BOM. 3.5.5.3 asks for the code page; 3.6.5.2 does not, and opens the file incorrectly.

When I open a .txt file, Writer never asks for the character set. So I tried it with the portable 3.5.5.3. That version asked sometimes, but it opened the file in Writer only occasionally and usually in Calc.
Another example: a 1-byte-per-character .txt file may happen to start with FF FE or FE FF (the BOM, or byte order mark, which indicates that Unicode text is little-endian or big-endian, respectively).
I tried to open a .bmp file in Writer. I wanted to see what I see when I open the .bmp file in Notepad. I used the Open menu item in Writer, but the file always opened in Draw. If a text file has the wrong extension, it is impossible to open it as text without renaming the file. Another example: a .txt file that has the extension .ods.
There is no ISO-8859-16 in the code page list (3.5.5.3 portable, Writer/Calc).
When I opened the attached utable.txt (a little-endian Unicode file without a BOM), Writer used more than 512 MB of memory. While I was editing it, Writer used more than 1 GB of memory (I have 1 GB of physical memory plus 1 GB of virtual memory). This error is reported separately.
Writer cannot save in big-endian Unicode, and that encoding is not in the list when opening: if there is no BOM, Writer opens such a file incorrectly.
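The two BOM cases described above can be illustrated with a short Python sketch. It is only an illustration of why a BOM check alone cannot decide the encoding; the function name `sniff_utf16_bom` and the sample byte strings are made up for this example and have nothing to do with LibreOffice's actual detection code.

```python
import codecs

def sniff_utf16_bom(data: bytes):
    """Return the UTF-16 byte order implied by a leading BOM, or None."""
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None

# Hypothetical sample 1: real UTF-16 text written without a BOM.
no_bom = "abc".encode("utf-16-le")

# Hypothetical sample 2: one-byte-per-character text that merely
# happens to start with the bytes FF FE.
lookalike = b"\xff\xfeabc"

print(sniff_utf16_bom(no_bom))     # None: nothing to detect
print(sniff_utf16_bom(lookalike))  # utf-16-le: a false positive
```

In both cases the check gives an unhelpful answer, which is why asking the user for the character set is useful.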
Comment 1 Zoltán Hegedüs 2013-03-02 18:14:02 UTC
Created attachment 75794
All 65536 Unicode codes in order, little-endian, without a BOM. 3.5.5.3 asks for the code page; 3.6.5.2 does not, and opens the file incorrectly.
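For anyone who wants to reproduce this without the attachment, a file with the same layout can probably be generated along these lines. This is a sketch under assumptions: it is not known how the reporter produced the attachment, and the function name is hypothetical.

```python
import struct

def all_utf16le_code_units() -> bytes:
    """Every 16-bit value 0x0000..0xFFFF in order, little-endian, no BOM."""
    # struct is used instead of str.encode("utf-16-le") because the range
    # includes lone surrogates (0xD800..0xDFFF), which that codec rejects.
    return b"".join(struct.pack("<H", unit) for unit in range(0x10000))

data = all_utf16le_code_units()
print(len(data))  # 131072 bytes, i.e. 128.00 KB
```

The resulting size matches the 128.00 KB reported for the attachment. The bytes can be written out with `open(path, "wb")` to get a test file.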
Comment 2 Zoltán Hegedüs 2013-03-02 18:17:29 UTC
Created attachment 75795
All 256 codes in order. 3.5.5.3 opens this in Calc and asks for the code page; 3.6.5.2 opens it in Writer, incorrectly.
Comment 3 Zoltán Hegedüs 2013-03-02 18:18:38 UTC
Created attachment 75796
Codes 27-255 in order. Both versions open this in Writer and do not ask for the code page.
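The two single-byte test files from comments 2 and 3 are easy to recreate. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
# All 256 byte values in order (as in the attachment from comment 2).
all_codes = bytes(range(256))

# Byte values 27-255 in order (as in the attachment from comment 3).
codes_from_27 = bytes(range(27, 256))

print(len(all_codes), len(codes_from_27))  # 256 229
```

The sizes match the 256-byte and 229-byte attachments. The only difference between the two files is the presence of codes 0-26, which include NUL, tab, line feed, and other control characters.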
Comment 4 Zoltán Hegedüs 2013-03-02 18:21:05 UTC
Created attachment 75797
An ASCII table. 3.5.5.3 asks for the code page but opens this in Calc. 3.6.5.2 does not ask for the code page and opens this in Writer, but always in code page 1250.
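To show why always assuming code page 1250 matters, here is a small Python sketch that decodes the same bytes under three code pages. The two sample bytes are picked for illustration only and are not taken from the attachment.

```python
# Two bytes above 0x7F, chosen for illustration.
sample = bytes([0xF5, 0xFB])

decoded = {}
for encoding in ("cp1250", "cp1252", "iso8859_2"):
    decoded[encoding] = sample.decode(encoding)
    print(encoding, decoded[encoding])
```

Code pages 1250 and ISO-8859-2 give "őű" here, while code page 1252 gives "õû". A file written in one of these code pages therefore shows the wrong letters if it is silently opened in another one.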
Comment 5 Zoltán Hegedüs 2013-03-03 12:58:46 UTC
There is an "Encoded text" file type when opening, but some problems remain:

There is no ISO-8859-16 in the list.
There is no big-endian Unicode in the list: if there is no BOM, the program cannot recognize such a file.
Writer cannot open these files correctly, and cannot save files in these formats.
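Both missing encodings can be demonstrated outside LibreOffice with Python's standard codecs. This is only a sketch to make the two cases concrete; the sample strings are arbitrary.

```python
# Big-endian UTF-16 without a BOM: the "utf-16-be" codec writes no BOM.
text = "Writer"
raw_be = text.encode("utf-16-be")

as_be = raw_be.decode("utf-16-be")  # right byte order: original text
as_le = raw_be.decode("utf-16-le")  # wrong byte order: unrelated characters
print(as_be)
print(as_le)

# ISO-8859-16 (Latin-10) is available in Python as "iso8859_16".
romanian = "\u0219\u021b"  # s and t with comma below
raw_l10 = romanian.encode("iso8859_16")
print(len(raw_l10), raw_l10.decode("iso8859_16"))
```

Without a BOM nothing in the bytes themselves says which byte order is right, so the big-endian choice has to be offered to the user.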
Comment 6 Thomas van der Meulen 2013-06-21 08:37:05 UTC
Thank you for your bug report. I can reproduce this bug running LibreOffice Version: 4.1.0.1
Build ID: 1b3956717a60d6ac35b133d7b0a0f5eb55e9155 on Mac OS X 10.6.8.

I can see that the files aren't getting imported correctly; there are a lot of '#' characters.
Comment 7 QA Administrators 2015-04-19 03:20:28 UTC Comment hidden (obsolete)
Comment 8 Buovjaga 2015-06-15 12:07:29 UTC
(In reply to Zoltán Hegedüs from comment #0)
> Created attachment 75793
> Writer uses a huge amount of memory with this file. This is a little-endian
> Unicode file without a BOM. 3.5.5.3 asks for the code page; 3.6.5.2 does not,
> and opens the file incorrectly.

It didn't ask, but memory consumption wasn't huge, 56 megs.

Win 7 Pro 64-bit Version: 5.1.0.0.alpha1+
Build ID: 01a189abcd9a4ca472a74b3b2c000c9338fc2c91
TinderBox: Win-x86@39, Branch:master, Time: 2015-06-14_07:46:28
Locale: fi-FI (fi_FI)
Comment 9 QA Administrators 2016-09-20 10:01:01 UTC Comment hidden (obsolete)
Comment 10 m.a.riosv 2017-01-18 12:55:36 UTC
Please take a look at:

https://bugs.documentfoundation.org/show_bug.cgi?id=105408#c1
Comment 11 Aron Budea 2017-03-25 21:11:40 UTC
*** Bug 106767 has been marked as a duplicate of this bug. ***
Comment 12 QA Administrators 2018-06-18 02:41:51 UTC
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.
 
If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT

Update the version field
Reply via email (please reply directly on the bug tracker)
Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)


If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/

2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword


Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug