Created attachment 99068
It is a SLK format with a .XLS extension.

Some files in an internal SLK format, but with a .XLS extension, are opened in Writer. This appears since 4.1 (tested with 4.1.6.2). It worked with 3.5. It works fine if the file extension is .SLK.
Confirmed with a master build from Tb39 (Build ID: 45c89d62b527abec07072074484bd596ab1aa04a). Can't reproduce with other SYLK files I have.
(In reply to comment #1)
> Confirmed with a master build from Tb39 (Build ID:
> 45c89d62b527abec07072074484bd596ab1aa04a). Can't reproduce with other SYLK
> files I have.

The bug reproduces with SYLK-format files that have a .xls extension and contain 2 acute-accent characters (é). It works with 1 or 3 "é" characters.
To reproduce:
- Create a new empty file with Calc and save it in SYLK format (.SLK).
- Close the file and rename it with a .XLS extension.
- Reopen it with Calc and write words in cells:
  - with 1 "é" character: save in SYLK format with the .xls extension, reopen = OK, Calc
  - with 2 "é" characters: save in SYLK format with the .xls extension, reopen = Writer
  - change the extension to ".slk", open = OK, Calc; write a 3rd "é", save
  - change the extension to ".xls", open = OK, Calc
To reproduce:
- Create a new empty file with Calc and save it in SYLK format (.SLK).
- Close the file and rename it with a .XLS extension.
- Reopen it with Calc and write words in cells:
  - with 1 "é" character: save in SYLK format with the .xls extension, reopen = OK, Calc
  - with 2 "é" characters: save in SYLK format with the .xls extension, reopen = Writer
  - change the extension to ".slk", open = OK, Calc; write a 3rd "é", save
  - change the extension to ".xls", open = OK, Calc
  - ... fails with 4 "é", OK with 5 "é", etc.
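The manual steps above could also be scripted to produce a series of test files. The following is only a sketch: the helper names `make_sylk` and `write_case` are made up for illustration, the `ID;PSKETCH` header uses an arbitrary program name, and the files are written as Latin-1 so that each "é" becomes a single 0xE9 byte. Whether these minimal files trigger the misdetection exactly like files saved by Calc has not been verified.

```python
from pathlib import Path


def make_sylk(words):
    """Build a minimal SYLK document with one text cell per word in column 1."""
    lines = ["ID;PSKETCH"]  # header record; the program name after ";P" is arbitrary
    for row, word in enumerate(words, start=1):
        lines.append(f'C;Y{row};X1;K"{word}"')  # cell record holding a quoted string
    lines.append("E")  # end-of-file record
    return "\r\n".join(lines) + "\r\n"


def write_case(directory, accent_count):
    """Write a SYLK file containing `accent_count` 'é' bytes under a .xls name."""
    words = ["caf\u00e9"] * accent_count  # "\u00e9" is "é"; one per cell
    path = Path(directory) / f"accents_{accent_count}.xls"
    # Latin-1 encodes "é" as the single byte 0xE9 (an assumption about the encoding).
    path.write_bytes(make_sylk(words).encode("latin-1"))
    return path
```

Calling `write_case(".", n)` for `n` from 1 to 5 would give one file per case listed in the steps, which can then be opened in turn to see which ones end up in Writer.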
@Fridrich: This file is erroneously detected by libwpd. Can it be solved in some way on the libwpd side, or we need to think of another solution?
I have a big problem with this. I see that libwpd is detecting it as a WP 4.2 file and I understand why. Now, if you change the extension of this file to *.sylk, it will load well. This can be a workaround for this kind of rare case.

The problem lies in the fact that the WP 4.2 file format is a text file without a header, with WordPerfect codes embedded. We use a dry-parsing heuristic to detect this kind of file: we check whether the "codes" in the file follow the logic that "codes" in a WP file would follow. The problem is that this file contains an even number of "é" characters encoded as 0xE9. 0xE9 in a WP42 file is a variable-length function, and we normally scan for a closing 0xE9 if we find an opening one. Their being in even numbers makes us believe that it is a WP42 file with two 0xE9 codes.

The problem is that I cannot do much with this kind of logic. Without this heuristic I am unable to detect any WP42 file at all. It is true that some special cases of text files can pass through this filter, but the workaround is to rename the extension of the file.
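To illustrate why an even number of 0xE9 bytes passes while an odd number does not, here is a toy version of the pairing check described above. It is not libwpd's code: the function name is hypothetical, only the single code 0xE9 is considered, and requiring at least one complete pair is an assumption made so that files without any such byte are not accepted.

```python
PAIRED_CODE = b"\xe9"  # the byte described above as a variable-length function


def passes_paired_code_check(data: bytes) -> bool:
    """Toy check: every 0xE9 must be closed by a later 0xE9."""
    pairs = 0
    pos = 0
    while True:
        start = data.find(PAIRED_CODE, pos)
        if start == -1:
            # No further opening code. Assumption: accept only if a pair was seen.
            return pairs > 0
        end = data.find(PAIRED_CODE, start + 1)
        if end == -1:
            return False  # opening code without a closing one: not WP 4.2
        pairs += 1
        pos = end + 1


def sample(accent_count: int) -> bytes:
    """Latin-1 text with `accent_count` occurrences of 'é'."""
    return ("caf\u00e9 " * accent_count).encode("latin-1")
```

With this sketch, `sample(2)` and `sample(4)` pass the check while `sample(1)`, `sample(3)` and `sample(5)` are rejected, which matches the pattern reported earlier in this thread.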
(In reply to comment #6)
> I have a big problem with this. I see that libwpd is detecting it as a WP
> 4.2 file and I understand why. Now, if you change the extension of this file
> to *.sylk, it will load well. This can be a workaround for this kind of rare
> cases.
>
> The problem lies in the fact that WP 4.2 file-format is a text file without
> header with WordPerfect codes embedded. We use a dry-parsing heuristics to
> detect this kind of files. We try to check whether the "codes" in the file
> follow the logic that "codes" in a WP file would follow. The problem is that
> this file contains an even number of "é" characters encoded as 0xE9. 0xE9 in
> a WP42 file is a variable length function and we normally scan for a closing
> 0xE9 if we find an opening one. Them being in even numbers make us believe
> that it is a WP42 file with two 0xE9 codes. The problem is that I cannot
> make much with this kind of logic. Without this heuristics I am unable to
> detect any WP42 file at all. It is true that some special cases of text
> files can pass through this filter, but the workaround is to rename the
> extension of the file.

Thank you for your explanations. Renaming the files so that the extension matches their true format is logical. However, this behavior is different from previous versions of LibreOffice, OpenOffice or MS Excel ... We have to adapt legacy applications that generate such files. I am very impressed with your responsiveness.
*** Bug 82393 has been marked as a duplicate of this bug. ***
So all the text file formats are sacrificed because of one obscure format which botches up the autodetection? WP4.2 should be triggered by extension only or by manual filter selection, not applied to any file then.
*** Bug 80016 has been marked as a duplicate of this bug. ***
*** Bug 64894 has been marked as a duplicate of this bug. ***
*** Bug 96098 has been marked as a duplicate of this bug. ***
Created attachment 120839
RTF file with .doc extension

Attached another test case. The same bug happens for rich text format (RTF) files using a .doc extension (this was quite common in the past).

Has anyone bisected this to find the commit that caused the problem?

I see a simple solution in lowering the priority of the WordPerfect import filter, i.e. trying to load the file with all the other formats first, and then testing WordPerfect as the last one. Could this work?
Reopening: is there a reason why this has been marked as wontfix without an explanation?
(In reply to Fabio Bas from comment #13)
> Did anyone bisected this to understand the commit that caused the problem?

What exactly do you want to bisect? The problem is explained in comment 6. There is no way to make the WP detection "smarter" about this.

> I see a simple solution in lowering the priority of wordperfect import
> filter, aka trying to load the file with all the others formats first, and
> then testing wordperfect as the last one. Could this work?

This is an interesting idea. Looking at the detection priority list in filter/source/config/cache/typedetection.cxx reveals that the WP detection indeed has higher priority than other text-based formats. I tried to lower it, and it did solve the issue. It would be good to run more tests with it before pushing such a change.

Unfortunately such an approach won't solve all issues: e.g. csv or just plain text files with a strange extension (see the duplicates of this bug) would still fail, because there is no way to "detect" such files before the WP detection catches them.

(In reply to Fabio Bas from comment #14)
> Reopening: is there a reason why this has been marked as wontfix without an
> explanation?

Well, the explanation is in comment 6. Anyway, REOPENED isn't the right status for this; let's keep it as NEW instead.
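The effect of reordering can be shown with a toy model of first-match detection. This is a sketch only and not the logic in typedetection.cxx: the detector functions are simplified stand-ins, the label "wordperfect" is a hypothetical name, and the WP stand-in merely accepts any data with a non-zero, even number of 0xE9 bytes.

```python
def toy_wp42(data: bytes) -> bool:
    """Stand-in for the WP 4.2 heuristic: a non-zero, even count of 0xE9."""
    count = data.count(b"\xe9")
    return count > 0 and count % 2 == 0


def toy_sylk(data: bytes) -> bool:
    """Stand-in for SYLK detection: the file starts with an ID record."""
    return data.startswith(b"ID;")


def detect(data: bytes, detectors):
    """Return the name of the first detector that claims the data, else None."""
    for name, matches in detectors:
        if matches(data):
            return name
    return None


WP_FIRST = [("wordperfect", toy_wp42), ("calc_SYLK", toy_sylk)]
WP_LAST = [("calc_SYLK", toy_sylk), ("wordperfect", toy_wp42)]

# A SYLK file and a tab-separated text file, each containing two 0xE9 bytes.
SYLK_SAMPLE = b'ID;PSKETCH\r\nC;Y1;X1;K"\xe9t\xe9"\r\nE\r\n'
TSV_SAMPLE = b"pr\xe9nom\tn\xe9e\r\n"
```

In this model, `detect(SYLK_SAMPLE, WP_FIRST)` yields "wordperfect" and `detect(SYLK_SAMPLE, WP_LAST)` yields "calc_SYLK", while `detect(TSV_SAMPLE, WP_LAST)` still yields "wordperfect" because no earlier detector recognizes headerless text. That is the limitation described in the comment above.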
(In reply to Maxim Monastirsky from comment #15)
> This is an interesting idea. Looking at the detection priority list in
> filter/source/config/cache/typedetection.cxx reveals that the WP detection
> has indeed higher priority than other text based formats.

That's because only the WP 4.2 format is text-based. The other WordPerfect formats supported by libwpd are binary, and the list prioritizes binary formats. But if moving it down the list works, I've got nothing against it.
(In reply to David Tardon from comment #16)
> That's because only WP 4.2 format is text-based.

Yes, and also WP1.

> And the list prioritizes binary formats.

BTW, there are some odd things in this list, like calc_SYLK & calc_DIF, which are text-based formats but are listed together with binary formats. Any idea why it was done that way?

> But if moving it down the list works, I've got nothing against it.

And by moving it below "generic_HTML", it should also be possible to avoid workarounds like the "calc_HTML" one.
(In reply to Maxim Monastirsky from comment #17)
> BTW, there are some odd things in this list, like calc_SYLK & calc_DIF which
> are text based formats, but listed together with binary formats. Any idea
> why it was done that way?

Not really. Maybe "binary" is used loosely, in the sense "if the format has a standard header, which can be used for detecting it, it is binary"? It would explain why T602 is in that section too.

> And by moving it below "generic_HTML", it should also be possible to avoid
> workarounds like the "calc_HTML" one.

Yes, likely.
We have just encountered a minor variation of this problem at a site where opening files with a .xls extension had been working flawlessly on an old version of LibreOffice. The upgrade has had, shall we say, deleterious effects.

The behaviour of LibreOffice 5.3.2.2 on Mint and openSUSE Leap 42, however, is erratic: sometimes LibreOffice opens the file with scalc, and other times it ignores the --calc switch specified on the command line.

The files generated are simple text files containing tab-delimited data from a single application on a Linux server. The number of columns can vary from file to file, but the basic structure is the same. The files are generated "on the fly" and LibreOffice is invoked by the application on demand; the user has no choice and does not have the facility to change the extension from .xls to .csv, i.e. the files are opened "for them".

The file formats are identified by file -b as either ISO-8859 text with CR line terminators or ASCII text with CR line terminators. However, changing the extension of the files identified as ASCII from .xls to .csv results in the file opening with Calc, yet the file type remains ASCII.

If it will help or contribute in any way to resolving what has now become a significant problem in the life of the users affected by this feature, example files of those that open in Calc and those that open in Writer can be posted.

Having read the explanation regarding filtering and parsing etc. (comment 8, I think), it just seems to me and to the users affected by this feature that when a user specifies the application to use (in this case Calc), it should be left to the user to override/determine the application. Does Microsoft Office decide to use Word when a user opens a file with Excel? Surely there must be a way to override LibreOffice and determine/decide what application is required.

Kind regards
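One possible override for an application that launches LibreOffice itself is to name the import filter explicitly with the `--infilter` command-line option instead of relying on detection. The sketch below only builds such a command line. The helper name `build_calc_command` is made up for illustration, "Text - txt - csv (StarCalc)" is assumed to be the appropriate filter name for tab-delimited text, and nobody in this thread has tested whether this bypasses the WordPerfect detection in the affected versions.

```python
import shlex

# Assumption: this is the internal name of Calc's text/CSV import filter.
CSV_FILTER = "Text - txt - csv (StarCalc)"


def build_calc_command(path, soffice="soffice"):
    """Return an argument list that asks LibreOffice to open `path` in Calc
    with an explicitly named import filter instead of auto-detection."""
    return [soffice, "--calc", f"--infilter={CSV_FILTER}", path]


def as_shell_line(path):
    """Render the same command as a shell-quoted string, e.g. for logging."""
    return shlex.join(build_calc_command(path))
```

The resulting list could be passed to `subprocess.run` by the generating application. Passing an argument list rather than a shell string avoids quoting problems with the spaces in the filter name.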
** Please read this message in its entirety before responding **

To make sure we're focusing on the bugs that affect our users today, LibreOffice QA is asking bug reporters and confirmers to retest open, confirmed bugs which have not been touched for over a year.

There have been thousands of bug fixes and commits since anyone checked on this bug report. During that time, it's possible that the bug has been fixed, or the details of the problem have changed. We'd really appreciate your help in getting confirmation that the bug is still present.

If you have time, please do the following:

Test to see if the bug is still present with the latest version of LibreOffice from https://www.libreoffice.org/download/

If the bug is present, please leave a comment that includes the information from Help - About LibreOffice.

If the bug is NOT present, please set the bug's Status field to RESOLVED-WORKSFORME and leave a comment that includes the information from Help - About LibreOffice.

Please DO NOT
- Update the version field
- Reply via email (please reply directly on the bug tracker)
- Set the bug's Status field to RESOLVED - FIXED (this status has a particular meaning that is not appropriate in this case)

If you want to do more to help you can test to see if your issue is a REGRESSION. To do so:
1. Download and install oldest version of LibreOffice (usually 3.3 unless your bug pertains to a feature added after 3.3) from http://downloadarchive.documentfoundation.org/libreoffice/old/
2. Test your bug
3. Leave a comment with your results.
4a. If the bug was present with 3.3 - set version to 'inherited from OOo';
4b. If the bug was not present in 3.3 - add 'regression' to keyword

Feel free to come ask questions or to say hello in our QA chat: https://kiwiirc.com/nextclient/irc.freenode.net/#libreoffice-qa

Thank you for helping us make LibreOffice even better for everyone!

Warm Regards,
QA Team

MassPing-UntouchedBug
The bug is still present.

LibreOffice info:
Version: 6.0.2.1
Build ID: f7f06a8f319e4b62f9bc5095aa112a65d2f3ac89
CPU threads: 4; OS: Mac OS X 10.13.4; UI render: default;
Locale: it-IT (it_IT.UTF-8); Calc: group
*** Bug 133282 has been marked as a duplicate of this bug. ***
The bug is still present; I just tested both attachments.

Version: 7.3.4.2 / LibreOffice Community
Build ID: 30(Build:2)
CPU threads: 8; OS: Linux 5.18; UI render: default; VCL: gtk3
Locale: it-IT (it_IT.UTF-8); UI: it-IT
SlackBuild for 7.3.4 by Eric Hameleers
Calc: threaded