Bug 133789 - BASIC function DIR does not deliver correct result list
Status: NEW
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: BASIC
Version (earliest affected): 5.4 all versions
Hardware: All Windows (All)
Importance: medium normal
Assignee: Not Assigned
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: Help Macro
Reported: 2020-06-08 10:49 UTC by joachim
Modified: 2020-11-30 10:29 UTC (History)
5 users (show)

See Also:
Crash report or crash signature:
Regression By:


Attachments
Basic macro creates 4 files in path actual running and lists DIR result in spreadsheet (17.23 KB, application/vnd.oasis.opendocument.spreadsheet)
2020-06-08 10:55 UTC, joachim
enhanced example (Tools-Library activated) (15.03 KB, application/vnd.oasis.opendocument.spreadsheet)
2020-10-19 15:11 UTC, joachim
VBA code sample (722 bytes, text/plain)
2020-11-16 16:26 UTC, Alain Romedenne
LibO Basic code sample (861 bytes, text/plain)
2020-11-16 16:26 UTC, Alain Romedenne

Description joachim 2020-06-08 10:49:48 UTC
Description:
Using the DIR function with placeholder characters (* or ?) results in a CASE-SENSITIVE file search. E.g. a search string of 'T*' (capital T) will list a file named 'Test1.txt' but not the file 'test2.txt' (because of the lower-case t).
When searching for a file name without placeholder characters, the function returns (if a matching file exists) the given search string rather than the real file name: the search string 'TEST4.TXT' results in 'TEST4.TXT' even if the file name on the system is 'test4.txt'.
The same happens when searching for directories (using 16 as the second parameter).
The error occurs with URL notation as well as with Windows path notation.
The error exists in version 6.3.6.2 and already existed in 2016 (version ?).
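
The difference between the two matching modes can be modelled outside LibreOffice. The Python sketch below is not the LibreOffice implementation; it only contrasts case-sensitive and case-insensitive wildcard matching on the two file names from the description, using the standard `fnmatch` module. The helper name `match_names` is hypothetical.

```python
from fnmatch import fnmatchcase


def match_names(names, pattern, case_sensitive):
    """Return the names matching a * / ? wildcard pattern.

    Hypothetical helper: models the matching only, not LibreOffice's Dir.
    """
    if case_sensitive:
        return [n for n in names if fnmatchcase(n, pattern)]
    # Case-insensitive: compare lower-cased name against lower-cased pattern.
    return [n for n in names if fnmatchcase(n.lower(), pattern.lower())]


names = ["Test1.txt", "test2.txt"]
sensitive = match_names(names, "T*", case_sensitive=True)
insensitive = match_names(names, "T*", case_sensitive=False)
print(sensitive)    # models the behaviour described in this report
print(insensitive)  # models the behaviour the reporter expects
```

With case-sensitive matching only 'Test1.txt' is returned; with case-insensitive matching both names are returned.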

Steps to Reproduce:
1. see the attached BASIC-Macro embedded in a CALC-file

Actual Results:
DIR-function produces file-list with missing file-name

Expected Results:
DIR-function should produce file-list independent of upper- / lower-case characters


Reproducible: Always


User Profile Reset: No



Additional Info:
No restriction regarding upper/lower case or case sensitivity is mentioned in the help text of the DIR function.
Some sources mention that a placeholder can only be the last character of a search string. But this seems to be nonsense; at least, the error occurs even if the placeholder is the last character.
Comment 1 joachim 2020-06-08 10:55:13 UTC
Created attachment 161760 [details]
Basic macro creates 4 files in path actual running and lists DIR result in spreadsheet
Comment 2 Buovjaga 2020-10-19 13:00:33 UTC
I tried running the macro in the file and got this error:

BASIC runtime error.
Sub-procedure or function procedure not defined.

pointing to line 16:
PFAD = DirectoryNameoutofPath(URL, "/")

Please help and give more specific instructions.

Set to NEEDINFO.
Change back to UNCONFIRMED after you have provided the information.
Comment 3 joachim 2020-10-19 15:11:00 UTC
Created attachment 166514 [details]
enhanced example (Tools-Library activated)

Sorry, the activation of the LO Tools library was missing.
Comment 4 Buovjaga 2020-10-19 15:38:15 UTC
Ok, now it runs, but I'm not sure what the point is. Please explain in detail what we should be looking at, what is the bad result etc.
Comment 5 joachim 2020-10-19 15:55:34 UTC
Please read the text in the spreadsheet. The comment beneath the green zero says: test-case 0 delivers a correct result.
The output of test-case 0 is shown in the range A2:C8, with the DIR parm '\' (B2).

Test-cases 1 to 5 give wrong results. Depending on the DIR parm (B9:B17), the DIR output is incomplete in each case: in every test-case some files are not found. Obviously DIR works CASE-SENSITIVE with the given DIR parms.

Test-case 6 finds the file the DIR parm is asking for. But the result is not the true file name as stored on the storage device; it is the parm string itself.
Comment 6 Buovjaga 2020-10-19 16:02:15 UTC
My result after running the macro:
C2 
C3 vboxshare\TESTdatei03.TXT
C4 vboxshare\TESTDATEI02.txt
C5 vboxshare\TESTdatei01.txt
C6 vboxshare\testdatei04.txt
C7 vboxshare\TESTDATEI02.txt
C8 vboxshare\TESTdatei01.txt
C9 
C10 vboxshare\TESTDATEI02.txt
C11 vboxshare\TESTdatei01.txt
C12 
C13 testdatei04.txt
C14 TESTdatei01.txt
C15 TESTDATEI02.txt
C16 
C17 TESTdatei01.txt
C18 TESTDATEI02.txt
C19 TestDATEI04.TxT

Is it in line with the badness you see?

Btw. you forgot your instructions in the file are in German and not everyone knows it.

Arch Linux 64-bit
Version: 7.1.0.0.alpha0+
Build ID: ccdb78773ac6c9d19140e8084f37cc2c7f06240e
CPU threads: 8; OS: Linux 5.8; UI render: default; VCL: kf5
Locale: fi-FI (fi_FI.UTF-8); UI: en-US
Calc: threaded
Built on 18 October 2020
Comment 7 joachim 2020-10-19 16:59:56 UTC
    ---------- the according DIR calling parameters are shown in column B
    v
    v
        My result after running the macro:
               C2                                 <--- I wonder about the empty C2
0   \          C3 vboxshare\TESTdatei03.TXT
               C4 vboxshare\TESTDATEI02.txt
               C5 vboxshare\TESTdatei01.txt
               C6 vboxshare\testdatei04.txt
C3 to C6 show the correct number of files with correct names using upper- and lower-case letters
using the path name only <\> as calling parm

1   \T*        C7 vboxshare\TESTDATEI02.txt
               C8 vboxshare\TESTdatei01.txt
               C9                                <--- I wonder about the empty C9 
the calling parm shown in C13 <\T*> should find the same files as are found in test-case 0 using <\> as calling parm. But only 2 instead of 4 files are listed 
2  \t*.txt     C10 vboxshare\TESTDATEI02.txt
               C11 vboxshare\TESTdatei01.txt
               C12                               <--- I wonder about the empty C12  
with the calling parm <\t*.txt> the DIR function should find all 4 files. But only 2 are found 

I am not sure how to allocate the following results to the test-cases Nos. 3 - 5
C13 testdatei04.txt
C14 TESTdatei01.txt
C15 TESTDATEI02.txt
C16 
C17 TESTdatei01.txt
C18 TESTDATEI02.txt

6  \TestDATEI04.TxT   C19 TestDATEI04.TxT
yes, the file is found, but the result shown by DIR is the string of the calling-parm in 
mixed upper-/lower-case writing and not the name of the object in the storage which is a 
lower-case only name
########################################
on my Win 10 system the result looks as follows:
---------------------------------------------------
A   B                   C
---------------------------------------------------
0	\	                .~lock.LOtestDIR.ods#
		                LOtestDIR.ods       <-- the application itself
		                TESTdatei01.txt
		                TESTDATEI02.txt
		                TESTdatei03.TXT
		                testdatei04.txt
		                Thumbs.db               <--system file of the WIN10-system
1	\T*	                TESTdatei01.txt
		                TESTDATEI02.txt
		                TESTdatei03.TXT
		                Thumbs.db
2	\t*.txt	            testdatei04.txt
3	\T*.txt	            TESTdatei01.txt
		                TESTDATEI02.txt
4	\testDATEI0?.txt                           <-- erroneously no file is found	
5	\TEST*0?.txt	    TESTdatei01.txt
		                TESTDATEI02.txt
6	\TestDATEI04.TxT	TestDATEI04.TxT
##########################################################
As far as I can see, your results are incomplete and my results are incomplete.
Curiously, there seem to be different errors, see test-case 2: your system finds 2 files of 4
and my system finds only 1 file, and that file is not shown in your list.
Comment 8 Buovjaga 2020-10-19 18:21:16 UTC
Alain: any comments on this script puzzle?
Comment 9 joachim 2020-10-20 09:24:59 UTC
I fear my example is too complicated.
Here is an attempt to describe the problem in simple words.
File names on a Windows system may contain lower- and upper-case letters, even mixed.
Thus valid names are e.g.: TEST01.TXT as well as TEST02.txt as well as tesT03.TxT
If you look for files on your system using the DIR function with a
Parm-1 string    <path\t*.txt>
then the DIR function does NOT list all 3 of the above-named files. <----- ERROR
If you search with    <path\>
then the DIR function will find all 3 files.
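
The expected result for this simplified case can be sketched in Python (not LibreOffice Basic, and not how Dir is implemented): every entry is enumerated, as the bare `<path\>` call is reported to do, and the pattern is then applied without regard to case. The helper `list_matching` and the temporary directory are assumptions made for illustration only.

```python
import os
import tempfile
from fnmatch import fnmatchcase


def list_matching(directory, pattern):
    """List entries whose names match `pattern`, ignoring case.

    Hypothetical helper: enumerate everything, then filter ourselves.
    """
    wanted = pattern.lower()
    return sorted(
        name for name in os.listdir(directory)
        if fnmatchcase(name.lower(), wanted)
    )


with tempfile.TemporaryDirectory() as tmp:
    # The three example names from the comment above.
    for name in ("TEST01.TXT", "TEST02.txt", "tesT03.TxT"):
        open(os.path.join(tmp, name), "w").close()
    found = list_matching(tmp, "t*.txt")

print(found)
```

All three files are listed, which is the result the reporter expects from `<path\t*.txt>`.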
Comment 10 Alain Romedenne 2020-11-16 16:26:00 UTC
Created attachment 167334 [details]
VBA code sample
Comment 11 Alain Romedenne 2020-11-16 16:26:35 UTC
Created attachment 167335 [details]
LibO Basic code sample
Comment 12 Alain Romedenne 2020-11-16 16:31:11 UTC
Observations made under Windows 10: 

When using the Dir function, I do not see runtime differences, regarding the use of filename patterns, between LibreOffice Basic and VBA compatibility mode.

However, Joachim's statement is correct: the LO Basic Dir function is case-sensitive while the M$-VBA Dir function is case-INsensitive.

We may:
1. Keep Dir function unchanged for backward compatibility and
   document this LO/M$ difference in behavior within Dir function help page
2. Have a Dir function behaving differently when VBA compatibility is set.
3. Add an extra optional third argument in Dir Function, similar to Replace function 'compare' argument.

The filename pattern syntax offered by the Basic Dir function is much poorer than what regular expressions provide. I would leave filename pattern matching to the latter, and retain case-sensitivity, since it comes with all 'recent' operating systems.
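
To make option 3 and the regular-expression idea concrete, here is a minimal Python sketch. It is an illustration only: the function `dir_filter` and its `ignore_case` parameter are hypothetical stand-ins for an extra optional Dir argument, and nothing here reflects an existing LibreOffice API. The file names are the four test files from the attached spreadsheet.

```python
import re


def dir_filter(names, regex, ignore_case=False):
    """Filter names with a regular expression.

    `ignore_case` is a hypothetical flag standing in for the proposed
    optional third argument; the default keeps matching case-sensitive.
    """
    flags = re.IGNORECASE if ignore_case else 0
    compiled = re.compile(regex, flags)
    return [n for n in names if compiled.fullmatch(n)]


names = ["TESTdatei01.txt", "TESTDATEI02.txt", "TESTdatei03.TXT", "testdatei04.txt"]
pattern = r"t.*\.txt"  # regex counterpart of the wildcard t*.txt
strict = dir_filter(names, pattern)
relaxed = dir_filter(names, pattern, ignore_case=True)
print(strict)
print(relaxed)
```

Keeping the default case-sensitive preserves the current behaviour for existing macros, while callers who want the VBA-like result opt in explicitly.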

I vote for option 1, while leaving place for supplemental observations and different opinions. 

PS: I add Documentation and Macros META bugs references
Comment 13 joachim 2020-11-16 19:58:13 UTC
As far as I can see, option (1) is not very helpful. On Windows systems the naming of file objects was and is case-INsensitive. What is the purpose of using the DIR function? Normally you want to find one or more files - but in any case ALL of them - whose exact names you do NOT know.
The mentioned "backward compatibility" is not useful, as this compatibility in fact simply hides a long-standing error. And I see no disadvantage for existing applications if the DIR function showed a better/correct result in future.
Comment 14 Andreas Heinisch 2020-11-17 10:35:40 UTC
Maybe there is even a fourth option: on Windows the Dir function is case-insensitive, whereas on all other systems the function searches file names case-sensitively.
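
A rough Python sketch of such a platform-dependent policy follows. The function names are hypothetical, the platform strings are those used by Python's `sys.platform`, and the sketch ignores any per-directory case-sensitivity settings.

```python
import sys
from fnmatch import fnmatchcase


def is_case_sensitive(platform=sys.platform):
    """Hypothetical policy: case-insensitive on Windows, case-sensitive elsewhere."""
    return not platform.startswith("win")


def matches(name, pattern, platform=sys.platform):
    """Apply a * / ? wildcard pattern according to the platform policy."""
    if is_case_sensitive(platform):
        return fnmatchcase(name, pattern)
    return fnmatchcase(name.lower(), pattern.lower())


on_windows = matches("testdatei04.txt", "T*", platform="win32")
on_linux = matches("testdatei04.txt", "T*", platform="linux")
print(on_windows, on_linux)
```

Under this policy the same macro would list 'testdatei04.txt' for 'T*' on Windows but not on Linux.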
Comment 15 himajin100000 2020-11-17 11:56:42 UTC
Do we need to test against this one too?

https://www.howtogeek.com/354220/how-to-enable-case-sensitive-folders-on-windows-10/

btw, still almost nothing about its algorithm has sunk into my mind yet, but here is a possible code pointer:

https://opengrok.libreoffice.org/xref/core/tools/source/fsys/wldcrd.cxx?r=263e04b6#45

https://opengrok.libreoffice.org/xref/core/basic/source/runtime/methods.cxx?r=93c64a61#2771
Comment 16 Alain Romedenne 2020-11-17 14:08:54 UTC
Joachim,
Please note that I tagged this bug as New, meaning it's accepted. Finding a way to fix it may require extra analysis, followed by a volunteer developer taking ownership of it.

---

Here's a link to filename patterns for Dir VBA function, and the possible wildcard characters:
https://trumpexcel.com/excel-wildcard-characters/

However, I ran a few extra short tests in order to examine how VBA behaves in the case of diacritics, such as accented characters. VBA seems to ignore them:

E or e are equivalent - while è é ë É Ë È are omitted
N or n are equivalent - while ñ Ñ are omitted

This VBA behaviour predates Western-language code pages and most certainly affects solely the first 128 characters. Extra specifications may be of interest for languages not based on Latin characters, such as Greek, Arabic, Russian, Chinese and so on.
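
One possible reading of these observations is that case folding is limited to the ASCII letters A-Z. The Python sketch below models that reading only; whether VBA actually works this way is an assumption, and the helper names are hypothetical.

```python
import string

# Translation table that lower-cases only A-Z and leaves everything else alone.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_fold(text):
    """Lower-case ASCII letters only; accented letters stay unchanged."""
    return text.translate(_ASCII_LOWER)


def equal_ascii_ci(a, b):
    """Compare two strings, ignoring case for ASCII letters only."""
    return ascii_fold(a) == ascii_fold(b)


plain = equal_ascii_ci("E", "e")
accented = equal_ascii_ci("É", "é")
unicode_aware = "É".lower() == "é"
print(plain, accented, unicode_aware)
```

Under ASCII-only folding, 'E' and 'e' compare equal while 'É' and 'é' do not, although a Unicode-aware lower-casing treats them as the same letter.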