The sal/ file abstraction is great, but has some slightly cumbersome uses that could be easily fixed with some more inline methods:
We have an osl::FileStatus class. A major use of this is detecting if a file is a File or a Directory (or a Link).
if( aStatus.getFileType() == osl::FileStatus::Directory )
bool bDirectory = fs.getFileType() == osl::FileStatus::Directory;
if ( aFileStatus.getFileType() == osl::FileStatus::Link )
etc. basic/source/runtime/methods.cxx even has a:
static inline sal_Bool isFolder( FileStatus::Type aType )
return ( aType == FileStatus::Directory || aType == FileStatus::Volume );
Which looks like a common programming error is hidden by this 'directory could be a folder and a mount point but then has type 'volume'.
To help this it'd be great to have some inline methods:
that essentially check the getFileType() value as appropriate, and then to do the cleanup across the code to use these.
Created attachment 58107 [details]
First version of proposed new functions
I started playing around in file.hxx, and put together these three funcs.
Do these look correct? I'm trying to get a build to test them but it seems I'm going to need a full-rebuild, because I just pulled.
If they look okay I'll write the others then commit + push. Then we can start using them in the code.
Looks good - I'd get the 'isDirectory' to check for 'isVolume' too by default - AFAICS a volume is a directory and not doing that is a bug.
Changing sal/ headers of course causes some big re-build :-) You'll prolly want to add a "Since LibreOffice 3.6" tag or somesuch in the documentation (hunt around for examples).
Otherwise that's great, though of course changing the code to use it is the big thing :-)
What about isFile? Should that return true for links too? They mostly behave like files, and the caller can always use isLink or the old methods
Good question, I -guess- that most 'isFile' calls that return false for linked files are sucky bugs that will bust symlinks ;-)
Perhaps we need 'isRegularFile()' ;-) as a name instead of isFile - to denote that it is not a link.
Anything else would be slower - walking an entire stat chain, and coping with broken / circular symlinks etc. is a bit of a pain really :-) And of course links can point to directories or files - so ... I'd just go with the rename above :-)
- directories and volumes
- files and links
- I didn't think about link-to-directory
- that probably means this code has a logic error
- maybe this should be changed or removed
- a regular file
- a link
(In reply to comment #6)
> - files and links
> - I didn't think about link-to-directory
> - that probably means this code has a logic error
> - maybe this should be changed or removed
Yes, please remove.
yep - it's sad that the interface has to have this unfortunate 'isLink' ambiguity that rather complicates symlinked situations, at least the getLinkTargetURL() is a fully resolved path for this case - ie. following all the symlinks down to the bottom via 'realpath'.
Which makes me wonder - why do we have salhelper/inc/salhelper/LinkResolver - which is substantially un-used, and un-necessary for Unix [ cf. the above ].
I'd be eager to bin / deprecate / write-that out.
The isDirectory method creates the obvious cleanup in basic/ eg. of removing:
basic/source/runtime/methods.cxx-static inline sal_Bool isFolder( FileStatus::Type aType )
But - of course isDirectory is also imprecise in this case, so all this code breaks for symlinked directories as well as symlinked files :-)
git grep 'FileStatus.*Link' | nl
shows we have only 43 sites checking for and handling Links properly of which 20 are in sal & unit tests ;-)
git grep 'getFileStatus' | nl
shows 187 hits (100 in sal) ... so - in general we get link handling wrong it seems in a large majority of cases.
So - we should fix this by default in our API. IMHO Link handling is a minority use-case, primarily useful on writing files - to make sure we get the semantics of unlink-replace vs. write-through-link right.
But of course we do a lot more reading than writing, and ~all directory / file traversing code is broken.
Ergo - I think we need to tweak getFileStatus to do a better job by default.
I'll attach a patch; I think we need to go around checking each of those 87+ getFileStatus calls, to see if they are non-link-handling, and add a 'true' parameter if they are not.
Then (I guess) we don't need to worry about the Link type being returned it never will be. Concrete, actionable thoughts / improvements ?
Created attachment 58227 [details]
>> - files and links
>> - I didn't think about link-to-directory
>> - that probably means this code has a logic error
>> - maybe this should be changed or removed
> Yes, please remove.
Comment on attachment 58227 [details]
Review of attachment 58227 [details]:
Looks fine to me, although I don't know this code as well as you probably do.
Deteted "Easyhack" from summary
fairly sure this ended up in master. I see something very like it anyway. I think if there's something remaining to be done a follow up bug is the way to go cause I'm confused :-)
Comment on attachment 58107 [details]
First version of proposed new functions
i'll mark the first patch as "obsolete" as it has been integrated,
so that it doesn't show up in bugzilla searches
Migrating Whiteboard tags to Keywords: (EasyHack,DifficultyBeginner,SkillCpp,TopicCleanup)