I am currently going through the book Real World Haskell and one exercise from this book asks the reader to implement file name matching with the use of `**`, which is the same as `*`, but also looks in subdirectories all the way down in the file system. Below is a fragment of my code with comments (there is a lot of duplication at the moment) and further down you can find additional info about the code. I think that the posted code is sufficient for the problem and there is no need to list the whole program here.
    case splitFileName pat of
        ("", baseName) -> do -- just the file name passed
            curDir <- getCurrentDirectory
            if searchSubDirs baseName -- check if file name has `**` in it
                then do
                    contents <- getDirectoryContents curDir
                    subDirs <- filterM doesDirectoryExist contents
                    let properSubDirs = filter (`notElem` [".", ".."]) subDirs
                    subDirsNames <- forM properSubDirs $ \dir -> do
                        namesMatching (curDir </> dir </> baseName) -- call the function recursively on subdirectories
                    curDirNames <- listMatches curDir baseName -- list matches in the current directory
                    return (curDirNames ++ (concat subDirsNames)) -- concatenate results into a single list
                else listMatches curDir baseName
        (dirName, baseName) -> do -- full path passed
            if searchSubDirs baseName
                then do
                    contents <- getDirectoryContents dirName
                    subDirs <- filterM doesDirectoryExist contents
                    let properSubDirs = filter (`notElem` [".", ".."]) subDirs
                    subDirsNames <- forM properSubDirs $ \dir -> do
                        namesMatching (dirName </> dir </> baseName) -- call the function recursively on subdirectories
                    curDirNames <- listMatches dirName baseName -- list matches in the passed directory
                    return (curDirNames ++ (concat subDirsNames)) -- concatenate results into a single list
                else listMatches dirName baseName -- no `**`: match in the passed directory only
`pat` is the pattern I'm looking for (e.g. `*.txt` or `C:\\A\[a-z].*`).

`splitFileName` is a function which splits a file path into the directory path and the file name. The first element of the tuple will be empty if we specify just a file name in `pat`.

`searchSubDirs` returns `True` if the file name has `**` in it.

`listMatches` returns a list of file names that match the pattern in the directory, substituting `**` for `*`.

`namesMatching` is the name of the function whose excerpt I posted.
When I pass just the file name, the program searches for it only in the current directory and first level of subdirectories. When I pass a full path, it searches only in the specified directory. It looks like case `(dirName, baseName)` doesn't properly recurse. I've been looking at the code for some time now and I can't figure out where the problem is.
If any more information is needed, please let me know in the comments and I'll add whatever is necessary to the question.
Here's an issue:
    contents <- getDirectoryContents dirName
    subDirs <- filterM doesDirectoryExist contents
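`getDirectoryContents` returns only the leaf names of the entries, not full paths, so you have to prepend `dirName` (along with a `/`, for example via `</>`) to the elements of `contents` before calling `doesDirectoryExist`. Otherwise the bare names are resolved relative to the current working directory, which would explain why only the first level below the current directory is ever found.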
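
One way to apply this is a small helper that checks each entry under its full path. This is only a minimal sketch: `listSubDirs` is a hypothetical name that is not part of the original program, and it assumes the `directory` and `filepath` packages the question already appears to use.

```haskell
import Control.Monad (filterM)
import System.Directory (doesDirectoryExist, getDirectoryContents)
import System.FilePath ((</>))

-- | Leaf names of the immediate subdirectories of 'dir'.
-- Hypothetical helper; it returns leaf names (not full paths) so the
-- caller can still build  dir </> name </> baseName  for the recursion.
listSubDirs :: FilePath -> IO [FilePath]
listSubDirs dir = do
    contents <- getDirectoryContents dir
    -- Drop the special entries first so we never recurse into them.
    let candidates = filter (`notElem` [".", ".."]) contents
    -- Test the full path: a bare name would be resolved against the
    -- current working directory instead of 'dir'.
    filterM (\name -> doesDirectoryExist (dir </> name)) candidates
```

With something like this, the three lines that compute `properSubDirs` in each branch could become `properSubDirs <- listSubDirs dirName` (or `listSubDirs curDir`), while the recursive call `namesMatching (dirName </> dir </> baseName)` stays as it is. It would also remove some of the duplication between the two branches.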