Listing all the files under a directory recursively, using Pipes


I finished reading the Pipes tutorial, and I wanted to write a function to list all the files in a directory, recursively. I tried with the following code:

enumFiles :: FilePath -> Producer' FilePath (PS.SafeT IO) ()
enumFiles path =
  PS.bracket (openDirStream path) (closeDirStream) loop
  where
    loop :: DirStream -> Producer' FilePath (PS.SafeT IO) ()
    loop ds = PS.liftBase (readDirStream ds) >>= checkName
      where
        checkName :: FilePath -> Producer' FilePath (PS.SafeT IO) ()
        checkName ""   = return ()
        checkName "."  = loop ds
        checkName ".." = loop ds
        checkName name = PS.liftBase (getSymbolicLinkStatus newPath)
                         >>= checkStat newPath
          where newPath = path </> name

        checkStat path stat
          | isRegularFile stat = yield path >> loop ds
          | isDirectory stat = enumFiles path
          | otherwise = loop ds

However, this producer terminates as soon as the first return () is reached, so anything left in the parent directories is never listed. I guess I'm not composing it correctly, but I can't see what the right way to do this is.


Solution

  • Simply change this line:

    | isDirectory stat = enumFiles path
    

    to

    | isDirectory stat = enumFiles path >> loop ds
    

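    The directory case was missing the call back to loop ds. After enumFiles path finished listing the subdirectory, nothing resumed reading the remaining entries of the parent directory, so the whole producer ended at the first return ().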
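
    To see the effect in isolation, here is a minimal sketch that swaps the real directory stream for a hypothetical in-memory Entry type (Entry, walkBuggy, walkFixed and sample are illustrative names, not part of the question's code). It assumes only the pipes package and shows the same shape of bug and fix:

    ```haskell
    import Pipes
    import qualified Pipes.Prelude as P

    -- Hypothetical in-memory stand-in for a directory tree (an assumption
    -- made for illustration; the real code reads a DirStream instead).
    data Entry = File String | Dir String [Entry]

    -- Same shape as the original code: after descending into a directory,
    -- the remaining siblings are dropped.
    walkBuggy :: Monad m => [Entry] -> Producer String m ()
    walkBuggy []                   = return ()
    walkBuggy (File name : rest)   = yield name >> walkBuggy rest
    walkBuggy (Dir _ children : _) = walkBuggy children

    -- Same shape as the fix: descend, then carry on with the siblings.
    walkFixed :: Monad m => [Entry] -> Producer String m ()
    walkFixed []                      = return ()
    walkFixed (File name : rest)      = yield name >> walkFixed rest
    walkFixed (Dir _ children : rest) = walkFixed children >> walkFixed rest

    sample :: [Entry]
    sample = [File "a", Dir "d" [File "b"], File "c"]

    buggyResult, fixedResult :: [String]
    buggyResult = P.toList (walkBuggy sample)  -- ["a","b"]: "c" is lost
    fixedResult = P.toList (walkFixed sample)  -- ["a","b","c"]
    ```

    The file "c" comes after the directory "d", so it only shows up once the directory case continues with the rest of the listing.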

    You can also break this producer up into a composition of smaller producers and pipes:

    {-# LANGUAGE RankNTypes #-}
    
    module Main where
    
    import qualified Pipes.Prelude as P
    import qualified Pipes.Safe as PS
    
    import           Control.Monad
    import           Pipes
    import           System.FilePath.Posix
    import           System.Posix.Directory
    import           System.Posix.Files
    
    readDirStream' :: FilePath -> Producer' FilePath (PS.SafeT IO) ()
    readDirStream' dirpath =
      PS.bracket (openDirStream dirpath) closeDirStream (forever . loop)
      where
        loop stream =
          liftIO (readDirStream stream) >>= yield
    
    enumFiles :: FilePath -> Producer' FilePath (PS.SafeT IO) ()
    enumFiles path =
      readDirStream' path
        >-> P.takeWhile (/= "")
        >-> P.filter (not . flip elem [".", ".."])
        >-> P.map (path </>)
        >-> forever (do
                        entry <- await
                        status <- liftIO $ getSymbolicLinkStatus entry
                        when (isDirectory status) (enumFiles entry)
                        when (isRegularFile status) (yield entry))
    
    main :: IO ()
    main =
      PS.runSafeT $ runEffect (enumFiles "/tmp" >-> P.stdoutLn)
    

    I find it's often helpful to use forever from Control.Monad or one of the combinators from Pipes.Prelude instead of manual recursion; it helps cut down on small slips like this one. However, as the kids say, your mileage may very well vary.
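
    As a rough illustration of that point, the sketch below walks a hypothetical in-memory tree (the Entry type and the names walk, walkAll, visible and sample are made up for this example) using for, each and P.filter from pipes instead of hand-written recursion over the list of siblings. With the combinators driving the iteration there is no "continue with the rest" step to forget:

    ```haskell
    import Pipes
    import qualified Pipes.Prelude as P

    -- Hypothetical in-memory stand-in for a directory tree (illustration only).
    data Entry = File String | Dir String [Entry]

    -- 'for (each children) walk' visits every child in turn, so no explicit
    -- call is needed to move on to the next sibling.
    walk :: Monad m => Entry -> Producer String m ()
    walk (File name)      = yield name
    walk (Dir _ children) = for (each children) walk

    walkAll :: Monad m => [Entry] -> Producer String m ()
    walkAll entries = for (each entries) walk

    -- Filtering out "." and ".." as a separate pipeline stage, in the same
    -- spirit as the P.filter step in the answer's enumFiles.
    visible :: Monad m => [String] -> Producer String m ()
    visible names = each names >-> P.filter (`notElem` [".", ".."])

    sample :: [Entry]
    sample = [File "a", Dir "d" [File "b"], File "c"]

    allFiles, shown :: [String]
    allFiles = P.toList (walkAll sample)             -- ["a","b","c"]
    shown    = P.toList (visible [".", "..", "x"])   -- ["x"]
    ```

    This is only a pure analogue: the real enumFiles still needs PS.bracket and liftIO for the directory stream, but the control flow can be delegated to combinators in the same way.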