I am trying to read the first line of each file in the current directory:
import System.IO (IOMode(ReadMode), withFile, hGetLine)
import System.Directory (getDirectoryContents, doesFileExist, getFileSize)
import System.FilePath ((</>))
import Control.Monad (filterM)

readFirstLine :: FilePath -> IO String
readFirstLine fp = withFile fp ReadMode System.IO.hGetLine

getAbsoluteDirContents :: String -> IO [FilePath]
getAbsoluteDirContents dir = do
    contents <- getDirectoryContents dir
    return $ map (dir </>) contents

main :: IO ()
main = do
    -- get a list of all files & dirs
    contents <- getAbsoluteDirContents "."
    -- filter out dirs
    files <- filterM doesFileExist contents
    -- read first line of each file
    d <- mapM readFirstLine files
    print d
It compiles and runs, but it aborts with the following error when it reaches a binary file:
mysrcfile: ./aBinaryFile: hGetLine: invalid argument (invalid byte sequence)
I want to detect and skip such files and move on to the next file.
A binary file is a file that contains byte sequences that cannot be decoded to a valid string. Without inspecting its content, however, a binary file is no different from a text file.
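To illustrate what "inspecting the content" could look like, here is a minimal sketch of one common heuristic: read the first bytes of the file and treat it as binary if a NUL byte appears. The name looksBinary and the 1024-byte window are assumptions made for this example, not part of the question's code, and the check is only a guess: UTF-16 text contains NUL bytes, and a binary file may not have one in its first kilobyte.

```haskell
import qualified Data.ByteString as BS
import System.IO (IOMode(ReadMode), withBinaryFile)

-- Hypothetical helper: guess whether a file is binary by looking for a
-- NUL byte in (at most) its first 1024 bytes. This is a heuristic only.
looksBinary :: FilePath -> IO Bool
looksBinary fp = withBinaryFile fp ReadMode $ \h -> do
    chunk <- BS.hGet h 1024      -- raw bytes, no text decoding involved
    return (BS.elem 0 chunk)     -- True if any byte in the chunk is 0
```

Such a check could be combined with filterM to drop suspicious files before reading them, at the cost of opening every file twice.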
It might be better to use an "it's Easier to Ask Forgiveness than Permission" (EAFP) approach: we try to read the first line, and if that fails, we discard the result for that file.
import Control.Exception (catch, IOException)
import System.IO (IOMode(ReadMode), withFile, hGetLine)

readFirstLine :: FilePath -> IO (Maybe String)
readFirstLine fp = withFile fp ReadMode $
    \h -> (catch (fmap Just (hGetLine h))
        ((const :: a -> IOException -> a) (return Nothing)))
For a FilePath this returns an IO (Maybe String). If we run the IO (Maybe String), it will return a Just x, with x the first line, if it can read the file, and Nothing if an IOException was encountered.
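The same idea can also be sketched with try from Control.Exception, which returns the outcome as an Either instead of taking a handler. The name readFirstLineTry is made up for this example. Note one difference that is an assumption about what you want: because try wraps the whole withFile call here, a failure to open the file (for example a missing file) also becomes Nothing.

```haskell
import Control.Exception (IOException, try)
import System.IO (IOMode(ReadMode), hGetLine, withFile)

-- Hypothetical variant of readFirstLine built on try.
-- Any IOException (decoding error, empty file, failure to open)
-- is mapped to Nothing; a successful read is wrapped in Just.
readFirstLineTry :: FilePath -> IO (Maybe String)
readFirstLineTry fp = do
    result <- try (withFile fp ReadMode hGetLine) :: IO (Either IOException String)
    return (either (const Nothing) Just result)
```

Only IOException is caught, so other kinds of exceptions still propagate as before.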
We can then make use of catMaybes :: [Maybe a] -> [a] to keep only the x values wrapped in a Just:
import Data.Maybe (catMaybes)

main :: IO ()
main = do
    -- get a list of all files & dirs
    contents <- getAbsoluteDirContents "."
    -- filter out dirs
    files <- filterM doesFileExist contents
    -- read first line of each file
    d <- mapM readFirstLine files
    print (catMaybes d)
Alternatively, you can use mapMaybeM :: Monad m => (a -> m (Maybe b)) -> [a] -> m [b] from the extra package (on Hackage), which automates that work for you.
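As a rough sketch of what that helper does, the block below defines a base-only stand-in, here called mapMaybeM' (a hypothetical name, chosen so it does not clash with the real function in Control.Monad.Extra), and uses it to merge the mapM and catMaybes steps into one. readFirstLine is repeated from above so that the block is self-contained, and firstLines is likewise just an illustrative name.

```haskell
import Control.Exception (IOException, catch)
import Data.Maybe (catMaybes)
import System.IO (IOMode(ReadMode), hGetLine, withFile)

-- Same behaviour as readFirstLine above: Nothing if the line cannot be read.
readFirstLine :: FilePath -> IO (Maybe String)
readFirstLine fp = withFile fp ReadMode $ \h ->
    fmap Just (hGetLine h) `catch` handler
  where
    handler :: IOException -> IO (Maybe String)
    handler _ = return Nothing

-- Hypothetical stand-in for mapMaybeM from the extra package:
-- run the action on every element and keep only the Just results.
mapMaybeM' :: Monad m => (a -> m (Maybe b)) -> [a] -> m [b]
mapMaybeM' f xs = fmap catMaybes (mapM f xs)

-- First lines of all files that could be read, skipping the rest.
firstLines :: [FilePath] -> IO [String]
firstLines = mapMaybeM' readFirstLine
```

With extra installed, importing mapMaybeM from Control.Monad.Extra and dropping the local definition should give the same result, so main reduces to calling firstLines on the filtered file list and printing it.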