
String concatenation with each line in a text file in Haskell


I'm trying to write a program that reads a given text file as input and outputs the same text with the length of each line appended to that line.

I have created a text file with a single String per line for this purpose. I've managed to write code that gets a single line from the text file and outputs that line with its length shown as part of it, but I couldn't write a recursive version that keeps doing this for every line of the text file until there are no more lines left. It wouldn't have been a problem had I been working with lists, but I can't use pattern matching on a text file full of Strings.

I just need to get the second code sample to apply itself to the entire list, but I can't. How can I change my code so that it'd work, without using functors/fmap? I'm really sorry for this stupid question, I'm fairly new to programming.

import System.IO
main :: IO()
main = do
   file <- openFile ".txt" ReadMode
   x <- hGetContents file >>=
   xs <- hGetLine 
   if null xs 
    then return ()
    else do
        putStrLn $ xs ++ " has a length of " ++ show (length xs)
   hClose file

or

main :: IO()
main = do
file <- openFile ".txt" ReadMode
x <- hGetLine file
xs <- hGetLine file
if null x 
  then return ()
  else do 
      putStrLn $ x ++ " has a length of " ++ show (length x)
      putStrLn $ xs ++ " has a length of " ++ show (length xs)
 hClose file

Solution

  • You seem to want a recursive solution. But the main action is not well suited to call itself recursively, as it has specific duties that happen once: opening and closing the file.

    So you need a separate recursive action, which assumes that file management is done by some layer above, and deals just with a pre-cooked file handle. Say we'll call it processFileHandle.

    It could have a type signature like this:

    processFileHandle :: Handle -> IO ()  -- for now
    

    But hold on a second! We have this text transformation to do:

    xs ++ " has a length of " ++ show (length xs)
    

    Do we want to hardwire this sort of code into our processFileHandle function? Absolutely not! We want to separate text processing from file I/O. That way, we avoid rewriting processFileHandle every time the line transformation changes.

    So a much better type signature is:

    processFileHandle :: (String -> String) ->  Handle -> IO ()
    

    We provide the line transformation as an extra functional argument. In our case, this is:

    transformLine1 :: String -> String
    transformLine1 str =
        let  ln = length str
        in   str ++ " has a length of " ++ (show ln)
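
Because transformLine1 is a pure function, it can be tried out on its own before any file handling exists. The following is a minimal sketch; sampleOutput is a hypothetical name introduced here for illustration and is not part of the original answer.

```haskell
-- Pure line transformation, as defined in the answer.
transformLine1 :: String -> String
transformLine1 str =
    let ln = length str
    in  str ++ " has a length of " ++ show ln

-- Hypothetical helper (not in the original answer): apply the
-- transformation to a couple of lines held in memory.
sampleOutput :: [String]
sampleOutput = map transformLine1 ["alpha", "beta"]
-- sampleOutput == ["alpha has a length of 5", "beta has a length of 4"]
```

In GHCi, `mapM_ putStrLn sampleOutput` prints the two transformed lines.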
    

    Now, to proceed with processFileHandle, we need a way to gracefully detect an end-of-file condition. A function for this is bound to have the type signature Handle -> IO Bool.

    So we submit this type signature to Hoogle, the Haskell API search engine, and it points us to the hIsEOF library function from System.IO, which is just what we need.

    We can now write our main action, which is just:

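    import System.IO  -- goes at the top of the source file

    main :: IO ()
    main = do
        fh <- openFile "foo.txt" ReadMode    -- open the input file once
        processFileHandle transformLine1 fh  -- transform and print every line
        hClose fh                            -- close it once, when done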
    

    Now, we can provide the code for processFileHandle, as we can test for end of file:

    processFileHandle :: (String -> String) -> Handle -> IO ()
    processFileHandle fn fh = do
        atTheEnd <- hIsEOF fh                -- are we done?
        if atTheEnd
            then return ()                   -- nothing left to do
            else do
                line0 <- hGetLine fh         -- read the next line
                let line1 = fn line0         -- transform it
                putStrLn line1
                processFileHandle fn fh      -- recursive call
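
Putting the pieces together, the whole program fits in one file. The sketch below is one possible way to assemble it, not the only one. It swaps the explicit openFile/hClose pair for withFile from System.IO, which closes the handle even if an exception is raised during processing. processFile is a hypothetical wrapper name that is not in the original answer, and it takes the file name as an argument rather than hardwiring it.

```haskell
import System.IO (Handle, IOMode (ReadMode), hGetLine, hIsEOF, withFile)

-- Pure line transformation, as defined in the answer.
transformLine1 :: String -> String
transformLine1 str = str ++ " has a length of " ++ show (length str)

-- Recursive action: read, transform and print lines until end of file.
processFileHandle :: (String -> String) -> Handle -> IO ()
processFileHandle fn fh = do
    atTheEnd <- hIsEOF fh
    if atTheEnd
        then return ()
        else do
            line0 <- hGetLine fh
            putStrLn (fn line0)
            processFileHandle fn fh

-- Hypothetical wrapper (an assumption, not from the answer):
-- withFile opens the file, runs the action, and always closes the handle.
processFile :: (String -> String) -> FilePath -> IO ()
processFile fn path = withFile path ReadMode (processFileHandle fn)
```

A main action would then reduce to `main = processFile transformLine1 "foo.txt"`.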
    

    Testing:

    $ 
    $ cat foo.txt
    alpha
    beta
    epsilon
    eta
    $ 
    $ ghc --version
    The Glorious Glasgow Haskell Compilation System, version 8.8.4
    $ 
    $ ghc q68324502.hs -o ./q68324502.x
    [1 of 1] Compiling Main             ( q68324502.hs, q68324502.o )
    Linking ./q68324502.x ...
    $ 
    $ ./q68324502.x
    alpha has a length of 5
    beta has a length of 4
    epsilon has a length of 7
    eta has a length of 3
    $
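
The question remarks that pattern matching would have been easy on a list. As a hedged alternative to the handle-based recursion, the file contents can be turned into a list of lines with readFile and lines, after which ordinary list recursion applies and no functor machinery is needed. The names annotateLines, printLines and processPath are hypothetical. Note that readFile reads lazily, so treat this as a sketch for small text files rather than a replacement for the approach above.

```haskell
-- Pure line transformation, as defined in the answer.
transformLine1 :: String -> String
transformLine1 str = str ++ " has a length of " ++ show (length str)

-- Hypothetical helper: recursion by pattern matching on the list of lines.
annotateLines :: [String] -> [String]
annotateLines []       = []
annotateLines (l : ls) = transformLine1 l : annotateLines ls

-- Hypothetical helper: print each line, again by recursion on the list.
printLines :: [String] -> IO ()
printLines []       = return ()
printLines (l : ls) = do
    putStrLn l
    printLines ls

-- Hypothetical entry point: read the file, split it into lines,
-- annotate each line, and print the results.
processPath :: FilePath -> IO ()
processPath path = do
    contents <- readFile path
    printLines (annotateLines (lines contents))
```

With this version, `main = processPath "foo.txt"` is enough, and annotateLines can be tested without any file at all.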