haskell process gzip gunzip

Process management and stream redirection in Haskell


In bash I can compose gzip and gunzip and get a command that outputs whatever I put into it.

> echo "test" | gzip | gunzip
test

I want to write a Haskell function that performs the same composition, connecting the commands through their input/output streams. My attempt is this:

module GZipID where

import System.Process
import System.IO

compress :: String -> IO String
compress input = do
    (Just hin, Just hout, _, _) <- createProcess (proc "gzip" []) { std_in  = CreatePipe
                                                                  , std_out = CreatePipe
                                                                  }
    hPutStr hin input
    (_, Just hfin, _, _) <- createProcess (proc "gunzip" []) { std_in  = UseHandle hout
                                                             , std_out = CreatePipe
                                                             }
    hGetContents hfin

However, when I load this into GHCi and feed it "test" as input, it never terminates. There is some part of the process management that I don't seem to understand. I tried flushing hin after hPutStr hin input, but that changed nothing. Do I perhaps need to coerce the output to move through the hout pipe into gunzip?


Solution

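  • The problem is that the hin handle is never closed, so gzip never sees EOF on its input. gzip therefore keeps waiting for more data and never terminates, which means gunzip never terminates either, and hGetContents never stops waiting for results to come back. Because the function is stuck there, hin never becomes unreachable (only then would it be closed automatically). Flushing hin does not help: it pushes the bytes to gzip but does not signal end of input.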

    You could still get the output lazily with the right buffering settings, but that's fiddly and unsafe, so let's not go there.

    The solution I'd suggest is to close the handle manually:

    compress input = do
        (Just hin, Just hout, _, _) <- createProcess (proc "gzip" []) {...}
        hPutStr hin input
        (_, Just hfin, _, _) <- createProcess (proc "gunzip" []) {...}
        hClose hin  -- send EOF so gzip can finish
        hGetContents hfin
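
For reference, here is a minimal, self-contained sketch of the fix above with the record fields from the question filled back in. It makes a few choices that are assumptions rather than part of the answer: the input is written on a separate thread with forkIO (so a large input cannot block the writer while nothing is reading the output yet), the result is forced with evaluate before returning, and both processes are reaped with waitForProcess. It also assumes gzip and gunzip are on the PATH.

```haskell
import Control.Concurrent (forkIO)
import Control.Exception (evaluate)
import System.IO
import System.Process

-- Round-trip a string through gzip and gunzip, wiring the pipes by hand.
compress :: String -> IO String
compress input = do
  -- gzip reads from a pipe we write to and writes to a pipe we hand on.
  (Just hin, Just hout, _, gzipPh) <-
    createProcess (proc "gzip" []) { std_in = CreatePipe, std_out = CreatePipe }
  -- gunzip reads gzip's output directly; we only read its final output.
  (_, Just hfin, _, gunzipPh) <-
    createProcess (proc "gunzip" []) { std_in = UseHandle hout, std_out = CreatePipe }
  -- Write the input on its own thread, then close hin so gzip sees EOF.
  _ <- forkIO (hPutStr hin input >> hClose hin)
  output <- hGetContents hfin
  -- Force the whole (lazy) result before waiting for the processes.
  _ <- evaluate (length output)
  _ <- waitForProcess gzipPh
  _ <- waitForProcess gunzipPh
  pure output
```

In GHCi, compress "test" should now return instead of hanging. The important line is still the hClose hin; the thread and the explicit waits only make the sketch more robust.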
    

    Of course, a better option would be to use a library that dispenses with the manual handle handling. Conduit should be able to do this nicely.
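
As one illustration of avoiding manual handle management, here is a sketch that uses readCreateProcess from the process package rather than Conduit. Note that this is a different technique from the one suggested above: the pipeline is composed by the shell, not by connecting handles in Haskell, so it sidesteps the original exercise. The name compressViaShell is made up for this example, and the sketch assumes a POSIX shell with gzip and gunzip available.

```haskell
import System.Process (readCreateProcess, shell)

-- Let the shell build the pipeline; readCreateProcess feeds the input,
-- closes stdin, collects stdout, and waits for the command to exit.
-- It throws an exception if the command exits with a non-zero status.
compressViaShell :: String -> IO String
compressViaShell = readCreateProcess (shell "gzip | gunzip")
```

Because the library closes the input handle itself, the missing-EOF problem from the question cannot occur here.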