If I run in the shell either this
$ grep hello
or this
$ socat tcp-listen:12345,fork -
the result is the same, i.e. the program blocks, waiting for standard input.
Why does wrapping those two processes in the following Haskell programs (which differ only in the arguments passed to proc) produce different results?
Indeed, when this one
module Main where
import System.Process
main :: IO ()
main = do
  _ <- createProcess (proc "grep" ["hello"])
    {std_in = CreatePipe, std_out = CreatePipe}
  return ()
returns, it doesn't leave a grep process running, at least based on pidof grep, whereas this other one
module Main where
import System.Process
main :: IO ()
main = do
  _ <- createProcess (proc "socat" ["tcp-listen:12345,fork", "-"])
    {std_in = CreatePipe, std_out = CreatePipe}
  return ()
seems to leave socat running, according to pidof.
I am experimenting with System.Process and Control.Concurrent.Async for the purpose of spawning an external process, as well as threads to control its standard input and output.
However, while playing around with those libraries, I've managed to create a situation where two instances of my Haskell program communicate with each other (via the external socat process), the threads of each program communicate via an MVar to decide what to do, the two programs exit with 0, and yet one of the two external socat processes spawned via System.Process.createProcess is left running, so I have to kill it manually.
Sure, maybe the issue is with the program I wrote besides the usage of createProcess, but to start with, I want to make sure I understand how createProcess should be used.
So the question is: once I execute something like this
import System.Environment (getArgs)
import System.Process
main :: IO ()
main = do
  args <- getArgs
  (Just i, Just o, Nothing, h) <- createProcess (proc "socat" args)
    {std_in = CreatePipe, std_out = CreatePipe}
  -- rest
who or what is responsible for shutting down the process spawned by createProcess?
After all, -- rest executes immediately after the call to createProcess, so while -- rest executes, the external program, in this case socat, is running on its own. Is it up to me to guarantee correct lifetime management? Should I make use of h for this purpose?
One experiment is this:
import System.Process
main :: IO ()
main = do
  _ <- createProcess (proc "cat" ["/dev/zero"])
    {std_in = CreatePipe, std_out = CreatePipe}
  return ()
Since cat /dev/zero never returns (I've tried in the Bash shell), shouldn't this program terminate leaving cat running? I.e., after the Haskell program terminates successfully, shouldn't pidof cat return some PID?
I've tried, and it doesn't, making me think that something is cleaning up.
The reason why grep blocks instead of terminating in a terminal is that it keeps running as long as its stdin handle stays open, which is forever in a terminal (unless you send end-of-file). In the case of Haskell, the stdin handle you provide to grep using CreatePipe will close as soon as the Haskell program terminates, which will in turn make grep see end-of-file and exit.
On the other hand, socat with tcp-listen and fork doesn't care about its stdin handle and will keep running for as long as its listening socket stays open, which is why it keeps running in both cases.
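To see the grep half of this from Haskell without waiting for the program to exit, here is a minimal sketch that closes the pipe explicitly and then waits for the child. The helper name grepExitsOnEof is made up for illustration, and the sketch assumes a grep binary on the PATH.

```haskell
module Main where

import System.Exit (ExitCode (..))
import System.IO (hClose)
import System.Process

-- | Hypothetical helper: spawn @grep hello@ with piped stdin/stdout,
-- close the stdin pipe, and wait for grep to finish.
grepExitsOnEof :: IO ExitCode
grepExitsOnEof = do
  (Just i, Just _o, Nothing, h) <-
    createProcess (proc "grep" ["hello"])
      { std_in = CreatePipe, std_out = CreatePipe }
  hClose i          -- grep now sees end-of-file on its stdin
  waitForProcess h  -- returns once grep has exited

main :: IO ()
main = grepExitsOnEof >>= print
```

Since grep received no input and therefore matched nothing, the exit code should be the non-zero "no lines selected" status rather than success. The same experiment with socat tcp-listen:12345,fork - would be expected to block in waitForProcess, because closing stdin does not make the listener stop.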
You can test this difference in behavior in a terminal by ensuring that the stdin gets closed, e.g. like this
$ echo | grep hello
<immediately exits>
$ echo | socat tcp-listen:12345,fork -
<runs forever>
However, you can configure socat to care about handles being closed:
$ echo | socat tcp-listen:12345,fork -,end-close
<doesn't close immediately, but does close as soon as someone connects>
In the case of cat, it cares about both its stdin and stdout handles and will keep running until either stdin has closed and it has pushed all input from stdin to stdout, or stdout has been closed, whichever happens first.
In your Haskell example, the input (/dev/zero) will stay open forever, but the stdout handle (from CreatePipe) will be closed as soon as the Haskell process exits. The next write that cat attempts then fails, and cat terminates.
You can test this behavior by using less and pressing q to exit early:
cat < /dev/urandom | hexdump -C | less
will terminate cat despite the input stream being infinite.
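The same effect can be reproduced from Haskell by closing the read end of the stdout pipe yourself. This is only a sketch: the helper name catStopsWhenReaderLeaves is invented here, and it assumes a POSIX system with cat and /dev/zero available.

```haskell
module Main where

import System.Exit (ExitCode (..))
import System.IO (hClose)
import System.Process

-- | Hypothetical helper: spawn @cat /dev/zero@ with a piped stdout,
-- close our end of that pipe, and wait for cat to give up.
catStopsWhenReaderLeaves :: IO ExitCode
catStopsWhenReaderLeaves = do
  (Just _i, Just o, Nothing, h) <-
    createProcess (proc "cat" ["/dev/zero"])
      { std_in = CreatePipe, std_out = CreatePipe }
  hClose o          -- nobody can read cat's output any more
  waitForProcess h  -- cat's next write fails, so it terminates

main :: IO ()
main = catStopsWhenReaderLeaves >>= print
```

The exact exit code depends on how the failed write is reported to cat (a signal or a write error), so the sketch only relies on cat terminating with a non-success status.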
Now finally, for your main question: yes, you are responsible for the lifetime of the processes you spawn. Haskell won't automatically shut them down for you; when your program exits, only the handles it created get closed, which some processes will interpret as a sign to shut down and others will not.
And yes, the ProcessHandle h is indeed the way you can force kill the process, either using terminateProcess or cleanupProcess. There is no built-in way to gracefully terminate the process, see this issue, other than indirectly by closing the handles for processes that support that. You can however send a signal using the unix package.
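As a rough sketch of the signal route, the following combines getPid from System.Process with signalProcess from System.Posix.Signals (unix package). It assumes a POSIX system, a reasonably recent version of process that exports getPid, and a sleep binary on the PATH; the helper name signalAndWait is made up for this example.

```haskell
module Main where

import System.Exit (ExitCode (..))
import System.Posix.Signals (Signal, sigINT, signalProcess)
import System.Process

-- | Hypothetical helper: send the given signal to the child (if it still
-- has a PID) and then wait for it, so that it does not linger as a zombie.
signalAndWait :: Signal -> ProcessHandle -> IO ExitCode
signalAndWait sig h = do
  mpid <- getPid h                 -- Nothing if the process has already been reaped
  mapM_ (signalProcess sig) mpid   -- deliver the signal when there is a PID
  waitForProcess h                 -- block until the child has actually exited

main :: IO ()
main = do
  (_, _, _, h) <- createProcess (proc "sleep" ["60"])
  signalAndWait sigINT h >>= print
```

Whether the child shuts down cleanly is up to the child: a program that ignores or handles the signal may keep running, in which case waitForProcess blocks until it eventually exits.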
One standard way of handling cleanup steps like this is using bracket, which will ensure that the cleanup step happens both on normal exit and in case of exceptions.
import Control.Exception (bracket)
import System.IO (Handle)
-- System.Process already exports a withCreateProcess (with a curried callback), so hide it here
import System.Process hiding (withCreateProcess)
-- | Create a process and force terminate it and close its handles at the end of the local scope
withCreateProcess :: CreateProcess -> ((Maybe Handle, Maybe Handle, Maybe Handle, ProcessHandle) -> IO a) -> IO a
withCreateProcess args = bracket (createProcess args) cleanupProcess
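For a usage example of the bracket pattern, here is a small self-contained sketch that talks to cat over pipes and lets cleanupProcess terminate it when the scope ends. The helper name echoThroughCat is hypothetical, and the sketch assumes a cat that forwards each line as soon as it reads it (as the common implementations do).

```haskell
module Main where

import Control.Exception (bracket)
import System.IO (hFlush, hGetLine, hPutStrLn)
import System.Process

-- | Hypothetical helper: run @cat@ with piped stdin/stdout for the duration
-- of the callback, then terminate it and close the pipes via cleanupProcess.
echoThroughCat :: String -> IO String
echoThroughCat msg =
  bracket
    (createProcess (proc "cat" []) { std_in = CreatePipe, std_out = CreatePipe })
    cleanupProcess
    useCat
  where
    useCat (Just i, Just o, _, _) = do
      hPutStrLn i msg  -- send one line to cat
      hFlush i         -- make sure it leaves our buffer
      hGetLine o       -- read the line cat echoes back
    useCat _ = ioError (userError "expected piped stdin and stdout")

main :: IO ()
main = echoThroughCat "hello" >>= putStrLn
```

Note that cleanupProcess requests termination and closes the handles but does not block until the child is gone, so if you need to be sure the process has exited, call waitForProcess on the ProcessHandle yourself.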