Search code examples
posixsmlsmlnj

Capturing stdout of a command in SML


I'm trying to capture the output of a command run using Posix.Process.execp. I ported some C code I found on stackoverflow and can capture output for one execution, but I can't get the output for a second execution.

Here's my function:

(* Runs a command c (command and argument list) using Posix.Process.execp. *)
(* If we successfully run the program, we return the lines output to stdout *)
(* in a list, along with SOME of the exit code. *)
(* If we fail to run the program, we return the error message in the list *)
(* and NONE. *)
fun execpOutput (c : string * string list) : (string list * Posix.Process.exit_status option) =
  let fun readAll () = case TextIO.inputLine TextIO.stdIn
                    of SOME s => s :: (readAll ())
                     | NONE => []
      (* Create a new pipe *)
      val { infd = infd, outfd = outfd } = Posix.IO.pipe ()
  in case Posix.Process.fork ()
      of NONE => (
      (* We are the child. First copy outfd to stdout; they will *)
      (* point to the same file descriptor and can be used interchangeably. *)
      (* See dup(2) for details. Then close infd: we don't need it and don't *)
      (* want to block because we have open file descriptors laying around *)
      (* when we want to exit. *)
      ( Posix.IO.dup2 { old = outfd, new = Posix.FileSys.stdout }
      ; Posix.IO.close infd
      ; Posix.Process.execp c )
      handle OS.SysErr (err, _) => ([err], NONE) )
       | SOME pid =>
     (* We are the parent. This time, copy infd to stdin, and get rid of the *)
     (* outfd we don't need. *)
     let val _ = ( Posix.IO.dup2 { old = infd, new = Posix.FileSys.stdin }
                 ; Posix.IO.close outfd )
         val (_, status) = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])
     in (readAll (), SOME status) end
  end

val lsls = (#1 (execpOutput ("ls", ["ls"]))) @ (#1 (execpOutput ("ls", ["ls"])))
val _ = app print lsls

and here's the corresponding output:

rak@zeta:/tmp/test$ ls
a  b  c
rak@zeta:/tmp/test$ echo 'use "/tmp/mwe.sml";' | sml
Standard ML of New Jersey v110.79 [built: Tue Aug  8 16:57:33 2017]
- [opening /tmp/mwe.sml]
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
a
b
c
val execpOutput = fn
  : string * string list -> string list * ?.POSIX_Process.exit_status option
val lsls = ["a\n","b\n","c\n"] : string list
val it = () : unit
-

Any suggestions on what I'm doing wrong?


Solution

  • My initial attempt consisted of

    1. Creating a pipe
    2. Setting stdout of the child to be the write end of the pipe
    3. Setting stdin of the parent to be the read end of the pipe

    This didn't work the second time around, perhaps because of some race condition (running it under strace -f meant that we could see the second child writing to the write end of the second pipe, but the parent never managed to read from the read end of the second pipe). I realised this approach is also suboptimal because it involves clobbering stdin.

    My officemate pointed out I was effectively trying to implement a variant of popen(3). A better approach is to, in fact, implement popen and return a file descriptor for the desired end of the pipe, rather than clobbering the parent's stdin/stdout. It also is symmetric, in that the user can specify if they want the read or the write end of the pipe. Here's what I came up with (feedback welcome).

    structure Popen :>
          sig
              (* Parent wants to write to stdin, read stdout, or read stdout + stderr *)
              datatype pipe_type = PIPE_W | PIPE_R | PIPE_RE
              val popen : string * pipe_type -> Posix.IO.file_desc
              val pclose : Posix.IO.file_desc -> Posix.Process.exit_status option
          end =
    struct
    
    datatype pipe_type = PIPE_W | PIPE_R | PIPE_RE
    
    type pinfo = { fd : Posix.ProcEnv.file_desc, pid : Posix.Process.pid }
    
    val pids : pinfo list ref = ref []
    
    (* Implements popen(3) *)
    fun popen (cmd, t) =
      let val { infd = readfd, outfd = writefd } = Posix.IO.pipe ()
      in case (Posix.Process.fork (), t)
          of (NONE, t) => (* Child *)
         (( case t
             of PIPE_W => Posix.IO.dup2 { old = readfd, new = Posix.FileSys.stdin }
              | PIPE_R => Posix.IO.dup2 { old = writefd, new = Posix.FileSys.stdout }
              | PIPE_RE => ( Posix.IO.dup2 { old = writefd, new = Posix.FileSys.stdout }
                           ; Posix.IO.dup2 { old = writefd, new = Posix.FileSys.stderr })
          ; Posix.IO.close writefd
          ; Posix.IO.close readfd
          ; Posix.Process.execp ("/bin/sh", ["sh", "-c", cmd]))
          handle OS.SysErr (err, _) =>
                 ( print ("Fatal error in child: " ^ err ^ "\n")
                 ; OS.Process.exit OS.Process.failure ))
           | (SOME pid, t) => (* Parent *)
         let val fd = case t of PIPE_W => (Posix.IO.close readfd; writefd)
                              | PIPE_R => (Posix.IO.close writefd; readfd)
                              | PIPE_RE => (Posix.IO.close writefd; readfd)
             val _ = pids := ({ fd = fd, pid = pid } :: !pids)
         in fd end
      end
    
    (* Implements pclose(3) *)
    fun pclose fd =
      case List.partition (fn { fd = f, pid = _ } => f = fd) (!pids)
       of ([], _) => NONE
        | ([{ fd = _, pid = pid }], pids') =>
          let val _ = pids := pids'
          val (_, status) = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])
          val _ = Posix.IO.close fd
          in SOME status end
        | _ => raise Bind (* This should be impossible. *)
    end
    
    val f = Popen.popen("ls", Popen.PIPE_R);
    val g = Popen.popen("read line; echo $line>/tmp/foo", Popen.PIPE_W);
    val _ = Posix.IO.writeVec (g, Word8VectorSlice.full (Byte.stringToBytes "Hello World! I was written by g\n"));
    val h = Popen.popen("cat /tmp/foo", Popen.PIPE_R);
    val i = Popen.popen("echo 'to stderr i' 1>&2", Popen.PIPE_R);
    val j = Popen.popen("echo 'to stderr j' 1>&2", Popen.PIPE_RE);
    val _ = app (fn fd => print (Byte.bytesToString (Posix.IO.readVec (fd, 1000)))) [f, h, i, j];
    val _ = map Popen.pclose [f, g, h, i, j];
    val _ = OS.Process.exit OS.Process.success;
    

    And then the output is:

    rak@zeta:~/popen$ rm /tmp/foo && ls && sml popen.sml
    popen.sml
    Standard ML of New Jersey v110.79 [built: Tue Aug  8 16:57:33 2017]
    [opening popen.sml]
    [autoloading]
    [library $SMLNJ-BASIS/basis.cm is stable]
    [library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
    [autoloading done]
    popen.sml:42.52 Warning: calling polyEqual
    structure Popen :
      sig
        datatype pipe_type = PIPE_R | PIPE_RE | PIPE_W
        val popen : string * pipe_type -> ?.POSIX_IO.file_desc
        val pclose : ?.POSIX_IO.file_desc -> ?.POSIX_Process.exit_status option
      end
    val f = FD {fd=4} : ?.POSIX_IO.file_desc
    val g = FD {fd=6} : ?.POSIX_IO.file_desc
    [autoloading]
    [autoloading done]
    val h = FD {fd=5} : ?.POSIX_IO.file_desc
    to stderr i
    val i = FD {fd=7} : ?.POSIX_IO.file_desc
    val j = FD {fd=8} : ?.POSIX_IO.file_desc
    popen.sml
    Hello World! I was written by g
    to stderr j
    

    Thanks to Simon Shine for the hint to run strace. I'm still not sure why what I had doesn't work, but at least we know what's happening.