Tcl [exec] process leaves zombie if the process forks and exits

I have a case where a Tcl script runs a process that calls fork(), leaves the forked child running, and then exits in the main process. You can reproduce it with any program that forks into the background, for example gvim (provided it is configured to detach after startup): set res [exec gvim].

In theory the main process exits immediately while the child keeps running in the background. In practice the main process never goes away completely: it stays in the zombie state (shown as <defunct> in ps output).

In my case the process I start prints something. I want to capture that output, wait for the process to exit, and then consider it done. The problem is that if I start the process with open "|gvim" r, I cannot detect the moment it finishes. The channel returned by [open] never reports [eof], even after the program has turned into a zombie. When I call [read] to collect everything the process might print, it blocks forever.
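The behaviour can be reproduced without gvim. The sketch below is only an illustration under assumptions: the sh one-liner is a hypothetical stand-in for a program that prints a line, forks, and exits, and the timings depend on the system. It shows that [gets] returns as soon as a line arrives, while [read] waits until the background process releases the pipe.

```tcl
# Stand-in for a program that prints a line, forks, and exits (hypothetical):
# the shell prints "ready", leaves "sleep 2" in the background, and exits at once.
set script {echo ready; sleep 2 &}
set chan [open |[list sh -c $script] r]
set start [clock milliseconds]

# [gets] returns as soon as one complete line has arrived.
set first [gets $chan]
set lineMs [expr {[clock milliseconds] - $start}]

# [read] waits for end of file, which arrives only when every process holding
# the write end of the pipe (here the background sleep) has exited.
set rest [read $chan]
set eofMs [expr {[clock milliseconds] - $start}]

close $chan
puts "first line '$first' after $lineMs ms, EOF after $eofMs ms"
```

In this stand-in the end of file arrives after about two seconds; with a program like gvim it would arrive only when the editor is closed.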

What is more interesting is that sometimes both the main process and the forked process print something, and when I read with [gets] I get the output of both. If I close the channel too early, [close] throws an error because of a broken pipe. That is probably why [read] never returns.

I need a way to detect the moment when the main process exits. That process may have spawned a child, but the child may be completely detached and I am not interested in what it does. I want whatever the main process prints before exiting, and then the script should continue its work while the background process keeps running on its own.

I have control over the sources of the process I'm starting. Yes, I did call signal(SIGCLD, SIG_IGN) before fork(); it didn't help.

Solution

Ok, I found the solution after a long discussion here:

https://groups.google.com/forum/#!topic/comp.lang.tcl/rtaTOC95NJ0

The script below demonstrates how this problem can be solved:

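#!/usr/bin/tclsh

# Create an anonymous pipe: the started process writes to $output,
# and this script reads from $input.
lassign [chan pipe] input output
# Non-blocking, just in case the process prints nothing.
chan configure $input -blocking no -buffering line

puts "Running $argv..."
# [exec] returns once the main process exits; a forked child may keep running.
set ret [exec {*}$argv 2>@stderr >@$output]
puts "Waiting for finished process..."
set line [gets $input]
puts "FIRST LINE: $line"
puts "DONE. PROCESSES:"
puts [exec ps -ef | grep [lindex $argv 0]]
puts "EXITING."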
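The same technique can be wrapped in a procedure. This is only a sketch, assuming Tcl 8.6 for [chan pipe] and [try]; the name firstLineOf and the sh one-liner used to exercise it are hypothetical and not part of the original script.

```tcl
# Hypothetical helper: run a command, wait for its main process to exit,
# and return the first line it printed on stdout.
proc firstLineOf {args} {
    lassign [chan pipe] input output
    try {
        # Non-blocking: this script still holds the write end, so a blocking
        # [gets] would never see EOF if the command printed nothing.
        chan configure $input -blocking no
        # [exec] waits only for the main process; stdout goes into our pipe,
        # so a forked child that keeps the pipe open does not hold us up.
        exec {*}$args 2>@stderr >@$output
        return [gets $input]
    } finally {
        close $output
        close $input
    }
}

# Stand-in command: prints "ready", backgrounds a sleep, and exits at once.
set line [firstLineOf sh -c {echo ready; sleep 2 &}]
puts "FIRST LINE: $line"
```

The call returns without waiting for the background sleep, because nothing in the script waits for end of file on the pipe.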

The only remaining problem is that there is still no way to know that the process has exited. However, the next [exec] (in this particular case probably the [exec ps ...] call) cleans up the zombie. There is no universal method for that; the best you can do on POSIX systems is [exec /bin/true]. In my case it was enough to get the one line that the parent process had to print, after which I can simply "let it go".

Still, it would be nice if [exec] could somehow return the PID of the first process, and if there were a standard [wait] command that could block until the process exits or check whether it is still running (such a command is currently available in TclX).
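For the PID part there are two partial options in plain Tcl, sketched below with hypothetical stand-in commands: [exec] returns the process IDs when the pipeline is run in the background with a trailing &, and [pid] reports them for a channel created with [open "|..."]. Neither gives a way to wait for a particular PID, so this covers only part of the wish.

```tcl
# Background [exec] does not wait; it returns the list of process IDs
# in the pipeline (one ID here, because there is one command).
set pids [exec sleep 1 &]
puts "background PIDs: $pids"

# For a command pipeline opened with [open], [pid] reports its process IDs.
set chan [open |[list sh -c {echo ready}] r]
set chanPids [pid $chan]
set first [gets $chan]
close $chan
puts "pipeline PIDs: $chanPids, first line: $first"
```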

Note that [chan pipe] is available only in Tcl 8.6 and later; alternatively, you can use [pipe] from TclX.
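A fallback between the two could look like the sketch below. The procedure name makePipe is hypothetical, and the TclX branch assumes that [pipe] called without arguments returns the read and write channels as a list; I have not exercised that branch on an older Tcl.

```tcl
# Hypothetical helper: return a {readChannel writeChannel} pair.
proc makePipe {} {
    # Tcl 8.6 and later: the built-in command.
    if {![catch {chan pipe} ends]} {
        return $ends
    }
    # Older Tcl: assume the TclX extension is installed.
    package require Tclx
    return [pipe]
}

lassign [makePipe] input output
puts $output "hello"
flush $output
set echoed [gets $input]
puts "read back: $echoed"
close $output
close $input
```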