I am starting a process using execv and letting it write to a file. At the same time I start a thread that monitors the file using stat.st_size, so that its size does not exceed a certain limit. When the limit is hit, I call waitpid for the child process, but this returns an error and the process I started in the background becomes a zombie. When I do the stop using the same waitpid from the main thread, the process is killed without becoming a zombie. Any ideas?
Edit: The errno is 10 and waitpid returns -1. This is on a Linux platform.
This is difficult to debug without code, but errno 10 is ECHILD.
Per the man page, this is returned as follows:
ECHILD (for waitpid() or waitid()) The process specified by pid (waitpid()) or idtype and id (waitid()) does not exist or is not a child of the calling process. (This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN. See also the Linux Notes section about threads.)
In short, the pid you are specifying is not a child of the process calling waitpid() (or is no longer one, perhaps because it has already terminated and been reaped).
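The "is no longer a child" case can be reproduced in isolation. The following is a minimal sketch, not your code: the helper name errno_of_second_wait is hypothetical, and it assumes the default SIGCHLD action is in place. It reaps a child once and then waits for the same pid again, at which point waitpid() should fail with ECHILD.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a short-lived child, reaps it once, then
 * tries to reap it a second time. Returns the errno of the second
 * waitpid() call, 0 if that call unexpectedly succeeded, or -1 if the
 * setup (fork or first wait) failed. */
int errno_of_second_wait(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);                          /* child: exit immediately */

    int status;
    if (waitpid(pid, &status, 0) != pid)   /* first wait reaps the child */
        return -1;

    if (waitpid(pid, &status, 0) == -1) {  /* the pid is no longer our child */
        int saved = errno;
        fprintf(stderr, "waitpid: %s (errno %d)\n", strerror(saved), saved);
        return saved;
    }
    return 0;
}
```

Calling errno_of_second_wait() from main and comparing the result with ECHILD is enough to see the behaviour. Printing strerror(errno) next to the number, as above, also saves looking the value up by hand.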
Note the parenthetical section:
"This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN" - if you've set the action for SIGCHLD to SIG_IGN, the wait is effectively done automatically: the child is reaped as soon as it terminates and never goes through the zombie state, so waitpid fails because there is no longer a child to wait for.
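To check whether this is what you are seeing, the effect can be reproduced on its own. This is a rough sketch under the assumption of a Linux system; the helper name errno_of_wait_with_sigchld_ignored is made up for illustration. It ignores SIGCHLD, forks a child that exits at once, and reports what waitpid() does.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sets SIGCHLD to SIG_IGN, forks a child that exits
 * immediately, and waits for it. Returns the errno from waitpid(), 0 if
 * the wait succeeded, or -1 on setup failure. The previous SIGCHLD
 * action is restored before returning. */
int errno_of_wait_with_sigchld_ignored(void)
{
    void (*old)(int) = signal(SIGCHLD, SIG_IGN);
    if (old == SIG_ERR)
        return -1;

    int result = -1;
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);                     /* child: exit immediately */
    if (pid > 0) {
        int status;
        /* With SIGCHLD ignored the child is reaped automatically, so
         * this call is expected to fail rather than return the pid. */
        result = (waitpid(pid, &status, 0) == -1) ? errno : 0;
    }

    signal(SIGCHLD, old);             /* restore the previous action */
    return result;
}
```

If this returns ECHILD on your system, searching your program (and any libraries it uses) for a place that sets SIGCHLD to SIG_IGN would be a reasonable next step.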
"See also the Linux Notes section about threads." - On Linux, threads are essentially processes. Modern Linux allows one thread to wait for the children of other threads, provided they are in the same thread group (broadly, the same parent process). Before Linux 2.4 this was not the case. See the documentation on __WNOTHREAD for details.
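The cross-thread case can be tested directly. The sketch below assumes a POSIX threads implementation on a reasonably modern Linux (older toolchains may need -pthread when compiling); the names wait_result, wait_in_thread and other_thread_can_reap are hypothetical. One thread forks and a different thread calls waitpid().

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Filled in by the helper thread with the outcome of its wait. */
struct wait_result {
    pid_t child;   /* pid to wait for, set by the forking thread */
    pid_t ret;     /* return value of waitpid() */
    int   err;     /* errno if waitpid() failed, otherwise 0 */
};

static void *wait_in_thread(void *arg)
{
    struct wait_result *r = arg;
    int status;
    r->ret = waitpid(r->child, &status, 0);
    r->err = (r->ret == -1) ? errno : 0;
    return NULL;
}

/* Hypothetical helper: the calling thread forks, a second thread reaps.
 * Returns 1 if the second thread reaped the child, 0 otherwise. */
int other_thread_can_reap(void)
{
    struct wait_result r = {0};
    r.child = fork();
    if (r.child < 0)
        return 0;
    if (r.child == 0)
        _exit(0);                        /* child: exit immediately */

    pthread_t tid;
    if (pthread_create(&tid, NULL, wait_in_thread, &r) != 0) {
        waitpid(r.child, NULL, 0);       /* avoid leaving a zombie behind */
        return 0;
    }
    pthread_join(tid, NULL);
    return r.ret == r.child && r.err == 0;
}
```

If other_thread_can_reap() returns 1 on your machine, waiting from the monitoring thread is not in itself the cause of the ECHILD.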
I'm guessing the thread thing is a red herring and the problem is actually the SIGCHLD action, as this accords with your statement 'the process is killed without becoming a zombie.'