Environment: Linux 2.6.32 (RHEL 6.3) on x86_64 with gcc 4.4.6
Background: I am doing some heavy data crunching: ~500 GB of input data spread over ~2000 files. My main process forks N children, each of which receives a list of filenames to crunch.
What I want is for console I/O to pass through the parent. I have been looking into pipe() and see some fascinating stuff about using poll() to have my parent block until there are error messages to read. It seems that I need to have N pipes (one per child) and pass poll() information about what signals I want to listen to. Also, I think that once I dup2(pipe[1], STDOUT) in each child, each child should be able to write to the pipe with cout << stuff; as usual, right?
First, is what I have said above about multiple pipes, poll()ing and dup2() correct?
Second, how do I set up the parent poll() loop so that I move on once all the children have died?
Right now, this (incomplete) section of code reads as follows:
    int status;
    while (1) { // wait for stuff
        while ((status = poll(pollfds, ss.max_forks, -1)) > 1)
            cout << "fork "<< status << ": " << pipes[status][0];
        if (status == -1) Die(errno, "poll error");
        if (status == 0) { // check that we still have at least one open fd
            bool still_running = false;
            for (int i=0; i<ss.max_forks; i++) {
                // check pipe i and set still_running if it is not zero
            }
            if (!still_running)
                break;
        }
    }
Third, what should I set and when should I set it with fcntl()? Do I want to do O_ASYNC? Do I want to do blocking or nonblocking?
Actually, you need to close() the respective "unused" end in both processes (parent and child), to make sure the "broken pipe" comes across. pipe() puts the read end in Pipe[0] and the write end in Pipe[1]. Thus, if the child writes into Pipe[1], then the parent will read from Pipe[0] and close its own Pipe[1]. Likewise, the child will close Pipe[0].
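To make the descriptor bookkeeping concrete, here is a minimal sketch of one parent/child pair. The names spawn_child, say_hello, read_all and wait_for_exit are made up for illustration and are not from the question; the sketch also assumes each child only needs its stdout redirected.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>

// Hypothetical stand-in for a child's real work.
void say_hello() { std::cout << "hello from child\n"; }

// Forks one child whose stdout goes into a pipe. Returns the child's pid
// (or -1 on failure) and stores the parent's read end in *read_fd.
pid_t spawn_child(void (*work)(), int* read_fd) {
    assert(read_fd != NULL);
    int fds[2];                        // fds[0] = read end, fds[1] = write end
    if (pipe(fds) == -1) return -1;
    pid_t pid = fork();
    if (pid == -1) { close(fds[0]); close(fds[1]); return -1; }
    if (pid == 0) {                    // child
        close(fds[0]);                 // child never reads from the pipe
        dup2(fds[1], STDOUT_FILENO);   // stdout now refers to the write end
        close(fds[1]);                 // the duplicate is enough
        work();
        std::cout.flush();             // _exit() does not flush C++ streams
        _exit(0);
    }
    close(fds[1]);                     // parent never writes; without this, EOF never arrives
    *read_fd = fds[0];
    return pid;
}

// Reads from fd until end-of-file and returns everything that arrived.
std::string read_all(int fd) {
    std::string out;
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        out.append(buf, static_cast<size_t>(n));
    return out;
}

// Reaps the child; returns its exit code, or -1 if it did not exit normally.
int wait_for_exit(pid_t pid) {
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With N children you would call something like spawn_child once per child and keep the returned read ends in an array for poll(). Each later child also inherits the read ends of earlier pipes; closing those in the child is tidy, but it is not what makes end-of-file arrive.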
If you do this, the parent will see end-of-file (a 0-byte read) on the pipe once the child has died and any output still in the pipe has been read. Don't forget to use one of the waitpid()-style functions to clean up the dead processes.
You might want to set the handles to nonblocking, so you can just read whatever is there without having to use 1-byte reads, which would be horribly inefficient. Although I just make one call to read() with a suitable buffer size (usually 1024 or 4096), and just let the next poll() trigger if there's more data. But then, I usually just have one child to work with, not a few hundred :-)
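As a rough illustration of the fcntl() part, this sketch sets O_NONBLOCK on a read end and then does a single buffered read. The helper names set_nonblocking and read_available are invented here, and the 4096-byte buffer is just one of the sizes mentioned above.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <string>

// Adds O_NONBLOCK to the descriptor's existing flags. Returns false on failure.
bool set_nonblocking(int fd) {
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// One read() call with a 4096-byte buffer. Returns the number of bytes
// appended to `out`, 0 at end-of-file, or -1 with errno set
// (EAGAIN means "nothing to read right now" on a nonblocking fd).
ssize_t read_available(int fd, std::string& out) {
    char buf[4096];
    ssize_t n;
    do {
        n = read(fd, buf, sizeof buf);
    } while (n == -1 && errno == EINTR);   // retry if interrupted by a signal
    if (n > 0) out.append(buf, static_cast<size_t>(n));
    return n;
}
```

If you only call read() after poll() has reported POLLIN, a blocking descriptor behaves the same way for a single read, since read() returns whatever is available up to the buffer size. Nonblocking mode mostly matters if you want to keep reading in a loop until the pipe is empty.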
As for your loop, you'll have to track the state of each child, and exit when you have no live children left.
EDIT: actually, I find that I assume the child is dead when I get a 0-byte read even though POLLIN was set, or when I get POLLERR or POLLHUP flags. Not sure which case is the correct one...
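For what it's worth, both signs from the EDIT can be handled in one place. On Linux, poll() reports POLLHUP on a pipe's read end once every write end is closed, and read() then returns 0 after any remaining data has been consumed. Below is a sketch of a parent loop built on that, counting live pipes and stopping when none are left. The function name relay_until_closed and the per-child output strings are assumptions for illustration; a real parent would probably print the data instead and call waitpid() afterwards.

```cpp
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>

// Collects everything readable on `fds` (one read end per child) into
// out[i] until every pipe has reached end-of-file. Closes each fd when
// it is finished. Returns false if poll() itself fails.
bool relay_until_closed(const std::vector<int>& fds, std::vector<std::string>& out) {
    out.resize(fds.size());
    std::vector<pollfd> pfds(fds.size());
    for (size_t i = 0; i < fds.size(); ++i) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;       // POLLHUP and POLLERR are always reported
        pfds[i].revents = 0;
    }
    size_t live = pfds.size();         // pipes that have not hit EOF yet
    while (live > 0) {
        int ready = poll(&pfds[0], pfds.size(), -1);
        if (ready == -1) {
            if (errno == EINTR) continue;   // e.g. interrupted by SIGCHLD
            return false;
        }
        for (size_t i = 0; i < pfds.size(); ++i) {
            if (pfds[i].fd < 0 || pfds[i].revents == 0) continue;
            char buf[4096];
            ssize_t n = read(pfds[i].fd, buf, sizeof buf);
            if (n > 0) {
                out[i].append(buf, static_cast<size_t>(n));
            } else if (n == 0 || (errno != EINTR && errno != EAGAIN)) {
                // 0 bytes means EOF; any other error also retires the pipe.
                close(pfds[i].fd);
                pfds[i].fd = -1;       // poll() ignores negative descriptors
                assert(live > 0);
                --live;
            }
        }
    }
    return true;
}
```

Reading whenever revents is non-zero, rather than testing POLLIN and POLLHUP separately, means output written just before a child exits is still drained before the pipe is retired.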