I'm a C programmer learning about fork(), exec(), and wait() for the first time. I'm also whiteboarding a Standard C program which will run on Linux and will potentially need a lot of child processes. What I can't gauge is... how many child processes are too many for one parent to spawn and then wait upon?
Suppose my code looked like this:
pid_t status[LARGENUMBER];
status[0] = fork();
if (status[0] == 0)
{
    // I am the child
    exec("./newCode01.c");
}
status[1] = fork();
if (status[1] == 0)
{
    // child
    exec("./newCode02.c");
}
...etc...
wait(status[0]);
wait(status[1]);
...and so on....
Obviously, the larger LARGENUMBER is, the greater the chance that the parent is still fork()ing while children are segfaulting or becoming zombies or whatever.
So this implementation seems problematic to me. As I understand it, the parent can only wait() for one child at a time? What if LARGENUMBER is huge, and the time gap between running status[0] = fork(); and wait(status[0]); is substantial? What if the child has run, become a zombie, and been terminated by the OS somehow in that time? Will the parent then wait(status[0]) forever?
In the above example, there must be some standard or guideline for how big LARGENUMBER can be. Or is my approach all wrong?
#define LARGENUMBER 1
#define LARGENUMBER 10
#define LARGENUMBER 100
#define LARGENUMBER 1000
#define LARGENUMBER ???
I want to play with this, but my instinct is to ask for advice before I invest the development time into a program which may or may not turn out to be infeasible. Any advice/experience is appreciated.
I will try my best to explain.
First, a bad example: you fork() one child process, then wait for it to finish before forking the next one. This kills the degree of multiprocessing and gives poor CPU utilization.
pid = fork();
if (pid == -1) { ... }                // handle error
else if (pid == 0) { execv(...); }    // child
else                                  // parent (pid > 0)
{
    wait(NULL);
    pid = fork();
    if (pid == -1) { ... }                // handle error
    else if (pid == 0) { execv(...); }    // child
    else { wait(NULL); }                  // parent (pid > 0)
}
How should it be done? In this approach, you first create both child processes, then wait for them. This increases CPU utilization and the degree of multiprocessing.
pid1 = fork();
if (pid1 == -1) { ... }                    // handle error
if (pid1 == 0) { execv(...); _exit(1); }   // child 1 (exit if execv fails)
pid2 = fork();
if (pid2 == -1) { ... }                    // handle error
if (pid2 == 0) { execv(...); _exit(1); }   // child 2 (exit if execv fails)
if (pid1 > 0) { wait(NULL); }              // parent reaps one child
if (pid2 > 0) { wait(NULL); }              // parent reaps the other child
NOTE:
even though the parent blocks in the first wait() before reaching the second one, both children are already running; the second child is not waiting to be spawned or to call execv().
In your case, you are already doing the second approach: first fork all the processes and save the return value of each fork(), then wait.
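A minimal sketch of that pattern, assuming a hypothetical compiled helper program ./child (a placeholder name; note that it must be an executable, not a .c source file), might look like this:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define LARGENUMBER 100

int main(void)
{
    pid_t pids[LARGENUMBER];

    /* Phase 1: fork all the children and remember their PIDs. */
    for (int i = 0; i < LARGENUMBER; i++) {
        pids[i] = fork();
        if (pids[i] == -1) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (pids[i] == 0) {
            /* Child: replace this process image with the new program. */
            execl("./child", "./child", (char *)NULL);
            perror("execl");      /* only reached if exec fails */
            _exit(EXIT_FAILURE);
        }
    }

    /* Phase 2: reap every child, one at a time. */
    for (int i = 0; i < LARGENUMBER; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == -1)
            perror("waitpid");
    }

    return 0;
}

waitpid() lets the parent wait for one specific PID; a plain wait(NULL) in the second loop would also work, it would just reap the children in whatever order they happen to exit.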
"the parent can only wait() for one child at a time?"
The parent can wait for all of its children, one at a time, whether they have already finished and become zombie processes or are still running. For more detailed explanation, look here.
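A minimal sketch of that behaviour: the child exits immediately and sits as a zombie, the parent deliberately sleeps (the 2-second delay is arbitrary, just to make sure the child is long gone), and waitpid() still collects the exit status rather than blocking forever.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid == 0) {
        /* Child: exit right away; it stays a zombie until the parent reaps it. */
        _exit(42);
    }

    /* Parent: wait long enough that the child has certainly exited. */
    sleep(2);

    int status;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        printf("child %d had already exited with %d; waitpid still returned it\n",
               (int)pid, WEXITSTATUS(status));

    return 0;
}

The kernel keeps the exit status of a zombie around precisely so that a later wait()/waitpid() can still collect it, so the parent does not hang forever on an already-dead child.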
How many child processes can a parent spawn before it becomes infeasible?
It might be OS dependent, but one accepted approach is to split the time slice given to a process in two, half for the child process and half for the parent process, so that processes cannot exhaust the system by cheating, i.e. by creating child processes that together run for more time than the OS wanted to give the parent process in the first place.
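Since the practical limit is OS dependent, one thing you can do on Linux is simply ask the kernel how many processes your user is allowed to have at once. This is the RLIMIT_NPROC resource limit (the same value that ulimit -u reports); a minimal sketch, assuming Linux:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    /* RLIMIT_NPROC: maximum number of processes this user may have at once. */
    if (getrlimit(RLIMIT_NPROC, &rl) == 0)
        printf("soft limit: %llu, hard limit: %llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);
    else
        perror("getrlimit");

    return 0;
}

Staying well below that limit (and leaving room for everything else your user is running) is a reasonable starting point for choosing LARGENUMBER; whether a given value is sensible for your program still depends on memory and on what the children actually do.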