I'm trying to write a program that reads through a bunch of Unix commands, and creates child processes to execute them. It has one argument that determines the maximum number of child processes I want running at one time. I redirect stdin
to a separate file full of commands like this: ./a.out 3 < batchcmds
. When I run the code, it winds up rereading through all the commands over and over, but when I comment out the switch statement, it runs as expected (except, of course, creating processes and running commands). I've confirmed through testing that all of the child processes terminate before reaching the end of the while loop, so the main process must be getting messed with somehow, causing it to repeat reading through a file. Here's my code:
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int makearg(char*, char***);
#define BUFFERSIZE 1024
int main(int argc, char** argv) {
// reads argument
int maxpc = atoi(argv[1]);
int pc = 0;
// read from the file and assign it to a child
char command[BUFFERSIZE];
while(fgets(command, 128, stdin) > 0) {
char** args;
makearg(command, &args);
if(pc >= maxpc) {
wait(NULL);
pc--;
}
switch(fork()) { // When I comment out from here to the comment below it works as expected
case 0:
// is child
execvp(args[0], args);
printf("execv failed to run %s", command);
exit(0);
printf("child is immortal\n");
return -1;
case -1:
printf("failed to create child\n");
break;
default:
// is parent
pc++;
} // The comment below
printf("end while loop: %x:%s", getpid(), command);
}
// wait for children to finish
for(; pc > 0; pc--) wait(NULL);
return 0;
}
int makearg(char* s, char*** args) {
// count the tokens in the string
int tokencount = 1;
int slength = 1;
for(char* c = s; *c != '\0'; c++) {
if(*c == ' ')
tokencount++;
slength++;
}
// set up the array of tokens
char** tokens = (char**)malloc(tokencount*sizeof(char*));
if(tokens == NULL) {
return -1;
}
for(int i = 0; i < tokencount; i++)
tokens[i] = (char*)malloc(slength*sizeof(char));
if(tokens == NULL) {
for(int i = 0; i < tokencount; i++)
free(tokens[i]);
free(tokens);
return -1;
}
// copy data over to the token array
int itoken = 0;
int ichar = 0;
for(char* c = s; *c != '\0'; c++) {
if(*c == ' ') {
tokens[itoken][ichar] = '\0';
itoken++;
ichar = 0;
}
else if(*c != '\n') {
tokens[itoken][ichar] = *c;
ichar++;
}
}
tokens[itoken][ichar] = '\0';
*args = tokens;
return tokencount;
}
This really has me stumped, and I can't find anyone with a similar problem. Any help would be appreciated!
There are at least three problems in your program.
One is the second argument to execvp
should be a list of pointers terminated by a null pointer, but your makearg
routine does not construct such a list. Two steps are required to fix this. First, malloc(tokencount*sizeof(char*));
should be malloc((tokencount+1)*sizeof(char*));
. Second, near the end of the routine, mark the end of the list with a null pointer: tokens[tokencount] = 0;
.
Another is that forking will cause duplicated output. This is because C streams are often buffered, meaning functions such as printf
do not immediately send output to the operating system to be written to a device. Instead, the data is written into a buffer managed by the C stream interface. It is sent to the operating system when the buffer is full or when a new-line character is written (if the stream is line buffered.) Streams connected to a terminal are typically line buffered. Streams connected to a pipe are typically fully buffered.
When your program forks, any data put into the buffer by printf
(or other routines) but not yet sent to the operating system will be duplicated in the child. Eventually, both the parent and the child may send their copy of the data to the operating system, so you will see a message twice. It is possible these duplicated messages lead you to believe the program was doing things multiple times, such as reprocessing the same command. That is not the case. Only the message was duplicated, not the prior work of the program.
One way to fix that is to set the standard output stream not to be buffered, which you can do by executing setvbuf(stdout, NULL, _IONBF, 0);
at the beginning of your program. However, that loses all the benefits of buffering (notably efficiency). Another solution is to execute fflush(stdout);
immediately before a fork
call. That will send all pending output to the system, so there will be no data in the buffer to be duplicated.
A third problem is in fgets(command, 128, stdin) > 0
. The C standard does not define comparing a pointer (returned by fgets
) to a null pointer constant (0
) with >
. You can compare it with !=
. Also, it is good to use NULL
to make it clear you are comparing to a null pointer: fgets(command, 128, stdin) != NULL
.