Search code examples
cprocessiofork

Creating a fork causing fgets to reread a file infinitely


I'm trying to write a program that reads through a bunch of Unix commands, and creates child processes to execute them. It has one argument that determines the maximum number of child processes I want running at one time. I redirect stdin to a separate file full of commands like this: ./a.out 3 < batchcmds. When I run the code, it winds up rereading through all the commands over and over, but when I comment out the switch statement, it runs as expected (except, of course, creating processes and running commands). I've confirmed through testing that all of the child processes terminate before reaching the end of the while loop, so the main process must be getting messed with somehow, causing it to repeat reading through a file. Here's my code:

#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int makearg(char*, char***);

#define BUFFERSIZE 1024

int main(int argc, char** argv) {
        // reads argument
        int maxpc = atoi(argv[1]);
        int pc = 0;

        // read from the file and assign it to a child
        char command[BUFFERSIZE];
        while(fgets(command, 128, stdin) > 0) {
                char** args;
                makearg(command, &args);

                if(pc >= maxpc) {
                        wait(NULL);
                        pc--;
                }

                switch(fork()) {   // When I comment out from here to the comment below it works as expected
                        case 0:
                                // is child
                                execvp(args[0], args);
                                printf("execv failed to run %s", command);
                                exit(0);
                                printf("child is immortal\n");
                                return -1;
                        case -1:
                                printf("failed to create child\n");
                                break;
                        default:
                                // is parent
                                pc++;
                }  // The comment below
                printf("end while loop: %x:%s", getpid(), command);
        }

        // wait for children to finish
        for(; pc > 0; pc--) wait(NULL);

        return 0;
}

int makearg(char* s, char*** args) {
        // count the tokens in the string
        int tokencount = 1;
        int slength = 1;
        for(char* c = s; *c != '\0'; c++) {
                if(*c == ' ')
                        tokencount++;
                slength++;
        }

        // set up the array of tokens
        char** tokens = (char**)malloc(tokencount*sizeof(char*));
        if(tokens == NULL) {
                return -1;
        }
        for(int i = 0; i < tokencount; i++)
                tokens[i] = (char*)malloc(slength*sizeof(char));
        if(tokens == NULL) {
                for(int i = 0; i < tokencount; i++)
                        free(tokens[i]);
                free(tokens);
                return -1;
        }

        // copy data over to the token array
        int itoken = 0;
        int ichar = 0;
        for(char* c = s; *c != '\0'; c++) {
                if(*c == ' ') {
                        tokens[itoken][ichar] = '\0';
                        itoken++;
                        ichar = 0;
                }
                else if(*c != '\n') {
                        tokens[itoken][ichar] = *c;
                        ichar++;
                }
        }
        tokens[itoken][ichar] = '\0';

        *args = tokens;
        return tokencount;
}

This really has me stumped, and I can't find anyone with a similar problem. Any help would be appreciated!


Solution

  • There are at least three problems in your program.

    One is the second argument to execvp should be a list of pointers terminated by a null pointer, but your makearg routine does not construct such a list. Two steps are required to fix this. First, malloc(tokencount*sizeof(char*)); should be malloc((tokencount+1)*sizeof(char*));. Second, near the end of the routine, mark the end of the list with a null pointer: tokens[tokencount] = 0;.

    Another is that forking will cause duplicated output. This is because C streams are often buffered, meaning functions such as printf do not immediately send output to the operating system to be written to a device. Instead, the data is written into a buffer managed by the C stream interface. It is sent to the operating system when the buffer is full or when a new-line character is written (if the stream is line buffered.) Streams connected to a terminal are typically line buffered. Streams connected to a pipe are typically fully buffered.

    When your program forks, any data put into the buffer by printf (or other routines) but not yet sent to the operating system will be duplicated in the child. Eventually, both the parent and the child may send their copy of the data to the operating system, so you will see a message twice. It is possible these duplicated messages lead you to believe the program was doing things multiple times, such as reprocessing the same command. That is not the case. Only the message was duplicated, not the prior work of the program.

    One way to fix that is to set the standard output stream not to be buffered, which you can do by executing setvbuf(stdout, NULL, _IONBF, 0); at the beginning of your program. However, that loses all the benefits of buffering (notably efficiency). Another solution is to execute fflush(stdout); immediately before a fork call. That will send all pending output to the system, so there will be no data in the buffer to be duplicated.

    A third problem is in fgets(command, 128, stdin) > 0. The C standard does not define comparing a pointer (returned by fgets) to a null pointer constant (0) with >. You can compare it with !=. Also, it is good to use NULL to make it clear you are comparing to a null pointer: fgets(command, 128, stdin) != NULL.