c, readfile, fgets, strtok

While reading from a file and using strtok, the while loop only prints the first word of each line


I am a Java programmer trying my luck at C. I am trying to read a file line by line and then count each individual word. So far I have not managed to separate each line into words. I can see each line and loop through the file correctly, but my output is only the first word of each line. What am I doing wrong here?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char printword[1024] = "";

void print(void) {
    printf("%s", printword);
}

int main(void)
{
    FILE *f;
    errno_t err;
    err = fopen_s(&f, FILE_NAME, "r");
    if (err != 0 || f == NULL) {   /* fopen_s reports failure through its return value */
        exit(EXIT_FAILURE);
    }
    char line[1024];
    while (fgets(line, 1024, f) != NULL) {
        char *word;
        word = strtok(line, " ");
        while (word != NULL) {
            strcpy(printword, strcat(word, " "));
            print();
            word = strtok(NULL, " ");
        }
        printf("\n");
    }
    fclose(f);
    printf("Press any key to continue");
    getchar();
    exit(0);
}

Solution

  • @BlueStrat appears to have put his finger on the issue with his comment.

    When using strtok(), you must always remember that it does not allocate any memory, but instead returns pointers into the original string (inserting terminators in place of delimiters), and maintains an internal static pointer to the start of the next token. Suppose, then, that the first line of your input file contains

    one two three
    

    fgets() will read that into your line array (the trailing newline that fgets() keeps is left out here for simplicity):

           0         1
    offset 0123456789012 3
    line   one two three\0
    

    The first strtok() call returns a pointer to the character at offset 0, sets the character at offset 3 to a terminator, and sets its internal state variable to point to the character at offset 4:

           0         1
    offset 012 3456789012 3
    line   one\0two three\0
           ^    ^
           |    |
           |    +-- (next)
           +------- word 
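
    The state after that first call can be checked directly. The following is a minimal sketch, assuming the line holds exactly "one two three" with no trailing newline; a hard-coded array stands in for the buffer that fgets() fills:

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* Stand-in for the buffer filled by fgets(); it must be writable. */
        char line[] = "one two three";

        char *word = strtok(line, " ");

        /* The token is not a copy: it is the start of line itself. */
        assert(word == line);
        /* The delimiter at offset 3 has been overwritten with a terminator. */
        assert(line[3] == '\0');
        /* The rest of the line is untouched and begins at offset 4. */
        assert(strcmp(line + 4, "two three") == 0);

        printf("token \"%s\" at offset %d\n", word, (int)(word - line));
        return 0;
    }
    ```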
    

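    Then you strcat() an extra space onto the end of word, which overwrites the terminator at offset 3 with the space and writes a new terminator at offset 4, on top of the 't':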

           0         1
    offset 0123 456789012 3
    line   one \0wo three\0
           ^    ^
           |    |
           |    +-- (next)
           +------- word 
    

    Now study that for a moment. Not only have you corrupted the data following the first token, you have done it in such a way that the internal state pointer points to a string terminator. When you next call strtok(), that function sees that it is at the end of a string (not the real end of your line, but the terminator that strcat() just wrote), and returns NULL to signal that there are no more tokens.
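
    The failure is easy to reproduce in isolation. This is a small sketch rather than the asker's full program: count_tokens_with_strcat is a hypothetical helper that repeats the strtok()/strcat() pattern from the question on a hard-coded line and counts how many tokens come back.

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Tokenizes line the way the question does and returns how many tokens
       strtok() yields. line needs at least one spare byte after its
       terminator, because strcat() grows the token by one character. */
    static int count_tokens_with_strcat(char *line)
    {
        int count = 0;
        char *word = strtok(line, " ");
        while (word != NULL) {
            strcat(word, " ");          /* overwrites the terminator strtok() wrote */
            count++;
            word = strtok(NULL, " ");   /* resumes at what is now a terminator */
        }
        return count;
    }

    int main(void)
    {
        char line[1024] = "one two three";
        printf("tokens found: %d\n", count_tokens_with_strcat(line));  /* prints 1 */
        return 0;
    }
    ```

    Only one token is reported even though the line has three words, which matches the "first word of each line" symptom.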

    Instead of manipulating the token, which is perilous, copy its contents into the printword buffer and then concatenate the extra space onto that.
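
    One way to apply that advice is sketched below. The names print_words and MAX_LINE are illustrative and not from the question, a hard-coded line stands in for the fgets() loop, and adding '\n' to the delimiter set is an extra assumption so that the newline fgets() keeps does not end up inside the last token.

    ```c
    #include <stdio.h>
    #include <string.h>

    #define MAX_LINE 1024

    /* Prints each token of line followed by a space, then a newline, and
       returns the number of tokens. Assumes line holds fewer than MAX_LINE
       characters. The tokens themselves are never modified; each one is
       copied into a separate buffer first. */
    static int print_words(char *line)
    {
        char printword[MAX_LINE + 2];   /* room for a token, a space, and '\0' */
        int count = 0;

        for (char *word = strtok(line, " \n"); word != NULL;
             word = strtok(NULL, " \n")) {
            strcpy(printword, word);    /* copy the token out of line */
            strcat(printword, " ");     /* append to the copy, not the token */
            printf("%s", printword);
            count++;
        }
        printf("\n");
        return count;
    }

    int main(void)
    {
        char line[MAX_LINE] = "one two three\n";   /* stand-in for fgets() input */
        print_words(line);
        return 0;
    }
    ```

    In the original program, the body of the fgets() loop would call print_words(line) once per line. Because nothing is ever written through word, the terminators and the saved position that strtok() relies on stay intact.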