
Read a text file into an array in plain C


Is there a way to read a text file into a one-dimensional array in plain C? Here's what I tried (I am writing hangman):

int main() {
    printf("Welcome to hangman!");

    char buffer[81];
    FILE *dictionary;
    int random_num;
    int i;
    char word_array[80368];

    srand ( time(NULL) );

    random_num = rand() % 80368 + 1;
    dictionary = fopen("dictionary.txt", "r");

    while (fgets(buffer, 80, dictionary) != NULL){
        printf(buffer); //just to make sure the code worked;
        for (i = 1; i < 80368; i++) {
            word_array[i] = *buffer;
        }
    }

    printf("%s, \n", word_array[random_num]);
    return 0;
}

What's wrong here?


Solution

  • Try changing a couple of things:

    First, you're storing a single char. word_array[i] = *buffer; copies one character (the first one on the line, i.e. the first in the buffer) into every single-char slot in word_array.

    Second, your array holds 80,368 characters, not 80,368 words. Assuming that number is the count of words in your dictionary file, the file's contents can't fit in there, whatever the loop does.

    I'm assuming you have 80,368 words in your dictionary file. That's about 400,000 fewer words than /usr/share/dict/words on my workstation, but it sounds like a reasonable size for hangman…

    If you want a one-dimensional array intentionally, for some reason, you'll have to do one of three things:

    • pretend you're on a mainframe, and use 80 chars for every word:

        char word_array[80368 * 80];
      
      memcpy (&(word_array[80 * i]), buffer, 80);
      
    • create a parallel array with indices to the start of each line in a huge buffer

         size_t last_char = 0;
         size_t word_start[80368];     /* offset of each word in word_array */
         char word_array[80368 * 80];
         for ( … i++ ) {
             size_t len = strlen(buffer) + 1;   /* include the terminating \0 */
             memcpy (&word_array[last_char], buffer, len);
             word_start[i] = last_char;
             last_char += len;
         }
      
    • switch to using an array of pointers to char, one word per slot.

        char* word_array[80368];
      
        for (int i = 0; i < 80368; i++) {
             if (NULL == fgets (buffer, 80, dictionary)) break; /* end of file */
             word_array[i] = strdup (buffer);
        }
      

    I'd recommend the last option, as otherwise you have to guess at the max size or waste a lot of RAM while reading. (If your average word length is around 4-5 chars, as in English, fixed 80-byte slots waste about 75 bytes per word on average.)

    I'd also recommend dynamically allocating the word_array:

       int max_word = 80368;
       char** word_array = malloc (max_word * sizeof (char*));
    

    … which can lead you to a safer read, if your dictionary size ever were to change:

       int i = 0;
       while (1) {
            /* If we've filled the preset word list size, increase it. */
            if ( i >= max_word ) {
                max_word *= 1.2; /* tunable arbitrary value */
                word_array = realloc (word_array, max_word * sizeof(char*));
            }
            /* Try to read a line, and… */
            char* e = fgets (buffer, 80, dictionary);
            if (NULL == e) { /* end of file */
                /* free any unused space */
                word_array = realloc (word_array, i * sizeof(char*));
                /* exit the otherwise-infinite loop */
                break;
            } else {
                /* remove any \r and/or \n end-of-line chars */
                for (char *s = &(buffer[0]); s < &(buffer[80]); ++s) {
                   if ('\r' == *s || '\n' == *s || '\0' == *s) {
                      *s = '\0'; break;
                   }
                }
                /* store a copy of the word, only, and increment the counter.
                 * Note that `strdup` will only copy up to the end-of-string \0,
                 * so you will only allocate enough memory for actual word
                 * lengths, terminal \0's, and the array of pointers itself. */
                *(word_array + i++) = strdup (buffer);
            }
        }
        /* when we reach here, word_array holds exactly i words
         * (this assumes the file contained at least one) */
        int pick = rand () % i;
        printf ("random word #%d: %s\n", pick, *(word_array + pick));
    

    Sorry, this is posted in a hurry, so I haven't tested the above. Caveat emptor.