Search code examples
clinked-listhashtablecs50

Can someone check the logic of my code that implements a hash table?


I'm trying to implement a program that takes in a .txt file that is filled with words from the dictionary, the goal of the load function is to open the file then read through it, and store all the words in a hash table. I am quite new to data structures, so I am unsure about my logic that loads the word in the hash table.

// Represents a node in a hash table
typedef struct node {
    char word[LENGTH + 1];
    struct node *next;
} node;

// Number of buckets in hash table
//                  N = 2 ^ 13
const unsigned int N = 8192;

// Hash table
node *table[N];

This is the load function which is responsible for reading all the words from the dictionary file then loading them all into the hash table wherein their index will be decided when they are passed to the hash function djb2 taken from the internet.

// Loads dictionary into memory, returning true if successful else false
bool load(const char *dictionary) {
    // TODO
    const int max_len = LENGTH + 1;
    char words[max_len];
    int index;

    // Initialize the array to NULL
    for (int i = 0; i < N; i++) {
        table[i] = NULL;
    }

    // Open the dictionary file
    FILE indata = fopen(dictionary, "r")
    if (indata == NULL) {
        return false;
    }

    // loop to get the words from the file
    while (fgets(words, max_len, indata) != NULL) {
        // get the index of the word using the hash function
        index = hash(words);

        // create a new node and check if the computer has memory
        node *newNode = malloc(sizeof(node));
        if (newNode == NULL) {
            return false;
        }

        if (table[index] == NULL) {
            newNode->word = words;
            newNode->next = NULL;
            table[index] = newNode;
        } else {
            newNode->word = words;
            newNode->next = table[index];
            table[index] = newNode;
        }
    }
    return true;
}

Solution

  • There are multiple problems in your code:

    • the array definition in load() uses the C99 variable length array syntax. Some compilers will not accept this syntax. Just write char words[LENGTH + 1]; and pass sizeof words as the size argument to fgets();
    • the array word[] is defined with a size of MAXLENGTH+1. This will pose problems for words with MAXLENGTH or more bytes from the file as fgets() will not read the full line. You should probably make the array larger and test if input lines get truncated.
    • you do not strip the trailing newline left in the array by fgets(), so your entries in the hash table will have a trailing newline, which is probably not what you intend.
    • you cannot assign an array to an array: either make the nodes hold a pointer to the string or use strcpy() to copy the word.
    • there is no need to test if there is already a node in the bucket for the hash index: you can just write:

            strcpy(newNode->word, words);
            newNode->next = table[index];
            table[index] = newNode;
      
    • you forgot to close indata before returning from load().