Search code examples
cvalgrind

Valgrind: Conditional jump or move depends on uninitialised value(s) when using hash function


I implemented the djb2 hash function like this:

unsigned int hash(const char *word)
{
    unsigned int hash = 5381;
    int c;

    while ((c = *word++))
    {
        hash = ((hash << 5) + hash) + c;
    }

    return hash % N;
}

Elsewhere, I'm using the hash function via this function:

bool check_word(const char *word)
{
    const char *word_lower = strlwr(word); // need word_lower because hash function is case sensitive
    node *iterator = table[hash(word_lower)];
    free((void*) word_lower);
    while (iterator != NULL) // traverse linked list, looking for the given word via strcasecmp
    {
        if (strcasecmp(iterator->word, word) == 0)
        {
            return true;
        }
        iterator = iterator->next;
    }
    return false;
}

And also this function:

void fill_hash_table(const char *dictionary)
{
    FILE *dict_ptr = fopen(dictionary, "r");
    if (dict_ptr == NULL)
    {
        return;
    }

    // prepare char array for every word with size LENGTH + 1 because LENGTH is the guaranteed max length
    char curr_word[LENGTH + 1];
    while (fscanf(dict_ptr, "%s", curr_word) != EOF)
    {
        [...]
        unsigned int table_pos = hash(curr_word);
        [...]
    }
    [...]
}

where dictionary represents a text file that contains line separated strings, like so:

a
ab
abc

Running Valgrind yields Conditional jump or move depends on uninitialized value(s), referring to while ((c = *word++)), or word, to be more specific.

Is there a way to avoid this?


The strlwr() function is implemented like this:

// returns same string but lower-cased
const char *strlwr(const char *string)
{
    char *string_to_lower = malloc(LENGTH + 1);
    for (int i = 0; string[i]; i++)
    {
        string_to_lower[i] = tolower(string[i]);
    }
    return string_to_lower;
}

Solution

  • This:

    while ((c = *word++))
    

    can only cause such a warning if you are passing a word that is not a correctly initialized and NUL terminated string.

    The problem in your code is most likely caused by your strlwr() function, which is not correctly NUL-terminating the string. You exit the for loop at the terminator, but fail to add it to the resulting string.

    The correct code would be:

    const char *strlwr(const char *string)
    {
        char *string_to_lower = malloc(LENGTH + 1);
        unsigned i;
    
        for (i = 0; string[i]; i++)
        {
            string_to_lower[i] = tolower(string[i]);
        }
    
        string_to_lower[i] = '\0'; // Ensure NUL terminator!
        return string_to_lower;
    }
    

    Secondly, I would suggest you to modify this:

    while (fscanf(dict_ptr, "%s", curr_word) != EOF)
    

    You're using %s as a format specifier, which is asking for trouble. You cannot guarantee that the data that is being read will not overflow the buffer.

    Use a correct format specifier that includes the buffer length, like this:

    fscanf(dict_ptr, "%45s", curr_word);
    

    Or, better, use fgets(), which was exactly designed to read strings in a safe way:

    fgets(cur_word, LENGTH, dict_ptr);
    

    Lastly:

    • You should check the return value of malloc().
    • You should avoid casting the pointer passed to free like this: free((void*) word_lower). Any pointer is automatically converted from/to void*. The cast only hides a potential error in cases where the variable is not a pointer.
    • You should use unsigned (or even size_t) instead of int if it does not make sense for the value to become negative.