I implemented the djb2 hash function like this:
unsigned int hash(const char *word)
{
unsigned int hash = 5381;
int c;
while ((c = *word++))
{
hash = ((hash << 5) + hash) + c;
}
return hash % N;
}
Elsewhere, I'm using the hash function via this function:
bool check_word(const char *word)
{
const char *word_lower = strlwr(word); // need word_lower because hash function is case sensitive
node *iterator = table[hash(word_lower)];
free((void*) word_lower);
while (iterator != NULL) // traverse linked list, looking for the given word via strcasecmp
{
if (strcasecmp(iterator->word, word) == 0)
{
return true;
}
iterator = iterator->next;
}
return false;
}
And also this function:
void fill_hash_table(const char *dictionary)
{
FILE *dict_ptr = fopen(dictionary, "r");
if (dict_ptr == NULL)
{
return;
}
// prepare char array for every word with size LENGTH + 1 because LENGTH is the guaranteed max length
char curr_word[LENGTH + 1];
while (fscanf(dict_ptr, "%s", curr_word) != EOF)
{
[...]
unsigned int table_pos = hash(curr_word);
[...]
}
[...]
}
where dictionary
represents a text file that contains line separated strings, like so:
a
ab
abc
Running Valgrind yields Conditional jump or move depends on uninitialized value(s)
, referring to while ((c = *word++))
, or word
, to be more specific.
Is there a way to avoid this?
The strlwr()
function is implemented like this:
// returns same string but lower-cased
const char *strlwr(const char *string)
{
char *string_to_lower = malloc(LENGTH + 1);
for (int i = 0; string[i]; i++)
{
string_to_lower[i] = tolower(string[i]);
}
return string_to_lower;
}
This:
while ((c = *word++))
can only cause such a warning if you are passing a word
that is not a correctly initialized and NUL terminated string.
The problem in your code is most likely caused by your strlwr()
function, which is not correctly NUL-terminating the string. You exit the for
loop at the terminator, but fail to add it to the resulting string.
The correct code would be:
const char *strlwr(const char *string)
{
char *string_to_lower = malloc(LENGTH + 1);
unsigned i;
for (i = 0; string[i]; i++)
{
string_to_lower[i] = tolower(string[i]);
}
string_to_lower[i] = '\0'; // Ensure NUL terminator!
return string_to_lower;
}
Secondly, I would suggest you to modify this:
while (fscanf(dict_ptr, "%s", curr_word) != EOF)
You're using %s
as a format specifier, which is asking for trouble. You cannot guarantee that the data that is being read will not overflow the buffer.
Use a correct format specifier that includes the buffer length, like this:
fscanf(dict_ptr, "%45s", curr_word);
Or, better, use fgets()
, which was exactly designed to read strings in a safe way:
fgets(cur_word, LENGTH, dict_ptr);
Lastly:
malloc()
.free
like this: free((void*) word_lower)
. Any pointer is automatically converted from/to void*
. The cast only hides a potential error in cases where the variable is not a pointer.unsigned
(or even size_t
) instead of int
if it does not make sense for the value to become negative.