I'm working on hash table in C language and I'm testing hash function for string.
The first function I've tried is to add ascii code and use modulo (% 100
) but i've got poor results with the first test of data: 40 collisions for 130 words.
The final input data will contain 8000 words (it's a dictionary stores in a file). The hash table is declared as int table[10000]
and contains the position of the word in a .txt file.
I've had nice results with djb2
by Dan Bernstein.
unsigned long
hash(unsigned char *str)
{
unsigned long hash = 5381;
int c;
while (c = *str++)
hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
return hash;
}