Can anyone please explain how this hash function work? I have spent a lot of time trying to figure it out and still don't know how it works.
Full code is from https://gist.github.com/choaimeloo/ffb96f7e43d67e81f0d44c08837f5944#file-dictionary-c-L30
// Hashes the word (hash function posted on reddit by delipity)
// The word you want to hash is contained within new node, arrow, word.
// Hashing that will give you the index. Then you insert word into linked list.
int hash_index(char *hash_this)
{
unsigned int hash = 0;
for (int i = 0, n = strlen(hash_this); i < n; i++)
{
hash = (hash << 2) ^ hash_this[i];
}
return hash % HASHTABLE_SIZE;
}
I dont understand why he uses (<< and ^) ?
Also why did he use strlen(hash_this)?
The purpose of a hash function ist to retrieve a unique value for a given sequence (of bytes, characters, ...).
Therefore you need the length of the sequence, here with 'strlen'.
Without bit shift operator (<<) you would get the same result for the sequence 'abc' and 'cba'.
The xor operator (^) 'scrambles' / 'hashes' the current value further, so it becomes more unlikley, that similar sequences results in an equivalent value (imagine sequences with a certain pattern, like 'abcabc...').