Search code examples
cdata-structureslinked-listhashcodehash-function

I don't understand this hash-function code


Can anyone please explain how this hash function work? I have spent a lot of time trying to figure it out and still don't know how it works.

Full code is from https://gist.github.com/choaimeloo/ffb96f7e43d67e81f0d44c08837f5944#file-dictionary-c-L30

// Hashes the word (hash function posted on reddit by delipity)
// The word you want to hash is contained within new node, arrow, word.
// Hashing that will give you the index. Then you insert word into linked list.

int hash_index(char *hash_this)
{
    unsigned int hash = 0;
    for (int i = 0, n = strlen(hash_this); i < n; i++)
    {
        hash = (hash << 2) ^ hash_this[i];
    }
    return hash % HASHTABLE_SIZE;
}

I dont understand why he uses (<< and ^) ?

Also why did he use strlen(hash_this)?


Solution

  • The purpose of a hash function ist to retrieve a unique value for a given sequence (of bytes, characters, ...).

    Therefore you need the length of the sequence, here with 'strlen'.

    Without bit shift operator (<<) you would get the same result for the sequence 'abc' and 'cba'.

    The xor operator (^) 'scrambles' / 'hashes' the current value further, so it becomes more unlikley, that similar sequences results in an equivalent value (imagine sequences with a certain pattern, like 'abcabc...').