Search code examples
calgorithmhashdictionaryhashtable

hash function for string


I'm working on hash table in C language and I'm testing hash function for string.

The first function I've tried is to add ascii code and use modulo (% 100) but i've got poor results with the first test of data: 40 collisions for 130 words.

The final input data will contain 8000 words (it's a dictionary stores in a file). The hash table is declared as int table[10000] and contains the position of the word in a .txt file.

  • Which is the best algorithm for hashing string?
  • And how to determinate the size of hash table?

Solution

  • I've had nice results with djb2 by Dan Bernstein.

    unsigned long
    hash(unsigned char *str)
    {
        unsigned long hash = 5381;
        int c;
    
        while (c = *str++)
            hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    
        return hash;
    }