algorithm, hash, tags, 32-bit

What is the best 32-bit hash function for short strings (tag names)?


What is the best 32-bit hash function for relatively short strings?

Strings are tag names that consist of English letters, numbers, spaces and some additional characters (#, $, ., ...). For example: Unit testing, C# 2.0.

By 'best' I mean 'minimal collisions'; performance is not important for my goals.


Solution

  • If performance isn't important, simply take a cryptographic hash such as MD5 or SHA-1 and truncate its output to 32 bits. This will give you a distribution of hash codes that is, for practical purposes, indistinguishable from random.
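
As a minimal sketch of the truncation idea, assuming Python and its standard `hashlib` module (the answer does not name a language), the function below hashes a tag with SHA-1 and keeps the first 4 bytes of the digest. The names `hash32` and `expected_colliding_pairs` are hypothetical, chosen for illustration. The second helper assumes the hash codes really do behave like uniform random 32-bit values and uses the birthday estimate n(n-1)/2 / 2^32 for the expected number of colliding pairs among n distinct tags.

```python
import hashlib


def hash32(tag: str) -> int:
    """Return a 32-bit hash of `tag` by truncating its SHA-1 digest.

    Hypothetical helper: encodes the tag as UTF-8, hashes it, and
    interprets the first 4 bytes of the digest as a big-endian integer.
    """
    digest = hashlib.sha1(tag.encode("utf-8")).digest()
    # Keep only the first 4 of the 20 digest bytes (32 bits).
    return int.from_bytes(digest[:4], "big")


def expected_colliding_pairs(n: int) -> float:
    """Birthday estimate of colliding pairs among n distinct tags.

    Assumes hash codes are uniformly random over 2**32 values.
    """
    return n * (n - 1) / 2 / 2**32


if __name__ == "__main__":
    for tag in ["Unit testing", "C# 2.0"]:
        print(f"{tag!r}: {hash32(tag):#010x}")
    print(expected_colliding_pairs(10_000))
```

Under that uniformity assumption, 10,000 distinct tags would be expected to produce roughly 0.01 colliding pairs, so collisions should be rare at that scale. If tag names must compare equal regardless of case or surrounding whitespace, normalize them before hashing; that step is not part of the sketch.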