I have seen the answers on here stating to use gperf, however, I would prefer to roll my own based on the proof that I create for the domain of strings
with a fixed length of <= 200
Based on the calculations I have from wolfram I get ~7.9 x 10^374
total permutations. Therefore my line of thinking is if I have a 2048
bit hash function (3.2 x 10^616
) I should be able to handle the entire universe of strings that I need to process. My question is how can I prove that the hash implementation I end up producing will be perfect given the constraint of the universe of all strings of length 200 or less?
Strings with a length of 200 characters only have 200 * 8 = 1600 bits. If a 2048 bit hash is OK for your purpose, you could just use the string bits as a perfect hash. The identity hash function is perfect, as it maps each input to a distinct hash value (obviously, because there is no mapping).