hash cryptography hashcode

Many to one hash function


I don't know exactly what to call this, and the title is the closest thing I can think of.

What I'm looking for is a function that takes an array of strings and returns a hash string, like the following:

const inputs = [str_1, str_2, /* ... */ str_n]

const hash = generate_hash(inputs)

And then another function that takes two arguments, the previously generated hash and an arbitrary string, and returns true if that string is a member of the inputs array used to generate the hash, and false otherwise:

const random_str = "random"

const is_in_inputs = check_str(hash, random_str)

This could even be an encryption algorithm with multiple keys, where only those keys can decrypt the encrypted string.

I don't care about security; I just want something that gets the job done.

If such a thing already exists, point me in the right direction; if not, tell me how I can implement it.


Solution

  • What you're looking for is called a Bloom filter. It will never return a false negative (i.e. if it says "random" is not in the hash, it isn't), but it can return false positives.

    All fixed-size structures that do this will be probabilistic. They will have either false positives or false negatives. (You can get false negatives without false positives just by making this a cache instead of a hash.) The only way to make this fully deterministic (no false results) is to let it grow without bound. The most obvious implementation would be to have generate_hash unambiguously encode all the elements (for example, by serializing each one as a length-prefixed value) and then gzip the result. But this is no longer a "hash." It's just an encoding.

    If it were possible to do this in fixed space, you could use it to infinitely compress data, which is impossible. As with so many things in this space, the pigeonhole principle stops us.
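To make the Bloom filter idea concrete, here is a minimal sketch using the generate_hash and check_str names from the question. Everything else is an assumption made for illustration: the filter size (NUM_BITS), the number of bit positions per string (NUM_HASHES), the FNV-1a-style string hash, and the hex string encoding are not from the original post, and a real application would probably size the filter for its expected input count or use an existing library.

```javascript
// Assumed parameters (not from the original post).
const NUM_BITS = 1024; // total bits in the filter
const NUM_HASHES = 4;  // bit positions set per string

// FNV-1a-style 32-bit hash over the UTF-16 code units of str.
// The seed lets us derive two different hashes from one function.
function fnv1a(str, seed) {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Derive NUM_HASHES bit positions from two base hashes (double hashing).
function bitPositions(str) {
  const h1 = fnv1a(str, 0);
  const h2 = (fnv1a(str, 0x9e3779b9) | 1) >>> 0; // force an odd step
  const positions = [];
  for (let i = 0; i < NUM_HASHES; i++) {
    positions.push((h1 + i * h2) % NUM_BITS);
  }
  return positions;
}

// Build the filter and return it as a fixed-length hex string.
function generate_hash(inputs) {
  const bytes = new Uint8Array(NUM_BITS / 8);
  for (const str of inputs) {
    for (const pos of bitPositions(str)) {
      bytes[pos >> 3] |= 1 << (pos & 7);
    }
  }
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// True if every bit position for str is set in the filter.
// A true result may be a false positive; a false result is always correct.
function check_str(hash, str) {
  return bitPositions(str).every((pos) => {
    const start = 2 * (pos >> 3);
    const byte = parseInt(hash.slice(start, start + 2), 16);
    return (byte & (1 << (pos & 7))) !== 0;
  });
}

const hash = generate_hash(["alpha", "beta", "gamma"]);
console.log(hash.length);               // 256 hex characters for 1024 bits
console.log(check_str(hash, "beta"));   // true: members are never reported missing
console.log(check_str(hash, "random")); // usually false, but a false positive is possible
```

The output length stays the same no matter how many strings go in, which is exactly why false positives become more likely as the inputs array grows: more and more bits get set, until almost any string appears to match.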