Creating random hash function that returns a set

Consider the following code:

def xorHash(n):
    mask = random.getrandbits(32)
    def xorIt(x):
        return (hash(x) ^ mask) % n
    return xorIt

This returns a random hash function which maps elements into a number in {0,1,...,rng-1}.

I want to create a random hash function that maps each element into exactly k elements in {0,1,...,rng-1} (without repetitions). The example above does the job for k=1.

What is the most efficient way of creating a random hash function which returns a k-sized random subset of {0,1,...,rng-1}?

Solution

Seed an RNG with an ordinary integer-valued randomized hash of your data and use it to draw a random sample from the desired range:

def generate_randomized_set_valued_hash_function(n, k):
    hashfunc = generate_randomized_hash_function()
    def set_valued_hashfunc(x):
        rng = random.Random(hashfunc(x))
        return set(rng.sample(xrange(n), k))
    return set_valued_hashfunc

What RNG and what integer-valued hash function you choose will depend on how strong and how fast you need your set-valued hash function to be.