Search code examples
algorithmhashhashtableprobabilityclrs

MIT Lecture WRONG? Analysis of open addressing in hashing


In the following MIT lecture: https://www.youtube.com/watch?v=JZHBa-rLrBA at 1:07:00 ,professor taught to calculate number of probes in unsuccessful search.

But my method of calculating doesn't matches his. My answer is:

Number of probes=enter image description here

m= no. of slots in hash table

n= no. of elements (keys)

Explanation:

1.The hash function can hit an empty slot with probability m-n/m.

2.Or it can hit a preoccupied key slot with probability n/m.

3.Now in case 2, we will have to again call hash function and there are two chances: (i) We get a slot with no key with probability (m-n)/(m-1). (ii) We get a slot with key with probability (n-1)/(m-1).

4.Now repeat case 3 but with different probabilities as shown in the image

Why am I getting different answer. What's wrong with it?


Solution

  • The problem asks us to find the expected number of probes that need to be done in a hash table.

    You must do one no matter what, so you have 1 to start with. Then, there is an n / m chance that you have a collision. You got this right in your explanation.

    If you have a collision, you must do another probe (and maybe even more). And so on, so the answer is the one the professor gets:

    1 + (n / m)(1 + ((n - 1) / (m - 1))(1 + ...))
    

    You don't multiply with the probability that you get an empty slot. You multiply the probability of not getting an empty slot with the number of operations you have to do if you don't get an empty slot (1, because you have to do at least one more probe in that case).

    It is meaningless to multiply the probability of getting an open slot with the probability of not getting one, like you're doing. Remember that we want to find the expected number of probes that we need to do. So you multiply the number of operations (probes) at each step with the probability that you don't get what you'd ideally like to get (an empty slot), because if this event happens, then we'll have to do more operations (probes), otherwise we're done.

    This is explained very well in the lecture you linked to if you watch carefully until the end.