Tags: algorithm, probability, clrs

Elaboration on the uniform randomness of selection of a key


Consider this question: "Efficiently picking a random element from a chained hash table?"

And consider its first answer. It suggests a method to select a key in a uniformly random fashion. How that achieves uniformity, however, is unclear to me. The first step selects a bucket with probability 1/m (i.e. one of m buckets, uniformly at random).
And the second step can be divided into two cases:
1) If the random index falls within the chain (probability k/L for a chain of length k), the element p at that index is returned.
2) Otherwise, the loop runs again.
This is repeated until some p is returned.
So the probability of a key being selected is:
(1/m) [ k1/L + ((L-k1)/L) (1/m) [ k2/L + ((L-k2)/L) (1/m) [ k3/L + ((L-k3)/L) [ ... and so on.
Now how can this be equal to 1/n?
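For concreteness, the procedure being analyzed can be sketched like this (a minimal sketch; the function name and the representation of the table as a list of m bucket lists are my own):

```python
import random

def pick_uniform(table, L):
    """Pick a key uniformly at random from a chained hash table.

    `table` is a list of m buckets (lists of keys); `L` is an upper
    bound on all chain lengths.
    """
    m = len(table)
    while True:
        bucket = table[random.randrange(m)]  # step 1: uniform bucket, prob 1/m
        j = random.randrange(L)              # step 2: uniform index in 0..L-1
        if j < len(bucket):                  # accept only if the index is occupied
            return bucket[j]
        # otherwise reject and restart BOTH steps
```

Note that a rejection restarts the whole loop, including the bucket choice in step 1.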


Solution

  • This is a form of rejection-sampling.

    Remark:

• it looks like you split the two steps and then looped on the second step only (my interpretation of your calculation formula)
• redoing all steps on every iteration is the most important aspect of the algorithm!
  • that is the basic idea of rejection sampling: we sample from some surrounding density and have to sample again if the selected sample is not within our target range (that's very informal; read the above link)
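To see why restarting both steps matters, here is a quick exact check (my own illustration, not from the answer), taking e.g. two buckets with chain lengths 2 and 4: if only step 2 looped while keeping the chosen bucket, each element of bucket i would end up with probability (1/m)·(1/k_i), which is not uniform:

```python
from fractions import Fraction

# Suppose m = 2 buckets with chain lengths 2 and 4 (6 elements total).
# If we kept the chosen bucket and only re-drew the index until it hit,
# each element of bucket i would be returned with probability (1/m)*(1/k_i):
m = 2
chains = [2, 4]
per_element = [Fraction(1, m) * Fraction(1, k) for k in chains]
assert per_element == [Fraction(1, 4), Fraction(1, 8)]  # 1/4 vs 1/8, not the uniform 1/6
```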

    Why this approach:

• Imagine there are 2 buckets, where b0 has 2 elements and b1 has 4 elements
    • Step 1 is selecting one bucket uniformly
  • But because b0 has a different number of elements than b1, the actual sampling in step 2 needs to be adapted somehow to the number of elements (or we will lose uniformity)
      • We don't have this full information, we only got the upper-bound L on all chains
  • Meaning: we use the rejection idea to sample from the max range L, and accept if the index is compatible with the bucket. So if the selected bucket has half as many elements as the biggest one, it needs to be aborted (restart with step 1) 50% of the time. It's like inserting fake elements into all buckets so that the element counts are equal: then sample, check whether a real or a fake element was selected, and start over if a fake one was hit.
  • It's easy to see that b0 gets chosen 50% of the time, the same as b1
  • When sampling within b0, the process gets aborted 50% of the time, because k=2, L=4 (L comes from the number of elements in b1)
  • When sampling within b1, the process never gets aborted (k=L)
  • If there were no chance of aborting, each element of b0 would be selected twice as often (L / size-within-b0 = L/2 = 2) as each element of b1, because the bucket is selected uniformly but the numbers of elements to sample from differ.
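Following the numbers above, a quick exact computation (a sketch using Python's fractions; the variable names are mine) confirms that the geometric restarts cancel out and every element is returned with probability 1/n:

```python
from fractions import Fraction

# Two buckets as above: b0 has 2 elements, b1 has 4, so m = 2, L = 4, n = 6.
m, L, n = 2, 4, 6
chains = [2, 4]

# In one round, a FIXED element is returned with probability (1/m)*(1/L),
# and SOME element is returned with probability sum_i (1/m)*(k_i/L) = n/(m*L).
p_fixed = Fraction(1, m * L)
p_accept = sum(Fraction(k, m * L) for k in chains)

# Conditioning on eventual acceptance, the restart factor cancels:
p_element = p_fixed / p_accept
assert p_element == Fraction(1, n)  # each of the 6 elements: exactly 1/6
```

This is also the answer to the original question: per round every key has the same success probability 1/(mL), so after dividing by the total acceptance probability n/(mL), each key is selected with probability exactly 1/n.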