From Programming Pearls: Column 12: A Sample Problem:
The input consists of two integers m and n, with m < n. The output is a sorted list of m random integers in the range 0..n-1 in which no integer occurs more than once. For probability buffs, we desire a sorted selection without replacement in which each selection occurs with equal probability.
The author provides one solution:
initialize set S to empty
size = 0
while size < m do
t = bigrand() % n
if t is not in S
insert t into S
size++
print the elements of S in sorted order
In the above pseudocode, bigrand()
is a function returns a large random integer (much larger than m and n).
Can anyone help me prove the correctness of the above algorithm?
According to my understanding, every output should have the probability of 1/C(n, m). How to prove the above algorithm can guarantee the output with the probability of 1/C(n, m)?
Each solution this algorithm yields is valid.
How many solutions are there?
Up to last line there (sorting) are n*(n-1)*(n-2)*..*(n-m)
different permutations or
n!/(n-m)!
and each result has same probability
When you sort you reduce number of possible solutions by m!.
So number of possible outputs is n!/((n-m)!*m!)
and this is what you asked for.
n!/((n-m)!m!) = C(n,m)