To retrieve k random numbers from an array of undetermined size we use a technique called reservoir sampling. Can anybody briefly highlight how it happens with a sample code?
I actually did not realize there was a name for this, so I proved and implemented this from scratch:
def random_subset(iterator, K):
result = []
N = 0
for item in iterator:
N += 1
if len(result) < K:
result.append(item)
else:
s = int(random.random() * N)
if s < K:
result[s] = item
return result
With proof near the end.