Search code examples
arraysprobability-theory

How to randomly chose any number of elements from array while reading it


I need to randomly (with equal probability) pick some fixed number of elements from array which is in the file. I want to read file once and just keep picked elements because an array can be very long and I don't want to keep it in memory. There should be equal probability that each subarray is chosen. And also at the beginning I don't know the size of array.

How can I do it?


Solution

  • You need something called Reservoir Sampling.

    It's explained pretty well in this blog:

    http://gregable.com/2007/10/reservoir-sampling.html