
Self-Correcting Probability Distribution - Maintain randomness while pulling each outcome's frequency towards its intended probability


This is a common problem: you want to introduce randomness, but at the same time you want your experiment to stick close to the intended probability distribution, and you cannot, or do not want to, rely on the law of large numbers.

Say you have programmed a coin with a 50-50 chance of heads or tails. If you simulate it 100 times, most likely you will get something close to the intended 50-50 (a binomial distribution centered at 50-50).

But what if you wanted similar certainty for any number of repeats of the experiment?
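
As a quick illustration, here is a small Python sketch (my own, not part of the client's setup) of plain independent sampling: each run of 100 flips usually lands near 50-50, but the spread between runs is noticeable, and nothing forces any single run to balance out.

    import random

    def count_heads(flips):
        # simulate `flips` independent fair-coin tosses and count the heads
        return sum(1 for _ in range(flips) if random.random() < 0.5)

    # five independent runs of 100 flips each; expect values near 50,
    # with a typical spread of about +/- 5 (the binomial standard deviation)
    print([count_heads(100) for _ in range(5)])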

A client of ours asked us this ::

We may also need to add some restrictions on some of the randomizations (e.g. if spatial location of our stimuli is totally random, the program could present too many stimuli in some locations and not very many in others. Locations should be equally sampled, so more of an array that is shuffled instead of randomization with replacement).

So they wanted randomness they could control.
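
For context, the shuffled-array idea the client alludes to can be sketched in a few lines of Python (the counts below are made up for illustration): build a pool in which every location appears equally often, then shuffle it and walk through it.

    import random

    locations = list(range(8))      # 8 candidate stimulus locations (made-up number)
    repeats_per_location = 5        # how many times each location should be used

    # every location appears exactly repeats_per_location times,
    # but the presentation order is random
    pool = locations * repeats_per_location
    random.shuffle(pool)
    print(pool[:10])                # the first few trial locations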


Solution

  • Implementation details aside (arrays vs. other methods), the desired result for our client's problem was the following ::

    Always have as close as possible to 1 / N of the stimuli in each of the N potential locations, yet do so in a randomized (hard-to-predict) way.

    This is commonly needed in games (when distributing objects, characters, stats, ...), and I would imagine in many other applications.


    My preferred method for dealing with this is to dynamically weight the intended probabilities based on how the experiment has gone so far. This effectively moves us away from independently drawn variables.

    1. Let p[i] be the wanted probability of outcome i
    2. Let N[i] be the number of times outcome i has happened up to now
    3. Let N be the sum of N[] for all outcomes i
    4. Let w[i] be the correcting weight for i
    5. Let W_Max be the maximum weight you want to assign (i.e. the weight used when an outcome has occurred 0 times)
    6. Let P[i] be the unnormalized probability for i
    7. Let p_c[i] be the corrected probability for i

    p[i] is fixed and provided by the design. N[i] is an accumulation - every time i happens, increment N[i] by 1.

    w[i] is given by

    w[i] = CalculateWeight(p[i], N[i], N, W_Max)
    {
        if (N == 0) return 1;           // no history yet: no correction
        if (N[i] == 0) return W_Max;    // outcome i never seen: maximum boost

        intended = p[i] * N;            // how often i should have occurred so far
        current = N[i];                 // how often it actually has occurred

        return intended / current;      // > 1 if i is lagging, < 1 if it is ahead
    }
    

    And P[i] is given by

    P[i] = p[i] * w[i]
    

    Then we calculate p_c[i] as

    p_c[i] = P[i] / sum_j(P[j])
    

    And we run the next iteration of our random experiment (sampling) with p_c[i] instead of p[i] for outcome i.
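
    Putting the steps together, here is a minimal Python sketch of the whole loop (the function names, the choice of random.choices for sampling, and W_Max = 10 are my own; the method itself does not depend on them) ::

    import random

    def calculate_weight(p_i, n_i, n_total, w_max):
        # the correcting weight w[i] described above
        if n_total == 0:
            return 1.0
        if n_i == 0:
            return w_max
        intended = p_i * n_total        # how often i should have occurred so far
        return intended / n_i           # > 1 if i is lagging, < 1 if it is ahead

    def corrected_probabilities(p, counts, w_max=10.0):
        # P[i] = p[i] * w[i], then normalize so the corrected values sum to 1
        n_total = sum(counts)
        unnormalized = [p_i * calculate_weight(p_i, n_i, n_total, w_max)
                        for p_i, n_i in zip(p, counts)]
        total = sum(unnormalized)
        return [u / total for u in unnormalized]

    def next_outcome(p, counts, w_max=10.0):
        # sample with the corrected distribution, then update the counts
        p_c = corrected_probabilities(p, counts, w_max)
        outcome = random.choices(range(len(p)), weights=p_c)[0]
        counts[outcome] += 1
        return outcome

    # example: 4 equally likely stimulus locations
    p = [0.25, 0.25, 0.25, 0.25]
    counts = [0, 0, 0, 0]
    print([next_outcome(p, counts) for _ in range(20)], counts)

    With the correction in place, each location's count should typically end up within a draw or two of the ideal 5 after these 20 samples, while the exact order remains hard to predict.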

    The main drawback is that the extra control comes at the cost of predictability. After 4 tails in a row, it's highly likely you will see a head.
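
    To make that concrete with the fair coin from earlier (and an arbitrary W_Max of 10): after 4 tails and 0 heads we have N = 4, so w[heads] = W_Max = 10 and w[tails] = (0.5 * 4) / 4 = 0.5, giving P[heads] = 5, P[tails] = 0.25, and p_c[heads] = 5 / 5.25 ≈ 0.95. Anyone who knows the scheme can predict the next flip with high confidence.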


    Note 1 :: At any step, the described method yields a distribution close to the original if the experiment's results so far match the intended frequencies; otherwise it is skewed towards outcomes that have occurred less often than intended, and away from those that have occurred more often.


    Note 2 :: You can introduce a "control" parameter c and add an extra step.

    p_c2[i] = c * p_c[i] + (1-c) * p[i]
    

    For c = 1, this reduces to the described method; for c = 0, it reduces to the original probabilities (independently drawn variables).
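
    Continuing the coin example above, c = 0.5 would soften the next-flip probability of heads from roughly 0.95 to 0.5 * 0.95 + 0.5 * 0.5 ≈ 0.73. In code, the blend is a one-line step applied after normalization (a sketch reusing the names from the Python above) ::

    def blended_probabilities(p_c, p, c):
        # interpolate between the corrected and the original probabilities
        return [c * pc_i + (1 - c) * p_i for pc_i, p_i in zip(p_c, p)]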