Search code examples
randomnormal-distributioncoin-flipping

Dealing with normal (Gaussian) distribution


I basically stucked on rather simple problem:

Toss N coins and discover, how many of them land heads

Solution performance must not depend on N, so we can't just call Math.random() < 0.5 N times. Obviously, there is Gaussian distribution to the rescue.

I used Box-Muller method for it:

function gaussian_random(mean, variance) {
  var s;
  var x;
  var y;
  do {
    x = Math.random() * 2.0 - 1.0;
    y = Math.random() * 2.0 - 1.0;
    s = Math.pow(x, 2) + Math.pow(y, 2);
  } while ( (s > 1) || (s == 0) );

  var gaussian = x * Math.sqrt(-2*Math.log(s)/s);
  return mean + gaussian * Math.sqrt(variance);
}

Math says, that mean of N coin tosses is N/2 and variance is N/4

Then, I made test, which tosses N coins M times, giving Minimum, Maximum, and Average number of heads.

I compared results of naive approach (Math.random() many times) and Gaussian Box-Muller approach.

There is typical output of tests:

Toss 1000 coins, 10000 times
Straight method: 
Elapsed time: 127.330 ms
Minimum: 434
Maximum: 558
Average: 500.0306
Box-Muller method: 
Elapsed time: 2.575 ms
Minimum: 438.0112461962819
Maximum: 562.9739632480057
Average: 499.96195358695064

Toss 10 coins, 10000 times
Straight method: 
Elapsed time: 2.100 ms
Minimum: 0
Maximum: 10
Average: 5.024
Box-Muller method: 
Elapsed time: 2.270 ms
Minimum: -1.1728354576573263
Maximum: 11.169478925333504
Average: 5.010078819562535

As we can see on N = 1000 it fits almost perfectly.

BUT, on N = 10 Box-Muller goes crazy: it allows such tests outcomes, where i can get (quite rarely, but it is possible) 11.17 heads from 10 coin tosses! :)

No doubt, i am doing something wrong. But what exactly?

There is source of test, and link to launch it

Updated x2: it seems, previously I didn't describe problem well. There is general version of it:

How to get sample mean of N uniform random values (either discrete or continuous) in amortized constant time. Gaussian distribution is efficient for large N, but is there a way to make it work good on small N? Or on which exactly N, solution should switch from Gaussian method to some other (for exampled straightforward).


Solution

  • Math says, that mean of N coin tosses is N/2 and variance is N/4.

    Math only says that if it's a fair coin. And there's no way the solution doesn't depend on N.

    Assuming all tosses are independent of each other, for exact behaviors use a binomial distribution rather than a normal distribution. Binomial has two parameters: N is the number of coin tosses, and p is the probability of getting heads (or tails if you prefer). In pseudocode...

    function binomial(n, p) {
      counter = 0
      successes = 0
      while counter < n {
        if Math.random() <= p
           successes += 1
        counter += 1
      }
      return successes
    }
    

    There are faster algorithms for large N, but this is straightforward and mathematically correct.