numpy simulation probability variance

Coin Toss: Frequency of trials with Expected Value out of n trials


I am simulating the probability of tossing tails in a game of 10 coin tosses, and running that game n times. For example:

n = 100, total_tosses = n * 10 = 1000

n = 1000, total_tosses = n * 10 = 10000

n = 100000, total_tosses = n * 10 = 1000000

I know the expected value of a coin toss is 0.5

Out of 10 tosses, I expect 5/10 tails

But simulating the 10 trials n times yields some interesting results, that I can't wrap my head around...

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()

# Returns a list of num_flips coin tosses for a single game (1 = tails, 0 = heads).
# e.g. [1, 0, ..., 1, 1], len = num_flips
def coin_game(num_flips):
    coin_tosses = []
    for x in range(num_flips):
        coin = np.random.randint(2)
        coin_tosses.append(coin)
    return coin_tosses

# Returns a 1d array with the total number of tails in each of the num_sims games.
# e.g. [3, 5, 2, ..., 8, 9, 1], len = num_sims
def run_sims(num_sims):
    num_tails = []
    for sim in range(num_sims):
        coin_tosses = coin_game(10)
        num_tails.append(sum(coin_tosses))
    return np.array(num_tails)

# ---Main---
num_trials = 10000
all_tails = run_sims(num_trials)
sns.countplot(x=all_tails)  # pass the data as x=; recent seaborn versions do not accept it positionally
plt.show()

What is the relationship between the total number of trials and the frequency of trials that show the expected value, i.e. exactly 5 of the 10 coin tosses are tails?

For 1000 trials: About 250 trials have 5/10 tails

For 10000 trials: About 2500 trials have 5/10 tails

For 100000 trials: About 25000 trials have 5/10 tails

What is causing this behavior?

Roughly, why is freq(5/10 tails) = n/4?


Solution

  • This is basic probability, specifically a binomial distribution. There are 2^10 = 1024 equally likely outcomes, and 252 of them are "successes" (contain exactly 5 tails). That's why roughly n/4 of your trials show 5 tails.

    252 / 2^10 = 252 / 1024 ≈ 0.246

    In a more general sense, you can solve this using the following formula:

    P(X = k) = (n! / (k! * (n - k)!)) * p^k * (1 - p)^(n - k)

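    Where n is the number of trials (here the 10 tosses in one game, not the number of simulated games), k is the number of successes (5 tails), and p is the probability of success on a single toss (1/2).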

    For your question, this works out to be:

    (10! / (5! * (10 - 5)!)) * (1 / 2)^5 * (1 - 1/2)^5 = 252 / 1024 = 0.24609375
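
As a rough check of the numbers above, the sketch below computes the same probability three ways: from the binomial formula, by brute-force counting of all 2^10 toss sequences, and by simulation. The names binomial_pmf, favourable, and num_games are made up for this illustration, and the simulation uses a vectorized NumPy generator rather than the loops from the question, so treat it as one possible way to verify the result rather than the original poster's method.

```python
import math
from itertools import product

import numpy as np


def binomial_pmf(n_tosses, k_tails, p=0.5):
    """Probability of exactly k_tails tails in n_tosses independent tosses."""
    return math.comb(n_tosses, k_tails) * p**k_tails * (1 - p) ** (n_tosses - k_tails)


# 1) Exact value from the binomial formula.
exact = binomial_pmf(10, 5)  # 0.24609375

# 2) Brute force: enumerate every sequence of 10 tosses (1 = tails)
#    and count the sequences that contain exactly 5 tails.
favourable = sum(1 for seq in product((0, 1), repeat=10) if sum(seq) == 5)  # 252

# 3) Simulation: each row is one game of 10 tosses (seed fixed for repeatability).
rng = np.random.default_rng(seed=0)
num_games = 100_000
tails_per_game = rng.integers(0, 2, size=(num_games, 10)).sum(axis=1)
observed = np.mean(tails_per_game == 5)

print(f"formula:    {exact}")
print(f"counting:   {favourable} / {2**10} = {favourable / 2**10}")
print(f"simulation: {observed:.4f} over {num_games} games")
```

The simulated fraction should hover near 0.246 whatever num_games is, which is why the count of games with 5 tails scales as roughly n/4.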