Search code examples
pythonprobabilitymontecarlo

Q: Expected number of coin tosses to get N heads in a row, in Python. My code gives answers that don't match published correct ones, but unsure why


I'm trying to write Python code to see how many coin tosses, on average, are required to get a sequences of N heads in a row.

The thing that I'm puzzled by is that the answers produced by my code don't match ones that are given online, e.g. here (and many other places) https://math.stackexchange.com/questions/364038/expected-number-of-coin-tosses-to-get-five-consecutive-heads

According to that, the expected number of tosses that I should need to get various numbers of heads in a row are: E(1) = 2, E(2) = 6, E(3) = 14, E(4) = 30, E(5) = 62. But I don't get those answers! For example, I get E(3) = 8, instead of 14. The code below runs to give that answer, but you can change n to test for other target numbers of heads in a row.

What is going wrong? Presumably there is some error in the logic of my code, but I confess that I can't figure out what it is.

You can see, run and make modified copies of my code here: https://trinket.io/python/17154b2cbd

Below is the code itself, outside of that runnable trinket.io page. Any help figuring out what's wrong with it would be greatly appreciated!

Many thanks,

Raj P.S. The closest related question that I could find was this one: Monte-Carlo Simulation of expected tosses for two consecutive heads in python However, as far as I can see, the code in that question does not actually test for two consecutive heads, but instead tests for a sequence that starts with a head and then at some later, possibly non-consecutive, time gets another head.

# Click here to run and/or modify this code:
# https://trinket.io/python/17154b2cbd

import random
# n is  the target number of heads in a row
# Change the value of n, for different target heads-sequences
n = 3  
possible_tosses = [ 'h', 't' ]
num_trials = 1000
target_seq = ['h' for i in range(0,n)]
toss_sequence = []
seq_lengths_rec = []

for trial_num in range(0,num_trials):

    if (trial_num % 100) == 0:
        print 'Trial num', trial_num, 'out of', num_trials
        # (The free version of trinket.io uses Python2)

    target_reached = 0
    toss_num = 0

    while target_reached == 0:

        toss_num += 1
        random.shuffle(possible_tosses)
        this_toss = possible_tosses[0]
        #print([toss_num, this_toss])
        toss_sequence.append(this_toss)
        last_n_tosses = toss_sequence[-n:]
        #print(last_n_tosses)
    if last_n_tosses == target_seq:
        #print('Reached target at toss', toss_num)
        target_reached = 1
        seq_lengths_rec.append(toss_num)

print 'Average', sum(seq_lengths_rec) / len(seq_lengths_rec)

Solution

  • You don't re-initialize toss_sequence for each experiment, so you start every experiment with a pre-existing sequence of heads, having a 1 in 2 chance of hitting the target sequence on the first try of each new experiment.

    Initializing toss_sequence inside the outer loop will solve your problem:

    import random
    # n is  the target number of heads in a row
    # Change the value of n, for different target heads-sequences
    n = 4
    possible_tosses = [ 'h', 't' ]
    num_trials = 1000
    target_seq = ['h' for i in range(0,n)]
    seq_lengths_rec = []
    
    for trial_num in range(0,num_trials):
    
        if (trial_num % 100) == 0:
            print('Trial num {} out of {}'.format(trial_num, num_trials))
            # (The free version of trinket.io uses Python2)
    
        target_reached = 0
        toss_num = 0
        toss_sequence = []
    
        while target_reached == 0:
    
            toss_num += 1
            random.shuffle(possible_tosses)
            this_toss = possible_tosses[0]
            #print([toss_num, this_toss])
            toss_sequence.append(this_toss)
            last_n_tosses = toss_sequence[-n:]
            #print(last_n_tosses)
            if last_n_tosses == target_seq:
                #print('Reached target at toss', toss_num)
                target_reached = 1
                seq_lengths_rec.append(toss_num)
    
    print(sum(seq_lengths_rec) / len(seq_lengths_rec))
    

    You can simplify your code a bit, and make it less error-prone:

    import random
    # n is  the target number of heads in a row
    # Change the value of n, for different target heads-sequences
    n = 3
    possible_tosses = [ 'h', 't' ]
    num_trials = 1000
    seq_lengths_rec = []
    
    for trial_num in range(0, num_trials):
    
        if (trial_num % 100) == 0:
            print('Trial num {} out of {}'.format(trial_num, num_trials))
            # (The free version of trinket.io uses Python2)
    
        heads_counter = 0
        toss_counter = 0
    
        while heads_counter < n:
            toss_counter += 1
    
            this_toss = random.choice(possible_tosses)
    
            if this_toss == 'h':
                heads_counter += 1
            else:
                heads_counter = 0
        seq_lengths_rec.append(toss_counter)
    
    
    print(sum(seq_lengths_rec) / len(seq_lengths_rec))