Search code examples
pythonnumpystochasticstochastic-process

Systematic error in Python stochastic simulation


I want to simulate a simple birth death process using the Gillespie algorithm (https://en.wikipedia.org/wiki/Gillespie_algorithm) in python.

At each instant, there is a probability a of birth and a probability b of death per indiviudal. I believe the following code should simulate the dynamics of the population n:

import numpy as np 
a = 10.0 # birth rate
b = 1.0 # death rate
n = 0.0 # initial population
t = 0.0 # initial time
T = [] # times of births or deaths
N = [] # population timeseries
for _ in range(1000000): # do 1000000 iterations
    P = [a, b*n] # rate of birth and death
    P0 = sum(P) # total rate
    # step the time
    t = t + (1/P0)*np.log(1/np.random.random())
    # select the transition
    R = P[0]/P0 # probability of birth
    r = np.random.random() # choose a random number
    # enact the transition
    if r<R: # birth
        n = n+1
    elif r>=R: # death 
        n = n-1
    N.append(n) # update the output lists
    T.append(t)

This has the intended behavior: plotting N versus T shows a random population with well-defined statistical patterns. However, I have a serious source of confusion.

The analytical solution of this model says the mean value of N should be a/b, while this simulation consistently overshoots -- this error is systematic and is preserved for any choice of a and b. Increasing the number of iterations does not reduce this systematic error. I compute it as

(sum(N)/len(N)-a/b)/(a/b)*100 # error in the mean value in percent

which always returns at least 10%.

What am I missing here? What is the source of this systematic error in the mean of my simulation? Am I misinterpreting np.random somehow? There has to be an issue with my code as the error should scale as 1/sqrt(# iterations), otherwise.


Solution

  • Do you need to take time into account? I’m not really familiar with this stuff but this simplified version at least gives results that get closer to zero with time.

    import random
    
    TIME = 100000.0
    
    a = 10.0  # birth rate
    b = 1.0   # death rate
    n = 0     # initial population
    t = 0.0   # initial time
    
    t_average_pop = 0.0
    
    while True:
        P0 = a + b * n
    
        # step the time
        step = random.expovariate(P0)
        t += step
        t_average_pop += n * step
    
        if t >= TIME:
            break
    
        # select the transition
        if random.uniform(0, P0) < a:
            # birth
            n += 1
        else:
            # death
            n -= 1
    
    average_pop = t_average_pop / t
    expected = a / b
    
    print((average_pop - expected) / expected * 100)