Search code examples
pythonruntime-errorsimulationprobabilityfactorial

How to use python to simulate probability problems with factorial


I want to simulate a probalility problem, But the code comes up RunTimeError

There are k people in a room. Assume each person’s birthday is equally likely to be any of the 365 days of the year (we exclude February 29), and that people’s birthdays are independent (we assume there are no twins in the room). What is the probability that two or more people in the group have the same birthday?

import numpy as np
import matplotlib.pyplot as plt
from scipy.special import factorial

N = 365
k = np.arange(1,41)

def fun(k):
    return 1 - factorial(N)/N**k/factorial(N-k)

plt.plot(k, fun(k))

I want to simulate the probablity problem sucessfully


Solution

  • From Calculating the Birthday Paradox in SciPy we are warned that moderately large values of N and k will cause the formula to overflow if implemented directly, thus causing a run time error for your function fun. The trick is to use logarithms to avoid overflow or underflow.

    SciPy doesn’t have a log factorial function, but does have a log gamma function, so we use that instead.

    Modified Code

    import numpy as np
    import matplotlib.pyplot as plt
    
    from numpy.lib.scimath import log
    from numpy import exp
    from scipy.special import gammaln
    
    def fun(k):
      # Use log of gamma for log of factorial 
      # then inverse log (i.e. exponentiate)
      return 1 - exp( gammaln(N+1) - gammaln(N-k+1) - k*log(N) )
      # replaces 1 - factorial(N)/N**k/factorial(N-k)
      
    N = 365
    k = np.arange(1, 41)
    
    # Display
    plt.plot(k, fun(k))
    plt.show()
    

    Result enter image description here