I want to simulate a probalility problem, But the code comes up RunTimeError
There are k people in a room. Assume each person’s birthday is equally likely to be any of the 365 days of the year (we exclude February 29), and that people’s birthdays are independent (we assume there are no twins in the room). What is the probability that two or more people in the group have the same birthday?
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import factorial
N = 365
k = np.arange(1,41)
def fun(k):
return 1 - factorial(N)/N**k/factorial(N-k)
plt.plot(k, fun(k))
I want to simulate the probablity problem sucessfully
From Calculating the Birthday Paradox in SciPy we are warned that moderately large values of N and k will cause the formula to overflow if implemented directly, thus causing a run time error for your function fun. The trick is to use logarithms to avoid overflow or underflow.
SciPy doesn’t have a log factorial function, but does have a log gamma function, so we use that instead.
Modified Code
import numpy as np
import matplotlib.pyplot as plt
from numpy.lib.scimath import log
from numpy import exp
from scipy.special import gammaln
def fun(k):
# Use log of gamma for log of factorial
# then inverse log (i.e. exponentiate)
return 1 - exp( gammaln(N+1) - gammaln(N-k+1) - k*log(N) )
# replaces 1 - factorial(N)/N**k/factorial(N-k)
N = 365
k = np.arange(1, 41)
# Display
plt.plot(k, fun(k))
plt.show()