I want to compute binomial probabilities on python. I tried to apply the formula:
probability = scipy.misc.comb(n,k)*(p**k)*((1-p)**(n-k))
Some of the probabilities I get are infinite. I checked some values for which p=inf. For one of them, n=450,000 and k=17. This value must be greater than 1e302 which is the maximum value handled by floats.
I then tried to use sum(np.random.binomial(n,p,numberOfTrials)==valueOfInterest)/numberOfTrials
This draws numberOfTrials samples and computes the average number of times the value valueOfInterest is drawn.
This doesn't raise any infinite value. However, is this a valid way to proceed? And why this way wouldn't raise any infinite value whereas computing the probabilities does?
Work in the log domain to compute combination and exponentiation functions and then raise them to exponent.
Something like this:
combination_num = range(k+1, n+1)
combination_den = range(1, n-k+1)
combination_log = np.log(combination_num).sum() - np.log(combination_den).sum()
p_k_log = k * np.log(p)
neg_p_K_log = (n - k) * np.log(1 - p)
p_log = combination_log + p_k_log + neg_p_K_log
probability = np.exp(p_log)
Gets rid of numeric underflow/overflow because of large numbers. On your example with n=450000
and p = 0.5, k = 17
, it returns p_log = -311728.4
, i. e., the log of final probability is pretty small and hence underflow occurs while taking np.exp
. However, you can still work with log probability.