I have a bucket of tennis balls (2) and baseballs (22), for a total of 24 balls in the bin.
I am going to pull out a total of 12 balls at random, without putting any back.
After pulling out all 12 balls, I want to know the probability of each of 3 scenarios:
1.) I pull out both (2) tennis balls
2.) I pull out 0 tennis balls
3.) I pull out only 1 tennis ball
Obviously the probabilities for all 3 of these scenarios have to add up to 1, or 100%.
Thank you.
When you sample without replacement, the number of tennis balls drawn follows a hypergeometric distribution. So let's use hypergeom from scipy in Python:
import numpy as np
from scipy.stats import hypergeom
import seaborn as sns
# M is the total pool (24 balls), n is the number of successes (2 tennis balls), N is the number of draws (12)
M, n, N = 24, 2, 12
rv = hypergeom(M, n, N)
# the range of values we are interested in: 0, 1 or 2 tennis balls
x = np.arange(0, n + 1)
pmf_tballs = rv.pmf(x)
The probabilities for 0, 1 and 2 tennis balls:
pmf_tballs
array([0.23913043, 0.52173913, 0.23913043])
sns.barplot(x=x, y=pmf_tballs, color="b")
You can also calculate it by brute force, enumerating every possible draw of 12 balls:
import itertools
# 1 marks a tennis ball, 0 marks a baseball: 2 tennis balls + 22 baseballs = 24 balls
balls = [1] * 2 + [0] * 22
draws = itertools.combinations(balls, 12)
counts = {0: 0, 1: 0, 2: 0}
for draw in draws:
    # sum(draw) is the number of tennis balls in this draw
    counts[sum(draw)] += 1
You get a tally for 0, 1 and 2 tennis balls:
counts
{0: 646646, 1: 1410864, 2: 646646}
And the probabilities are the same as above with the hypergeometric distribution (rounded to 8 decimals for comparison):
[round(c / sum(counts.values()), 8) for c in counts.values()]
[0.23913043, 0.52173913, 0.23913043]
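If you want exact fractions for the bucket in the question (24 balls, 2 of them tennis balls, 12 drawn), the hypergeometric formula P(k) = C(2, k) * C(22, 12 - k) / C(24, 12) can be evaluated directly with the standard library. This is only a minimal sketch: the helper name tennis_pmf and its parameter names are my own, not part of scipy.

```python
from fractions import Fraction
from math import comb


def tennis_pmf(k, total=24, tennis=2, draws=12):
    """Exact probability of drawing exactly k tennis balls (hypothetical helper)."""
    # ways to choose k tennis balls times ways to choose the remaining baseballs
    ways = comb(tennis, k) * comb(total - tennis, draws - k)
    # divided by the number of ways to choose any 12 of the 24 balls
    return Fraction(ways, comb(total, draws))


# probabilities of pulling out 0, 1 and 2 tennis balls
probs = [tennis_pmf(k) for k in range(3)]
for k, p in enumerate(probs):
    print(k, p)
print("total:", sum(probs))
```

This gives 11/46 for 0 tennis balls, 12/23 for exactly 1, and 11/46 for both, and the three add up to exactly 1. Pulling 0 and pulling both are equally likely here because you draw exactly half of the bucket.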