
How to find the probability of 3 scenarios


I have a bin containing 2 tennis balls and 22 baseballs, for a total of 24 balls.

I want to know what the probability is for 3 scenarios.

Each time I am going to pull out a total of 12 balls at random.

After pulling out all 12 balls, what is the probability that:

1.) I pull out both (2) tennis balls?
2.) I pull out 0 tennis balls?
3.) I pull out only 1 tennis ball?

Obviously the probabilities for these 3 outcomes have to add up to 1, or 100%.

thank you


Solution

  • Sampling without replacement follows a hypergeometric distribution, so we can use hypergeom from scipy in Python. Note that the total pool is 24 balls (2 tennis balls + 22 baseballs):

    import numpy as np
    import seaborn as sns
    from scipy.stats import hypergeom

    # M is the total pool, n is the number of successes (tennis balls) in the pool,
    # N is the number of draws
    M, n, N = 24, 2, 12
    rv = hypergeom(M, n, N)
    # the range of values we are interested in: 0, 1 or 2 tennis balls
    x = np.arange(0, n + 1)
    pmf_tballs = rv.pmf(x)

    The probabilities for 0, 1 and 2 tennis balls:

    pmf_tballs
    array([0.23913043, 0.52173913, 0.23913043])

    sns.barplot(x=x, y=pmf_tballs, color="b")

    [Image: bar plot of the probabilities of drawing 0, 1 and 2 tennis balls]

    So the chance of pulling both tennis balls is about 23.9%, the chance of pulling none is also about 23.9%, and the chance of pulling exactly one is about 52.2%.

    You can also calculate it by brute force:

    import itertools

    # 1 = tennis ball, 0 = baseball
    balls = [1] * 2 + [0] * 22
    counts = {0: 0, 1: 0, 2: 0}
    # enumerate all C(24, 12) = 2,704,156 possible draws of 12 balls
    for draw in itertools.combinations(balls, 12):
        counts[sum(draw)] += 1

    You get a tally for 0, 1 and 2 tennis balls:

    counts
    {0: 646646, 1: 1410864, 2: 646646}

    And the probabilities are the same as above with the hypergeometric distribution:

    total = sum(counts.values())
    [round(c / total, 8) for c in counts.values()]
    [0.23913043, 0.52173913, 0.23913043]
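
If you would rather not depend on scipy, the closed form can be evaluated directly. For the bin in the question (24 balls, 2 of them tennis balls, 12 drawn), P(k tennis balls) = C(2, k) * C(22, 12 - k) / C(24, 12). Below is a minimal sketch using only the standard library; the function name tennis_ball_pmf and its default arguments are my own choices for illustration, not part of any library.

```python
from fractions import Fraction
from math import comb


def tennis_ball_pmf(total=24, tennis=2, draws=12):
    """Exact P(k tennis balls among the draws), for k = 0..tennis.

    Hypothetical helper; the defaults assume the bin from the question.
    """
    all_draws = comb(total, draws)  # number of equally likely draws
    return {
        # ways to pick k tennis balls times ways to pick the remaining baseballs
        k: Fraction(comb(tennis, k) * comb(total - tennis, draws - k), all_draws)
        for k in range(tennis + 1)
    }


pmf = tennis_ball_pmf()
for k, p in pmf.items():
    print(f"{k} tennis balls: {p} = {float(p):.4f}")
print("sum:", sum(pmf.values()))
```

Working with Fraction keeps the results exact (11/46, 12/23 and 11/46), which makes it easy to confirm that the three scenarios add up to exactly 1.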