probability, probability-distribution

How to decide the probability percentage in question


I have the question below (the original image is not reproduced here).

In the first part of the question, it says the probability that the selected person will be a male is 0.44, which means the number of males is 25 * 0.44 = 11. That's OK.

In the second part, the probability that the selected person will be a male who was born before 1960 is 0.28. Does that mean 0.28 of the total number, which is 25, or of the number of males? I mean, should the number of males who were born before 1960 equal 25 * 0.28 or 11 * 0.28?


Solution

  • I find it easiest to think of these sorts of problems as contingency tables. You use a matrix layout to express the distributions in terms of two or more factors or characteristics, each having two or more categories. The table can be constructed either with probabilities (proportions) or with counts, and switching back and forth is easy based on the total count in the table. Entries in the table are the intersections of the categories, corresponding to "and" in a verbal description. The numbers to the right or at the bottom of the table are called marginals, because they're found in the margins of the table, and each is always the sum of the entries in the row or column in which it occurs. The total probability (or count) in the table is found by summing across all the rows and columns. The marginal distribution of gender would be found by summing across rows, and the marginal distribution of birthdays would be found by summing across the columns.

    In your question, 0.28 is the probability of "male and born before 1960", which is an intersection, so it is a proportion of all 25 people: 25 * 0.28 = 7, not 11 * 0.28. From the given values you can infer the other values shown in parentheses below. With one more entry, either in the F row or in the marginal row for birthdays, you'd be able to fill in the whole table by inference. (This is related to the concept of degrees of freedom: how many pieces of information you can fill in independently before the others are determined by the known constraint that the totals are fixed or that the probabilities add to 1.)

    Probabilities
    
                Birthday
            < 1960 | >= 1960
       _______________________
    G    |         |          |
    e  F |         |          | (0.56)
    n  __|_________|__________|
    d    |         |          |
    e  M |   0.28  |  (0.16)  |  0.44
    r  __|_________|__________|______
              ?          ?    |  1.00
    
    
    Counts
    
                Birthday
            < 1960 | >= 1960
       _______________________
    G    |         |          |
    e  F |         |          | (14)
    n  __|_________|__________|
    d    |         |          |
    e  M |    7    |    (4)   |  11
    r  __|_________|__________|_____
              ?          ?    |  25
    

    Conditional probability corresponds to limiting yourself to the subset of rows or columns specified in the condition. If you had been asked for the probability of a birthday < 1960 given that the gender is male, i.e., P{birthday < 1960 | M} in relatively standard notation, you'd be restricting your focus to just the M row, so the answer would be 7/11 = 0.28/0.44, or about 0.636. Computationally, you take the probabilities or counts in the qualifying table entries and express them as a proportion of the probability or count in the specified (given) marginal entry. This is often written in prob & stats texts as P(A|B) = P(AB)/P(B), where AB is set shorthand for A and B (intersection).
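
The arithmetic behind the two tables and the conditional probability can be checked with a few lines of Python. This is only a minimal sketch: the variable names are made up for illustration, and it assumes the figures stated in the question (25 people, P(M) = 0.44, P(M and birthday < 1960) = 0.28). Exact fractions are used instead of floats so the counts come out as whole numbers.

```python
from fractions import Fraction

# Values taken from the question; the names are illustrative only.
total = 25                            # grand total of the counts table
p_male = Fraction(44, 100)            # marginal: P(M)
p_male_pre1960 = Fraction(28, 100)    # joint: P(M and birthday < 1960)

# Joint and marginal probabilities are proportions of the whole table,
# so both are converted to counts by multiplying by the grand total.
n_male = p_male * total                   # 11
n_male_pre1960 = p_male_pre1960 * total   # 7

# Entries inferred from the fixed totals (the parenthesized values).
n_female = total - n_male                 # 14
n_male_1960_on = n_male - n_male_pre1960  # 4

# Conditional probability: restrict attention to the M row only.
p_pre1960_given_male = p_male_pre1960 / p_male   # 7/11

print(n_male, n_male_pre1960, n_female, n_male_1960_on)
print(p_pre1960_given_male)
```

The same conditional probability comes out whether you divide the probabilities (0.28/0.44) or the counts (7/11), because the grand total cancels.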