Search code examples
excelprobabilityintersection

Counting data in Excel to calculate probabilities of intersections


I have the following problem and hope someone can give me a hint: I have an Excel sheet with three columns. In the first column I have a country code, in the second column I have a sector code (~50 sector codes per country and more than 30 countries). The third column includes a 0/1-Dummy. I would like to know the probability that the Dummy is one for sector 1 AND sector 2 (intersection). For that I need to know how often a 1 occurs in sector 1 and in sector 2.

The final output should be a conditional probability, and I think calculating it with the well know formula P(A|B)=P(intersection A and B)/P(B) is the easiest way - however, if there are easier ways to calculate the conditional probability, I would be very grateful as well.

In a simplified version the problem looks as follows, where I would like to know the probability that a AND b are 1:

screenshot of simplified table

Thanks in advance!


Solution

  • Just to get things started, I suggest you pivot the data first, then divide the number of rows with a=1 and b=1 by the number of rows (countries) in the table using

    =COUNTIFS(G3:G5,1,H3:H5,1)/COUNT(G3:G5)
    

    enter image description here