Calculating conditional probability with multiple (two or more) conditions in Python

Let's take the following df as an example:

import pandas as pd

df = pd.DataFrame({
    'start_point': ['Station_1', 'Station_2', 'Station_1', 'Station_3',
                    'Station_1', 'Station_1', 'Station_1', 'Station_3'],
    'end_point': ['Station_1', 'Station_2', 'Station_1', 'Station_2',
                  'Station_2', 'Station_3', 'Station_3', 'Station_1'],
    'period_of_day': ['morning', 'noon', 'morning', 'night',
                      'evening', 'night', 'morning', 'afternoon'],
    'day_of_week': ['0', '1', '2', '0', '1', '0', '1', '0'],
})

I would like to calculate the conditional probability of a trip ending at a given end_point, conditioned on one or more of the other columns.

I can do this when there is only one condition. For example, here is the probability of a trip ending at a given end_point, conditioned on day_of_week:

df.groupby('end_point')['day_of_week'].value_counts() / df.groupby('day_of_week')['end_point'].count()

end_point  day_of_week
Station_1  0              0.500000
           2              1.000000
Station_2  1              0.666667
           0              0.250000
Station_3  0              0.250000
           1              0.333333

However, I'm having a hard time calculating this probability when two or more conditions are involved. How can I also add period_of_day as a condition, for example?

Solution

To compute P(A|B), group by the condition column(s) B and take the normalized value counts of the outcome column A:

df.groupby([B])[A].value_counts(normalize=True)

For example:

df.groupby('day_of_week')['end_point'].value_counts(normalize=True)

day_of_week  end_point
0            Station_1    0.500000
             Station_2    0.250000
             Station_3    0.250000
1            Station_2    0.666667
             Station_3    0.333333
2            Station_1    1.000000
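
As a cross-check, the same P(end_point | day_of_week) values can also be laid out as a table with pd.crosstab. This is only a minimal sketch and not part of the original answer; the variable name table is chosen for illustration, and only the two columns needed here are rebuilt from the example data.

```python
import pandas as pd

# Only the two columns used in this example, copied from the question's df.
df = pd.DataFrame({
    'end_point': ['Station_1', 'Station_2', 'Station_1', 'Station_2',
                  'Station_2', 'Station_3', 'Station_3', 'Station_1'],
    'day_of_week': ['0', '1', '2', '0', '1', '0', '1', '0'],
})

# Rows are the condition (day_of_week), columns are the outcome (end_point).
# normalize='index' divides each row by its row total, so every row sums to 1.
table = pd.crosstab(df['day_of_week'], df['end_point'], normalize='index')
print(table)
```

Unlike value_counts, the table also shows combinations that never occur, as zeros, which can make missing cases easier to spot.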

For more than one condition, pass a list of columns to groupby:

df.groupby(['day_of_week', 'period_of_day'])['end_point'].value_counts(normalize=True)

day_of_week  period_of_day  end_point
0            afternoon      Station_1    1.0
             morning        Station_1    1.0
             night          Station_2    0.5
                            Station_3    0.5
1            evening        Station_2    1.0
             morning        Station_3    1.0
             noon           Station_2    1.0
2            morning        Station_1    1.0
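
To see what normalize=True is doing, the same numbers can be derived by hand as P(A | B, C) = count(A, B, C) / count(B, C). The sketch below is one possible way to write that, assuming the example data from the question; the names conditions, outcome, counts and cond_prob are illustrative and not from the original answer.

```python
import pandas as pd

# The three columns used in this example, copied from the question's df.
df = pd.DataFrame({
    'end_point': ['Station_1', 'Station_2', 'Station_1', 'Station_2',
                  'Station_2', 'Station_3', 'Station_3', 'Station_1'],
    'period_of_day': ['morning', 'noon', 'morning', 'night',
                      'evening', 'night', 'morning', 'afternoon'],
    'day_of_week': ['0', '1', '2', '0', '1', '0', '1', '0'],
})

conditions = ['day_of_week', 'period_of_day']  # B, C
outcome = 'end_point'                          # A

# count(B, C, A): number of trips for each condition/outcome combination.
counts = df.groupby(conditions + [outcome]).size()

# count(B, C): total trips per combination of conditions, repeated for
# every row of `counts` so the two Series share the same index.
totals = counts.groupby(level=conditions).transform('sum')

# P(A | B, C) = count(B, C, A) / count(B, C)
cond_prob = counts / totals
print(cond_prob)

# Look up a single probability. Note that day_of_week holds strings here.
p = cond_prob.loc[('0', 'night', 'Station_2')]
print(p)
```

With only eight rows, most condition combinations contain a single trip, which is why so many probabilities come out as 1.0; on real data it may be worth checking the group sizes before trusting the estimates.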