I have the following boolean DataFrame in pandas:
      Cat  Dog  Mouse
Alex    1    0      1
Lola    0    0      1
Bob     1    1      1
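Each cell contains true/false, indicating whether that person has that animal. I would like to get a DataFrame containing the conditional probability for each pair of animals, where the row is the condition (for example, the Cat row, Dog column holds the probability of having a dog given that the person has a cat):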
       Cat  Dog  Mouse
Cat      1  50%      1
Dog      1    1      1
Mouse  66%  33%      1
Is there a fast way of doing this in pandas? If so, how?
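You can take the dot product of the transposed df with df, which gives the pairwise co-occurrence counts, and then rank each row as a percentage. (This reproduces the expected table for this data, but a dense rank is not a probability in general; for arbitrary data, divide each row of the counts by its diagonal entry.)
df.T.dot(df).rank(axis=1, method='dense', pct=True).round(3)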
         Cat    Dog  Mouse
Cat    1.000  0.500    1.0
Dog    1.000  1.000    1.0
Mouse  0.667  0.333    1.0
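As a cross-check that does not rely on ranking, here is a minimal sketch that computes the conditional probabilities directly: the co-occurrence count of each pair is divided by the number of owners of the row's animal, which sits on the diagonal of the count matrix. The variable names (`counts`, `co`, `owners`, `cond_prob`) are illustrative and not part of the original answer, and the `astype(int)` cast is an assumption added so that a DataFrame with real `bool` dtype is counted rather than combined logically.

```python
import numpy as np
import pandas as pd

# Sample data from the question (1 = the person has the animal).
df = pd.DataFrame(
    {"Cat": [1, 0, 1], "Dog": [0, 0, 1], "Mouse": [1, 1, 1]},
    index=["Alex", "Lola", "Bob"],
)

# Assumption: cast to int so that bool input yields counts in the dot product.
counts = df.astype(int)

# co.loc[a, b] = number of people who have both a and b;
# the diagonal co.loc[a, a] is the number of people who have a.
co = counts.T.dot(counts)

# Divide every row by its own diagonal entry: P(column animal | row animal).
owners = pd.Series(np.diag(co.to_numpy()), index=co.index)
cond_prob = co.div(owners, axis=0)

print(cond_prob.round(3))
```

On the sample data this should agree with the table above. If some animal has no owners at all, its row would be 0/0 and come out as NaN, so that case may need separate handling.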