Input
cust_Id category product purchased
1 Elec light 0
1 Elec light 1
1 Elec light 0
1 HA Table 1
1 HH Pen 1
2 Elec light 0
2 HA Table 1
3 HH Pen 0
3 Elec light 1
I want to find the best (customer, category, product) combination, i.e. the one with the highest purchase probability.
Try this:
# Per group: number of purchases (sum of the 0/1 flag) divided by number of rows
grp = df.groupby(['cust_Id', 'category', 'product'])
prob = grp.sum() / grp.count()
The result is, for each combination of the 3 attributes, the fraction of rows with purchased = 1, i.e. the estimated probability that the combination ends in a purchase:
                          purchased
cust_Id category product
1       Elec     light     0.333333
        HA       Table     1.000000
        HH       Pen       1.000000
2       Elec     light     0.000000
        HA       Table     1.000000
3       Elec     light     1.000000
        HH       Pen       0.000000
The probability of not purchasing is simply the complement of that (i.e. 1 - prob).
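To get from these probabilities to the "best" combination asked about in the question, one option is to keep every group that reaches the maximum. This is only a sketch: it rebuilds the sample input by hand, and the names `best` and `not_purchased` are my own, not from the original post. It also uses `mean()` on the 0/1 column, which gives the same value as sum / count.

```python
import pandas as pd

# Rebuild the sample input from the question (typed in by hand).
df = pd.DataFrame({
    "cust_Id":   [1, 1, 1, 1, 1, 2, 2, 3, 3],
    "category":  ["Elec", "Elec", "Elec", "HA", "HH", "Elec", "HA", "HH", "Elec"],
    "product":   ["light", "light", "light", "Table", "Pen",
                  "light", "Table", "Pen", "light"],
    "purchased": [0, 1, 0, 1, 1, 0, 1, 0, 1],
})

# The mean of a 0/1 column equals sum / count, i.e. the purchase rate per group.
prob = df.groupby(["cust_Id", "category", "product"])["purchased"].mean()

# Complement: estimated probability of not purchasing.
not_purchased = 1 - prob

# Keep every combination that reaches the maximum, since ties are possible.
best = prob[prob == prob.max()]
print(best)
```

With this sample, four combinations tie at 1.0, so there is no single winner. `prob.idxmax()` would return only the first of them. Most of these groups contain a single row, so the estimates are rough; if that matters, you may want to look at the group sizes as well.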