I have a df as follows:
   appid   month tag  totalvalue
0   1234  02-'22   B       50.00
1   1234  02-'22  BA       10.00
2   1234  01-'22   B      100.00
3   2345  03-'22  BA       25.00
4   2345  03-'22   B      100.00
5   2345  04-'22  BB      100.00
The output I want is as follows:
   appid   month tag  totalvalue  %tage
0   1234  02-'22   B       50.00    1.0
1   1234  02-'22  BA       10.00    0.2
2   1234  01-'22   B      100.00    1.0
3   2345  03-'22  BA       25.00   0.25
4   2345  03-'22   B      100.00    1.0
5   2345  04-'22  BB      100.00    inf
I want to group the rows by appid and month. Within each group, if a row with tag=B is present, divide every row's totalvalue by that B row's totalvalue. If the group has no tag=B row, show inf.
I have tried df.groupby(['appid', 'month'])['totalvalue'], but I am unable to use the totalvalue of the tag=B row as the denominator on the groupby object.
IIUC, you can use groupby.transform('first') on the masked totalvalue, then use the result as the divisor:
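# True on the rows whose tag is exactly 'B'
m = df['tag'].eq('B')
# Broadcast each group's 'B' totalvalue to all of its rows (0 if the group has none)
df['%tage'] = (df['totalvalue']
               .div(df['totalvalue'].where(m)
                    .groupby([df['appid'], df['month']])
                    .transform('first').fillna(0))
              )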
output:
   appid   month tag  totalvalue  %tage
0   1234  02-'22   B        50.0   1.00
1   1234  02-'22  BA        10.0   0.20
2   1234  01-'22   B       100.0   1.00
3   2345  03-'22  BA        25.0   0.25
4   2345  03-'22   B       100.0   1.00
5   2345  04-'22  BB       100.0    inf
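
To run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and splits the one-liner into separate steps. The intermediate names masked and denom are only illustrative and are not part of the snippet above.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "appid": [1234, 1234, 1234, 2345, 2345, 2345],
    "month": ["02-'22", "02-'22", "01-'22", "03-'22", "03-'22", "04-'22"],
    "tag": ["B", "BA", "B", "BA", "B", "BB"],
    "totalvalue": [50.0, 10.0, 100.0, 25.0, 100.0, 100.0],
})

# Keep totalvalue only on the tag == 'B' rows; every other row becomes NaN.
m = df["tag"].eq("B")
masked = df["totalvalue"].where(m)

# 'first' picks the first non-null value in each (appid, month) group and
# transform broadcasts it to every row of that group. A group without a
# 'B' row stays NaN, which fillna(0) turns into 0.
denom = masked.groupby([df["appid"], df["month"]]).transform("first").fillna(0)

# Dividing a positive float by 0 gives inf in pandas.
df["%tage"] = df["totalvalue"].div(denom)
print(df)
```

Two caveats, as far as I can tell: if a group contains more than one tag=B row, 'first' uses the first one in row order, and a group with no B row whose totalvalue is 0 would give NaN (0/0) rather than inf.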