I have the following pandas pivot table:
table = pd.pivot_table(df, index = 'group', columns = 'levels',
values = 'uid', aggfunc = 'count', margins = True)
There are three levels (level 1, level 2, and level 3).
Basically I have individuals that progress through these levels, but I want to know what percentage of people make it from one level to the next.
I'm thinking I need two transformations:
Any advice is much appreciated. Thank you!
Initial table:
Groups | Level 1 | Level 2 | Level 3 |
---|---|---|---|
GroupA | 5 | 6 | 3 |
GroupB | 3 | 2 | 4 |
Want to end up with table:
Groups | Level 1 | Level 2 | Level 3 |
---|---|---|---|
GroupA | 100% | 64% | 33% |
GroupB | 100% | 67% | 67% |
According to your logic, you can just take the cumsum in reverse order by the rows, then divide:
d = df.iloc[:,::-1].cumsum(axis=1).iloc[:,::-1]
d.div(d.shift(axis=1)).fillna(1)
output:
Level 1 Level 2 Level 3
Groups
GroupA 1.0 0.642857 0.333333
GroupB 1.0 0.666667 0.666667