I want to calculate the % according to the total of the entire table, not just row or column in a pandas table with python. I can do by row and by column separately, but i rather to do the entire table. Any suggestions are welcome. Thanks so much
data
df1 = pd.DataFrame(data={'id': [1,2,3,4,5,6,7,8,9,10], 'place': [1,1,2,2,2,1,1,2,1,1], 'band': [1,2,3,3,3,2,1,2,3,1], 'status': [1,2,2,1,1,1,1,2,1,2]})
d1={1: 'north', 2: 'south'}
d2={1: '10-20', 2: '30-40', 3: '20-30'}
d3={1: 'green', 2: 'red'}
df1['place']=df1['place'].map(d1).fillna('Other')
df1['band']=df1['band'].map(d2).fillna('Other')
df1['status']=df1['status'].map(d3).fillna('Other')
commands per row/col
pd.crosstab(df1.band, [df1.place, df1.status]).apply(lambda r: r/r.sum(), axis=1).round(2) # by row
pd.crosstab(df1.band, [df1.place, df1.status]).apply(lambda r: r/r.sum(), axis=0).round(2) # by column
If you do not have to do it in a single apply
, you can just divide your cross-table with the total sum.
ctdf = pd.crosstab(df1.band, [df1.place, df1.status])
ctdf / ctdf.sum().sum()
>>>
place north south
status green red green red
band
10-20 0.2 0.1 0.0 0.0
20-30 0.1 0.0 0.2 0.1
30-40 0.1 0.1 0.0 0.1
Another option (by @Shubham Sharma's comment) is to use ctdf.values.sum()
Finally, you can also use pipe
(instead of apply
) in a single method chain:
pd.crosstab(df1.band, [df1.place, df1.status]).pipe(lambda x: x / x.values.sum())