I would like stack 2 crosstabs for 2 categorical variables (gender, race) while preserving the same binary variable (goal) as columns.
Take for example this dataset:
df = pd.DataFrame({'GenderDSC':['Male','Female','Female','Male','Male','Male'],
'Goal':[0,1,0,1,0,0],
'Race':['African-American','White','White','Asian','Asian','White']})
df
Where crosstabs would output:
But would like them to be merged like:
So far, I have tried:
a = pd.crosstab(df['GenderDSC'],df['Goal'])
b = pd.crosstab(df['Race'],df['Goal'])
pd.concat([a,b])
But I lose the groups for each columns (genderdsc, race), so thinking there could be a better way to connect them.
Use the argument keys
from pd.concat
>>> pd.concat([a,b], keys=('GenderDSC', 'Race'))
From the docs,
If multiple levels passed, should contain tuples. Construct hierarchical index using the passed keys as the outermost level.
Goal 0 1
GenderDSC Female 1 1
Male 3 1
Race African-American 1 0
Asian 1 1
White 2 1