I want to create a contingency table in pandas. The code below works, but is there a built-in pandas function that does this for me?
For a reproducible example:
toy_data, serialized as JSON:
'{"Light":{"321":"no_light","476":"night_light","342":"lamp","454":"lamp","25":"night_light","53":"night_light","120":"night_light","346":"night_light","360":"lamp","55":"no_light","391":"night_light","243":"no_light","101":"night_light","377":"night_light","124":"no_light","368":"lamp","400":"no_light","247":"night_light","270":"lamp","208":"night_light"},"Nearsightedness":{"321":"No","476":"Yes","342":"Yes","454":"Yes","25":"No","53":"Yes","120":"Yes","346":"No","360":"No","55":"Yes","391":"Yes","243":"No","101":"No","377":"Yes","124":"No","368":"No","400":"No","247":"No","270":"Yes","208":"No"}}'
toy_data.head()
            Light Nearsightedness
321      no_light              No
476   night_light             Yes
342          lamp             Yes
454          lamp             Yes
25    night_light              No
df = pd.DataFrame(toy_data.groupby(['Light', 'Nearsightedness']).size())
df = df.unstack('Nearsightedness')
df.columns = df.columns.droplevel()
df
Nearsightedness  No  Yes
Light
lamp              2    3
night_light       5    5
no_light          4    1
pd.crosstab will do the trick:
pd.crosstab(toy_data['Light'], toy_data['Nearsightedness'])
Output:
Nearsightedness  No  Yes
Light
lamp              2    3
night_light       5    5
no_light          4    1
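For comparison, here is a minimal, self-contained sketch that puts pd.crosstab next to the groupby/unstack route. It uses a six-row stand-in for toy_data (rows copied from the question, chosen so that one Light/Nearsightedness combination never occurs). The variable names table, manual, with_totals and row_share are only illustrative.

```python
import pandas as pd

# Six-row stand-in for toy_data (rows taken from the question's data).
toy_data = pd.DataFrame(
    {
        "Light": ["no_light", "night_light", "lamp", "lamp", "night_light", "no_light"],
        "Nearsightedness": ["No", "Yes", "Yes", "Yes", "No", "Yes"],
    },
    index=[321, 476, 342, 454, 25, 55],
)

# One call builds the contingency table; combinations that never occur get 0.
table = pd.crosstab(toy_data["Light"], toy_data["Nearsightedness"])

# The manual route: without fill_value=0, the missing (lamp, No) cell would be NaN.
manual = (
    toy_data.groupby(["Light", "Nearsightedness"])
    .size()
    .unstack("Nearsightedness", fill_value=0)
)

# margins=True appends row and column totals under the label "All".
with_totals = pd.crosstab(
    toy_data["Light"], toy_data["Nearsightedness"], margins=True
)

# normalize="index" turns counts into per-row proportions.
row_share = pd.crosstab(
    toy_data["Light"], toy_data["Nearsightedness"], normalize="index"
)

print(table)
print(with_totals)
print(row_share)
```

Besides being shorter, crosstab fills absent combinations with 0 by default, while the plain groupby/unstack version needs fill_value=0 to avoid NaN in those cells.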