Creating a Contingency table in Pandas


I want to create a contingency table in pandas. The code below works, but is there a built-in pandas function that does this for me?

For a reproducible example:

import pandas as pd
from io import StringIO

# toy_data serialized as JSON; one way to load it back into a DataFrame
toy_json = '{"Light":{"321":"no_light","476":"night_light","342":"lamp","454":"lamp","25":"night_light","53":"night_light","120":"night_light","346":"night_light","360":"lamp","55":"no_light","391":"night_light","243":"no_light","101":"night_light","377":"night_light","124":"no_light","368":"lamp","400":"no_light","247":"night_light","270":"lamp","208":"night_light"},"Nearsightedness":{"321":"No","476":"Yes","342":"Yes","454":"Yes","25":"No","53":"Yes","120":"Yes","346":"No","360":"No","55":"Yes","391":"Yes","243":"No","101":"No","377":"Yes","124":"No","368":"No","400":"No","247":"No","270":"Yes","208":"No"}}'
toy_data = pd.read_json(StringIO(toy_json))

toy_data.head()
           Light Nearsightedness
321     no_light              No
476  night_light             Yes
342         lamp             Yes
454         lamp             Yes
25   night_light              No

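# Count rows for each (Light, Nearsightedness) pair
df = pd.DataFrame(toy_data.groupby(['Light', 'Nearsightedness']).size())

# Move Nearsightedness from the row index into the columns
df = df.unstack('Nearsightedness')

# Drop the leftover outer column level (the unnamed 0 column from size())
df.columns = df.columns.droplevel()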

df
Nearsightedness  No  Yes
Light
lamp              2    3
night_light       5    5
no_light          4    1

Solution

  • pd.crosstab will do the trick. Pass it the two columns of the original DataFrame (toy_data in the question):

    pd.crosstab(toy_data.Light, toy_data.Nearsightedness)

    Output:

    Nearsightedness  No  Yes
    Light
    lamp              2    3
    night_light       5    5
    no_light          4    1
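
For comparison, here is a minimal sketch that puts pd.crosstab next to the groupby/unstack route from the question. The `sample` frame is a small made-up stand-in with the same two columns, not the question's data. The sketch also shows two optional crosstab parameters, `margins` and `normalize`, which may be useful if totals or proportions are wanted.

```python
import pandas as pd

# Hypothetical sample with the same columns as toy_data (not the question's data).
sample = pd.DataFrame({
    "Light": ["lamp", "lamp", "night_light", "night_light", "no_light", "no_light"],
    "Nearsightedness": ["Yes", "No", "Yes", "Yes", "No", "No"],
})

# Contingency table of raw counts; combinations that never occur become 0.
counts = pd.crosstab(sample["Light"], sample["Nearsightedness"])

# The groupby/unstack route; fill_value=0 fills combinations that never occur.
manual = sample.groupby(["Light", "Nearsightedness"]).size().unstack(fill_value=0)

# margins=True appends row and column totals under the label "All".
with_totals = pd.crosstab(sample["Light"], sample["Nearsightedness"], margins=True)

# normalize="index" converts each row to proportions that sum to 1.
row_share = pd.crosstab(sample["Light"], sample["Nearsightedness"], normalize="index")

print(counts)
print(with_totals)
print(row_share)
```

Without `fill_value=0`, the groupby/unstack version leaves NaN for any Light/Nearsightedness pair that does not appear in the data, whereas crosstab reports 0 for those cells.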