I often need cross tables for pre-analysis of my data. I can produce a basic cross table with pd.crosstab(df['column'], df['column'])
but fail to add a criterion (logical expression) that restricts this cross table to a subset of my dataframe.
I've tried pd.crosstab(df['health'], df['money']) if df['year']==1988
and several positions for the if. I hope it's easy to solve, but I'm relatively new to Python and Pandas.
import pandas as pd
df = pd.DataFrame({'year': ['1988', '1988', '1988', '1988', '1989', '1989', '1989', '1989'],
'health': ['2', '2', '3', '1', '3', '5', '2', '1'],
'money': ['5', '7', '8', '8', '3', '3', '7', '8']}).astype(int)
# cross table for 1988 and 1989
pd.crosstab(df['health'], df['money'])
Filter by boolean indexing before crosstab:
df1 = df[df['year']==1988]
df2 = pd.crosstab(df1['health'], df1['money'])
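As a side note on why the trailing `if` in the question cannot work: `df['year']==1988` does not evaluate to a single True/False but to a boolean Series with one entry per row, and that Series is what boolean indexing uses to pick rows. A minimal sketch with the sample data from the question:

```python
import pandas as pd

df = pd.DataFrame({'year': ['1988', '1988', '1988', '1988', '1989', '1989', '1989', '1989'],
                   'health': ['2', '2', '3', '1', '3', '5', '2', '1'],
                   'money': ['5', '7', '8', '8', '3', '3', '7', '8']}).astype(int)

# The comparison is evaluated row by row and returns a boolean Series
mask = df['year'] == 1988
print(mask.tolist())  # [True, True, True, True, False, False, False, False]

# Indexing with the mask keeps only the rows where it is True
df1 = df[mask]
print(len(df1))  # 4
```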
EDIT: You can filter each column separately:
mask = df['year']==1988
df2 = pd.crosstab(df.loc[mask, 'health'], df.loc[mask, 'money'])
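Both variants should give the same table. The sketch below checks that on the sample data and also shows `DataFrame.query` as a third way to express the filter; the variable names (`by_subset`, `by_mask`, `by_query`) are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'year': ['1988', '1988', '1988', '1988', '1989', '1989', '1989', '1989'],
                   'health': ['2', '2', '3', '1', '3', '5', '2', '1'],
                   'money': ['5', '7', '8', '8', '3', '3', '7', '8']}).astype(int)

# Variant 1: filter the whole DataFrame first, then build the cross table
df1 = df[df['year'] == 1988]
by_subset = pd.crosstab(df1['health'], df1['money'])

# Variant 2: apply the same mask to each column separately
mask = df['year'] == 1988
by_mask = pd.crosstab(df.loc[mask, 'health'], df.loc[mask, 'money'])

# Variant 3 (alternative spelling): express the filter as a query string
sub = df.query('year == 1988')
by_query = pd.crosstab(sub['health'], sub['money'])

print(by_subset)
print(by_subset.equals(by_mask) and by_subset.equals(by_query))  # True
```

Filtering once and reusing the filtered frame tends to be the easiest to read when several cross tables are needed for the same subset.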