
Python pandas: convert DataFrame values to 0/1 indicator columns


I have a DataFrame like this:

Name  yr
Dave  2019
Dave  2020
stu   2019
stu   2021

and I want a result like this:

Name  2019_yr  2020_yr  2021_yr
Dave  1        1        0
stu   1        0        1

I tried to use get_dummies, but the result has one row per input row instead of one row per name:

Name  yr_2018  yr_2019  yr_2021  yr_2022
Dave  0        1        0        0
Dave  1        0        0        0
stu   0        0        0        1
stu   0        0        1        0

Solution

  • If you need 0/1 output, group by Name and aggregate with max:

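    # Name is a column (columns=['yr'] encodes yr even if it is numeric;
    # dtype=int gives 0/1 instead of booleans on pandas >= 2.0)
    df = pd.get_dummies(df.set_index('Name'), columns=['yr'], dtype=int).groupby('Name').max()
    

    # Name is already the index
    df = pd.get_dummies(df, columns=['yr'], dtype=int).groupby('Name').max()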
    

    Or:

    df = pd.crosstab(df['Name'], df['yr']).add_suffix('_yr').clip(upper=1)
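
For reference, here is a minimal end-to-end sketch of both approaches on the sample data from the question. It assumes yr is stored as integers; the variable names dummies and flags are only illustrative. The columns=['yr'] and dtype=int arguments are there so that a numeric yr column is still encoded and the result holds 0/1 rather than booleans on recent pandas versions.

```python
import pandas as pd

# Sample data from the question (yr assumed to be integers)
df = pd.DataFrame({
    "Name": ["Dave", "Dave", "stu", "stu"],
    "yr": [2019, 2020, 2019, 2021],
})

# Approach 1: one-hot encode yr, then collapse to one row per Name.
# max() keeps a 1 if the name appeared in that year at least once.
dummies = (
    pd.get_dummies(df.set_index("Name"), columns=["yr"], dtype=int)
      .groupby("Name")
      .max()
)

# Approach 2: count (Name, yr) pairs, then cap the counts at 1
# so repeated pairs still produce a 0/1 flag.
flags = pd.crosstab(df["Name"], df["yr"]).add_suffix("_yr").clip(upper=1)

print(dummies)
print(flags)
```

Both results have one row per name, with Dave as 1, 1, 0 and stu as 1, 0, 1. The only difference is the column naming: get_dummies produces yr_2019, yr_2020, yr_2021, while the crosstab version with add_suffix produces 2019_yr, 2020_yr, 2021_yr as requested in the question.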