Tags: machine-learning, encoding, missing-data, categorical-data, feature-engineering

How to perform target guided encoding on a particular feature excluding 'nan' values?


from category_encoders import TargetEncoder
encoder=TargetEncoder()

for i in df['gender']:
    df['gender']=np.where(df[i]!='nan',encoder.fit_transform(data['gender'],data['target']),'nan')
  • The unique values in the gender column are: 'Male', 'Female', 'other' and 'nan'
  • I want to encode all the values except 'nan'
  • I tried the code above, but it gives me the following error:

{KeyError: 'Male'}

  • Is there another way to do this, or how can I get this code to work correctly?

Solution

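  • After a lot of searching, I found that TargetEncoder already has a built-in option for this: handle_missing='return_nan', which returns NaN for missing values instead of encoding them. Try this: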

    from category_encoders import TargetEncoder
    
    encoder = TargetEncoder(handle_missing = 'return_nan')
    df['gender'] = encoder.fit_transform(df['gender'], df['target'])
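
One caveat, as far as I can tell: handle_missing applies to real missing values (NaN/None). If the column holds the literal string 'nan', as the question suggests, it would likely be treated as just another category, so it is worth converting it to a real NaN first. The sketch below shows the same idea in plain pandas, with no category_encoders involved. The toy data and the gender_encoded column name are made up for illustration, and it uses a plain per-category mean of the target, so the numbers will not match TargetEncoder's smoothed output.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "gender": ["Male", "Female", "other", "nan", "Male", "Female"],
    "target": [1, 0, 1, 0, 0, 0],
})

# Replace the literal string 'nan' with a real missing value (NaN).
gender = df["gender"].mask(df["gender"] == "nan")

# Mean of the target per category; groupby skips NaN keys by default.
category_means = df["target"].groupby(gender).mean()

# Rows whose gender is NaN have no entry in category_means, so they stay NaN.
df["gender_encoded"] = gender.map(category_means)

print(df)
```

The same mask() step can be applied to df['gender'] before calling encoder.fit_transform in the answer above, so that the encoder sees real NaN values.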