python pandas machine-learning scikit-learn sklearn-pandas

TypeError while trying to label-encode user input data


I encoded the labels in my training data with this code in Python 3:

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
dframe["hair"] = le.fit_transform(dframe["hair"])
dframe["beard"] = le.fit_transform(dframe["beard"])
dframe["scarf"] = le.fit_transform(dframe["scarf"])

After training my model, I want to test it using input from the user.

I'm trying to encode the user input by using this code:

user_input["hair"] = le.transform(user_input["hair"])
user_input["beard"] = le.transform(user_input["beard"])
user_input["scarf"] = le.transform(user_input["scarf"])

But I'm receiving the following error:

TypeError: '<' not supported between instances of 'int' and 'str'

I've seen multiple duplicates of this question on Stack Overflow but still couldn't find a solution, so instead of marking it as a duplicate, kindly provide a helpful solution. I'm a machine learning beginner, so feel free to point out any mistakes in this code. You can also ask for the full code.


Solution

  • A LabelEncoder stores the mapping from categorical values to integers that it learned in its most recent fit. When you fit the same encoder several times, it keeps only the last mapping (the one for 'scarf'). When you then try to transform the user input for 'hair', the input values do not match the classes the encoder has stored.

    The solution is to fit three label encoders:

    from sklearn import preprocessing
    le_hair = preprocessing.LabelEncoder()
    le_beard = preprocessing.LabelEncoder()
    le_scarf = preprocessing.LabelEncoder()
    dframe["hair"] = le_hair.fit_transform(dframe["hair"])
    dframe["beard"] = le_beard.fit_transform(dframe["beard"])
    dframe["scarf"] = le_scarf.fit_transform(dframe["scarf"])

    Then use each encoder to transform the corresponding column of the new input.
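
For illustration, here is a minimal end-to-end sketch of that approach, keeping the per-column encoders in a dict instead of three separate variables. The sample values in `dframe` and `user_input` are made up (the original data is not shown in the question), and the `encoders` dict is just a naming choice for this example.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical training data; the values are invented for this example.
dframe = pd.DataFrame({
    "hair": ["long", "short", "bald", "short"],
    "beard": ["yes", "no", "no", "yes"],
    "scarf": ["no", "yes", "no", "no"],
})

# Fit one encoder per column and keep each one for later use.
encoders = {}
for col in ["hair", "beard", "scarf"]:
    encoders[col] = LabelEncoder()
    dframe[col] = encoders[col].fit_transform(dframe[col])

# Hypothetical user input, using only values seen during training.
user_input = pd.DataFrame({"hair": ["bald"], "beard": ["yes"], "scarf": ["no"]})

# Transform each column with the encoder that was fitted on that column.
for col, le in encoders.items():
    user_input[col] = le.transform(user_input[col])

print(user_input)
```

Note that `transform` raises a ValueError if the user enters a value the encoder did not see during fitting, so the input may need to be validated against `encoders[col].classes_` first.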