ValueError: could not convert string to float: 'what' (sklearn): how to use LabelEncoder?

I have an input set and an output set for training:

X = df['First Word']

y = df['Answers']

When I tried:

from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier()
model.fit(X,y)
predictions = model.predict(['how'])

I got the error:

ValueError: could not convert string to float: 'what'

The error indicates that string values cannot be passed to the fit() method.

How can I use LabelEncoder in this case so that the above code works?

Solution

scikit-learn models such as DecisionTreeClassifier need numeric input features, so you have to encode the string column first, using either label encoding or one-hot encoding depending on your needs.
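For the one-hot option, here is a minimal sketch. The toy DataFrame is made up to stand in for your data, and putting OneHotEncoder in a pipeline is just one possible arrangement, not the only way to do it.

```python
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data; replace with your own DataFrame
df = pd.DataFrame({
    "First Word": ["what", "how", "what", "why", "how", "why"],
    "Answers": ["definition", "procedure", "definition",
                "reason", "procedure", "reason"],
})

X = df[["First Word"]]  # double brackets keep X two-dimensional
y = df["Answers"]       # string class labels are fine for the target

# The pipeline encodes the strings before they reach the tree.
# handle_unknown="ignore" encodes unseen words as all zeros instead of raising.
model = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),
    DecisionTreeClassifier(random_state=0),
)
model.fit(X, y)

predictions = model.predict(pd.DataFrame({"First Word": ["how"]}))
print(predictions)
```

Because the encoder lives inside the pipeline, the same encoding is applied automatically at predict time.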

To use LabelEncoder, encode the column with the code below:

 from sklearn import preprocessing
 le = preprocessing.LabelEncoder()
 # fit_transform returns a 1-D integer array; fit() expects a 2-D feature array
 X = le.fit_transform(X).reshape(-1, 1)

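After encoding, pass X to the model and that error should go away. The model expects a 2-D array of shape (n_samples, n_features), so a single encoded column has to be reshaped with .reshape(-1, 1). Apply the same fitted encoder to anything you pass to predict(), for example le.transform(['how']).reshape(-1, 1), since predict(['how']) would hit the same string problem.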
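Putting the label-encoding steps together, here is a small end-to-end sketch. The toy DataFrame is invented for illustration; only the column names come from the question.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data; replace with your own DataFrame
df = pd.DataFrame({
    "First Word": ["what", "how", "what", "why", "how", "why"],
    "Answers": ["definition", "procedure", "definition",
                "reason", "procedure", "reason"],
})

le = LabelEncoder()
# Map each distinct word to an integer, then reshape to (n_samples, 1)
X = le.fit_transform(df["First Word"]).reshape(-1, 1)
y = df["Answers"]  # string class labels are accepted for the target

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Reuse the fitted encoder (transform, not fit_transform) for new input
new_X = le.transform(["how"]).reshape(-1, 1)
predictions = model.predict(new_X)
print(predictions)
```

Note that le.transform raises an error for a word that was not present during fitting, and that the integer codes imply an ordering between words that has no real meaning. A decision tree can usually cope with that, but one-hot encoding avoids the issue.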