Here the shape of df is (190, 2), where the 1st column is x, a categorical value, and the 2nd column is an integer.
X = df.iloc[:,0].values
y = df.iloc[:,-1].values
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder = LabelEncoder()
X = labelencoder.fit_transform(X)
X.reshape(-1,1)
onehotencoder = OneHotEncoder(categories = [0])
X = onehotencoder.fit_transform(X).toarray()
Here I wanted to encode the categorical value X using OneHotEncoder in order to predict y, but when I run this code I get an error:
ValueError: bad input shape ()
Can someone help me resolve this issue? Thanks.
In current versions of scikit-learn, OneHotEncoder does not require the input features to be numerical, so you can feed it the categorical features directly, without a LabelEncoder step:
onehotencoder = OneHotEncoder()
X_oh = onehotencoder.fit_transform(X).toarray()
If you have a 1D array, as is usually the case for y (and as is the case for X here, since df.iloc[:,0].values is 1D), you'll need to reshape the array into a 2D one:
onehotencoder = OneHotEncoder()
X_oh = onehotencoder.fit_transform(X.reshape(-1,1)).toarray()
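As a rough end-to-end illustration of the reshape-then-encode step, here is a minimal sketch. The toy array of colour labels is made up for the example and only stands in for df.iloc[:,0].values; it is not the asker's data.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for df.iloc[:, 0].values: a 1D array of category labels.
X = np.array(["red", "green", "blue", "green"])

# OneHotEncoder expects 2D input of shape (n_samples, n_features),
# so the single feature is reshaped into one column.
X_2d = X.reshape(-1, 1)

onehotencoder = OneHotEncoder()
# fit_transform returns a sparse matrix by default; toarray() makes it dense.
X_oh = onehotencoder.fit_transform(X_2d).toarray()

print(X_oh.shape)                  # one row per sample, one column per category
print(onehotencoder.categories_)   # the categories found for the feature
```

Each row of X_oh contains a single 1 in the column of its category, and the column order follows onehotencoder.categories_.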
Do note, however, that the following:
X.reshape(-1,1)
is not doing anything on its own. reshape does not operate in place: it returns a new array, so you have to assign the result back to a variable, e.g. X = X.reshape(-1,1).
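A small sketch of the difference, using a made-up integer array in place of the label-encoded X (the names shape_before and shape_after are only for illustration):

```python
import numpy as np

# Hypothetical stand-in for the label-encoded X from the question.
X = np.array([0, 1, 2, 1])

X.reshape(-1, 1)          # the reshaped array is discarded; X itself is unchanged
shape_before = X.shape    # still 1D

X = X.reshape(-1, 1)      # assign the result back to keep the 2D version
shape_after = X.shape     # now a single column

print(shape_before, shape_after)
```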