python-3.x machine-learning scikit-learn one-hot-encoding

How to pass raw test data to a model for prediction when OneHotEncoder was applied to the training data

I am using sklearn.preprocessing to preprocess the categorical data with OneHotEncoder.

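from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
onehotencoder = OneHotEncoder()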
pre_loc_data1 = onehotencoder.fit_transform(pre_loc_data1.astype(str)).toarray()
print(pre_loc_data1)

X_train, X_test, y_train, y_test = train_test_split(pre_loc_data1, pre_loc_target, test_size=0.2)

Here X_train is now encoded data. If I pass X_test to the model for prediction, it works fine because it is also encoded. But I want to pass an individual, unencoded record to the model, like this:

(clf.predict(['Hyderabad / Secunderabad','0 Year(s) 8 Month(s)','android','java']))

How can I give this kind of raw data to the model as input?

Thanks in advance!

Solution

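Transform the raw record with the same fitted onehotencoder before predicting (assuming clf is your trained model). Call transform, not fit_transform, so the columns match the ones seen during training, and pass the record as a 2D list with one inner list per sample. Append .toarray() if your model does not accept sparse input: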

clf.predict(onehotencoder.transform([['Hyderabad / Secunderabad','0 Year(s) 8 Month(s)','android','java']]))
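For reference, here is a minimal end-to-end sketch of the same idea. The toy rows, the labels, and the choice of DecisionTreeClassifier are assumptions made up for illustration and are not from the question; handle_unknown='ignore' is an optional setting that encodes unseen categories as all zeros instead of raising an error.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for pre_loc_data1 / pre_loc_target (not the asker's data).
raw_X = np.array([
    ['Hyderabad / Secunderabad', '0 Year(s) 8 Month(s)', 'android', 'java'],
    ['Bengaluru / Bangalore', '2 Year(s) 0 Month(s)', 'python', 'sql'],
    ['Hyderabad / Secunderabad', '2 Year(s) 0 Month(s)', 'android', 'sql'],
    ['Bengaluru / Bangalore', '0 Year(s) 8 Month(s)', 'python', 'java'],
])
raw_y = np.array([1, 0, 1, 0])

# Fit the encoder once, on the training data only.
encoder = OneHotEncoder(handle_unknown='ignore')
X_encoded = encoder.fit_transform(raw_X).toarray()

# Assumed model; any estimator trained on X_encoded works the same way.
clf = DecisionTreeClassifier(random_state=0).fit(X_encoded, raw_y)

# A single raw record must be 2D: one inner list per sample.
record = [['Hyderabad / Secunderabad', '0 Year(s) 8 Month(s)', 'android', 'java']]

# Reuse the fitted encoder with transform (not fit_transform).
record_encoded = encoder.transform(record).toarray()
prediction = clf.predict(record_encoded)

# A category never seen in training becomes all zeros for that column group.
unseen = [['Chennai', '0 Year(s) 8 Month(s)', 'android', 'java']]
unseen_encoded = encoder.transform(unseen).toarray()
```

If you would rather not encode by hand each time, putting the encoder and the model into a sklearn.pipeline.Pipeline lets you call predict on raw records directly.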