Tags: python-3.x, machine-learning, scikit-learn, one-hot-encoding

How to pass test data to obtain model predictions if OneHotEncoder was applied to the training data


I am using sklearn.preprocessing to preprocess the categorical data with OneHotEncoder.

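from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Fit the encoder on the categorical columns and one-hot encode them
onehotencoder = OneHotEncoder()
pre_loc_data1 = onehotencoder.fit_transform(pre_loc_data1.astype(str)).toarray()
print(pre_loc_data1)

X_train, X_test, y_train, y_test = train_test_split(pre_loc_data1, pre_loc_target, test_size=0.2)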

Here X_train is now encoded data. If I pass X_test to the model for prediction, it works fine, because it is also encoded. But I want to pass an individual, unencoded record to the model, like below:

(clf.predict(['Hyderabad / Secunderabad','0 Year(s) 8 Month(s)','android','java']))

How can I give this kind of data to the model as input for prediction?

Thanks in advance!


Solution

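  • You need to transform the input with the same fitted onehotencoder before predicting (assuming clf is your trained model). Pass the record as a 2D list, one inner list per row, and convert the result to a dense array so it matches the format used for training:

    clf.predict(onehotencoder.transform([['Hyderabad / Secunderabad','0 Year(s) 8 Month(s)','android','java']]).toarray())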
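To see the whole flow in one place, here is a minimal sketch. The toy rows, the targets, and the choice of DecisionTreeClassifier are assumptions made only for illustration, since the question does not show the real data or model. It also sets handle_unknown="ignore", which is optional and not part of the original code: with it, a category the encoder never saw during fitting is encoded as all zeros instead of raising an error.

```python
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for pre_loc_data1 / pre_loc_target.
raw_rows = [
    ["Hyderabad / Secunderabad", "0 Year(s) 8 Month(s)", "android", "java"],
    ["Bengaluru / Bangalore", "2 Year(s) 0 Month(s)", "python", "sql"],
    ["Hyderabad / Secunderabad", "2 Year(s) 0 Month(s)", "python", "java"],
    ["Bengaluru / Bangalore", "0 Year(s) 8 Month(s)", "android", "sql"],
]
targets = [0, 1, 1, 0]

# Fit the encoder once on the training data and keep the fitted object.
# handle_unknown="ignore" (an optional choice) encodes unseen categories as zeros.
onehotencoder = OneHotEncoder(handle_unknown="ignore")
X_encoded = onehotencoder.fit_transform(raw_rows).toarray()

# Stand-in model; any scikit-learn classifier is used the same way here.
clf = DecisionTreeClassifier(random_state=0).fit(X_encoded, targets)

# A single raw record must be 2D: one inner list per row (note the double brackets).
record = [["Hyderabad / Secunderabad", "0 Year(s) 8 Month(s)", "android", "java"]]

# Use transform (not fit_transform) so the columns match those seen in training.
encoded_record = onehotencoder.transform(record).toarray()
prediction = clf.predict(encoded_record)
print(prediction)
```

The key point is that the encoder is fitted only once and then reused: calling fit_transform again on the single record would produce a different number of columns than the model was trained on.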