I am new to data mining and was trying to implement a KNN classifier on separate training and testing datasets. All the tutorials I see use the train_test_split method to split the dataset, whereas I already have the dataset split into train and test. How do I assign the target variable?
I am assuming that your test data is labelled (i.e. logically divided into test_X and test_y), and that you will use it to test the performance of the model you trained on the train data.
Load train data into (train_X, train_y) and load test data into (test_X, test_y)
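Since the question is how to assign the target variable, here is a minimal sketch of that loading step with pandas. It assumes each file is a CSV whose target sits in a single column; the column name "label", the feature names, and the inline data are hypothetical stand-ins for your own files.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two files; with real data you would call
# pd.read_csv("train.csv") and pd.read_csv("test.csv") instead.
train_csv = io.StringIO("f1,f2,label\n1.0,2.0,a\n1.5,1.8,a\n5.0,8.0,b\n6.0,9.0,b\n")
test_csv = io.StringIO("f1,f2,label\n1.2,1.9,a\n5.5,8.5,b\n")

train_df = pd.read_csv(train_csv)
test_df = pd.read_csv(test_csv)

target = "label"  # assumed name of the target column

# Features are every column except the target; the target is that one column.
train_X = train_df.drop(columns=[target])
train_y = train_df[target]
test_X = test_df.drop(columns=[target])
test_y = test_df[target]
```

The same column name is used for both files, so the train and test features end up with identical columns in the same order.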
Train your model with train data
from sklearn.neighbors import KNeighborsClassifier
knn_clf = KNeighborsClassifier()
knn_clf.fit(train_X, train_y)
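Predict the labels for the test data with the fitted classifier
y_pred = knn_clf.predict(test_X)
Compare the predictions with the true test labels to get the accuracy
import numpy as np
accuracy = np.mean(test_y == y_pred)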
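If you would rather not compute the accuracy by hand, scikit-learn should give the same number through accuracy_score or the classifier's own score method. The sketch below uses a tiny made-up dataset and n_neighbors=3 purely for illustration; neither comes from the question.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# Made-up data: two well-separated clusters, labelled 0 and 1.
train_X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.1],
                    [5.0, 8.0], [6.0, 9.0], [5.5, 8.5]])
train_y = np.array([0, 0, 0, 1, 1, 1])
test_X = np.array([[1.1, 2.0], [5.8, 8.8]])
test_y = np.array([0, 1])

# n_neighbors=3 is an illustrative choice for this small dataset.
knn_clf = KNeighborsClassifier(n_neighbors=3)
knn_clf.fit(train_X, train_y)
y_pred = knn_clf.predict(test_X)

# Three equivalent ways to get the fraction of correct predictions.
manual_accuracy = np.mean(test_y == y_pred)
sk_accuracy = accuracy_score(test_y, y_pred)
score_accuracy = knn_clf.score(test_X, test_y)
```

The score method predicts internally, so it takes test_X and test_y rather than y_pred.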