r, knn

knn: same k, different result


I have a matrix zz. After running prcomp and keeping the first 5 PCs, I get data_new:

P <- prcomp(zz)
data_new <- P$x[, 1:5]

Then I split it into a training set and a test set:

pca_train = data_new[1:121,]
pca_test = data_new[122:151,]

and use KNN:

k <- knn(pca_train, pca_test, tempGenre_train[,1], k = 5)
a <- data.frame(k)
res <- length(which(a!=tempGenre_test))

Each time I run these last 3 lines, I get a different value in res. Why?

Is there a better way to estimate the test error?


Solution

  • From the documentation of knn,

    For each row of the test set, the k nearest (in Euclidean distance) training set vectors are found, and the classification is decided by majority vote, with ties broken at random.

    If you want the same result on every run, call set.seed immediately before knn so that ties are broken the same way each time.
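
As a rough illustration of the set.seed suggestion, here is a minimal sketch. It assumes the class package (which provides knn) and uses the built-in iris data as a stand-in for the question's zz and genre labels, which are not available here. The helper run_knn, the seed values, and the choice of 2 PCs are hypothetical, chosen only for the example. It also shows one common way to express the test error, as a misclassification rate rather than a raw count.

```r
library(class)  # provides knn()

# Stand-in data (assumption): iris replaces the question's zz and genre labels.
set.seed(1)                              # fixes the shuffle used for the split
idx <- sample(nrow(iris))
pcs <- prcomp(iris[, 1:4])$x[, 1:2]      # keep the first 2 PCs
train_idx <- idx[1:120]
test_idx  <- idx[121:150]

# Hypothetical helper: seed the RNG, then classify the test rows.
run_knn <- function(seed) {
  set.seed(seed)                         # same seed, so ties break the same way
  knn(pcs[train_idx, ], pcs[test_idx, ], iris$Species[train_idx], k = 5)
}

pred1 <- run_knn(42)
pred2 <- run_knn(42)
same_result <- identical(pred1, pred2)   # repeated runs with one seed agree

# Test error as the fraction of misclassified test rows
test_error <- mean(pred1 != iris$Species[test_idx])

# Confusion table: predicted class against true class
confusion <- table(predicted = pred1, actual = iris$Species[test_idx])
```

In the question's code, the equivalent change would be a single set.seed call placed just before the knn line. Dividing the mismatch count by the number of test rows gives an error rate that stays comparable if the test set size changes.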