I am performing text classification. I have created the features, and I have multiple labels to train on and predict, each of which is a binary variable.
Here is my code and the error log.
for (col in colnames(train_data)) {
  train_label <- train_data[, c(col)]
  test_pred <- knn(train = train_mat[, !(colnames(train_mat) == "Sentiment")],
                   test = test_mat[, !(colnames(test_mat) == "Sentiment")],
                   cl = as.factor(train_label), k = 6)
  table(test_pred, test_data[, col])
  acc.RF = mean(test_pred == test_data[, col])
  acc.RF
  confusionMatrix(table(test_pred, test_data[, col]))
}
Error in knn(train = train_mat[, !(colnames(train_mat) == "Sentiment")], :
'train' and 'class' have different lengths
This is the error I get.
Sentiment is the main variable to predict, but I want to train on every label variable present in the original train/test data frames.
Please note that I appended the Sentiment column to train_mat and test_mat, so I exclude it when feeding the features to knn.
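On the error itself: knn stops with this message when the length of cl differs from the number of rows in train. The sketch below reproduces that on made-up data, assuming the class package is installed; toy_train, toy_test, bad_labels, and good_labels are hypothetical names, not objects from the question.

```r
library(class)

set.seed(1)
# Hypothetical toy data: 20 training rows, 5 test rows, 3 numeric features
toy_train <- matrix(rnorm(60), nrow = 20)
toy_test  <- matrix(rnorm(15), nrow = 5)

# Labels with the wrong length (15 instead of 20) trigger the length check
bad_labels <- factor(rep(c(0, 1), length.out = 15))
err <- tryCatch(knn(toy_train, toy_test, cl = bad_labels, k = 3),
                error = function(e) conditionMessage(e))
print(err)

# Labels with one entry per training row pass the check
good_labels <- factor(rep(c(0, 1), length.out = 20))
pred <- knn(toy_train, toy_test, cl = good_labels, k = 3)
print(length(pred))  # one prediction per test row
```

If the same thing is happening in your code, comparing nrow(train_mat) with nrow(train_data) should show the mismatch.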
Consider Map, the wrapper to mapply, and build a list of confusion matrices by passing each column from the test and train data elementwise. Also, consider transform for removing Sentiment:
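library(class)  # knn
library(caret)  # confusionMatrix

matrix_process <- function(test_label, train_label) {
  # Drop Sentiment so that only the feature columns reach knn
  test_pred <- knn(train = transform(train_mat, Sentiment = NULL),
                   test = transform(test_mat, Sentiment = NULL),
                   cl = as.factor(train_label), k = 6)
  # Inside a function, results must be printed explicitly
  print(table(test_pred, test_label))
  acc.RF <- mean(test_pred == test_label)
  print(acc.RF)
  return(confusionMatrix(table(test_pred, test_label)))
}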
conf_matrix_list <- Map(matrix_process, test_data, train_data)
# EQUIVALENTLY:
conf_matrix_list <- mapply(matrix_process, test_data, train_data, SIMPLIFY=FALSE)
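To see how Map pairs the columns, here is a small self-contained sketch on simulated data. Everything in it (feat_train, feat_test, labels_train, labels_test, accuracy_for_label) is made up for illustration, and it returns only the accuracy rather than a caret confusion matrix so that it depends on the class package alone.

```r
library(class)

set.seed(42)
# Hypothetical feature matrices: 50 training rows, 20 test rows, 4 features
feat_train <- matrix(rnorm(200), nrow = 50)
feat_test  <- matrix(rnorm(80),  nrow = 20)

# Hypothetical binary labels, one column per label to predict
labels_train <- data.frame(a = rbinom(50, 1, 0.5), b = rbinom(50, 1, 0.5))
labels_test  <- data.frame(a = rbinom(20, 1, 0.5), b = rbinom(20, 1, 0.5))

accuracy_for_label <- function(test_label, train_label) {
  # Fail early if the labels do not line up with the training rows
  stopifnot(length(train_label) == nrow(feat_train))
  pred <- knn(train = feat_train, test = feat_test,
              cl = factor(train_label), k = 5)
  # Compare as character so factor and integer codes match by value
  mean(as.character(pred) == as.character(test_label))
}

# Map walks the columns of both data frames in parallel:
# first (a, a), then (b, b); the result is named after the first argument
acc_list <- Map(accuracy_for_label, labels_test, labels_train)
str(acc_list)
```

Because a data frame is a list of columns, Map hands the function one test column and the matching train column per call, so the columns need to be in the same order in both data frames.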