I'd like to display the confusion matrix after calling train() from the caret library, but I have some doubts. Should train() be run on a training set only? (I'm not sure, because the trControl parameter already sets up cross-validation.) And should predict() then be run on a test set? It seems weird to predict on the whole data set...
# df_corpus = document-term matrix + 1 column, Cos.code (the class; values such as 203.2.2, 204.3.2 ...)
dataset <- df_corpus
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
seed <- 7
metric <- "Accuracy"
preProcess <- c("center", "scale")
# Linear Discriminant Analysis
set.seed(seed)
fit.lda <- train(Cos.code ~ ., data = dataset, method = "lda", metric = metric, preProcess = preProcess, trControl = control)
ldaClasses <- predict(fit.lda)
cm <- confusionMatrix(data = ldaClasses, reference = dataset$Cos.code)
F1_score(cm$table, "lda")
Thank you for your help
You can get the confusion matrix like this:
confusionMatrix(predict(fit.lda, newdata = dataset), dataset$Cos.code)
You can calculate the confusion matrix for your test set in the same way: pass the test data as newdata to predict() and use its Cos.code column as the reference.
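For illustration, here is a minimal sketch of that train/test workflow. It assumes the built-in iris data as a stand-in for df_corpus (with Species playing the role of Cos.code) and an 80/20 split; the names train_idx, train_set, test_set and cm_test are made up for this example.

```r
library(caret)

# Stand-in data: iris replaces df_corpus, Species replaces Cos.code (assumption)
dataset <- iris

set.seed(7)
# Stratified split: roughly 80% of each class goes to the training set
train_idx <- createDataPartition(dataset$Species, p = 0.8, list = FALSE)
train_set <- dataset[train_idx, ]
test_set  <- dataset[-train_idx, ]

# Cross-validation here happens inside the training set only
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
fit.lda <- train(Species ~ ., data = train_set, method = "lda",
                 metric = "Accuracy", preProcess = c("center", "scale"),
                 trControl = control)

# Predict on rows the model has never seen, then compare with the true classes
test_pred <- predict(fit.lda, newdata = test_set)
cm_test <- confusionMatrix(data = test_pred, reference = test_set$Species)
print(cm_test)
```

The repeated cross-validation set up by trainControl() only estimates performance within the data given to train(), so a held-out test set still gives an independent check.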
But I believe your model already contains much of the information you want. Examine what is printed for these two objects:
fit.lda
fit.lda$finalModel
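As a further sketch, caret can also report a confusion matrix built from the cross-validation hold-outs by calling confusionMatrix() directly on the train object. The example below again assumes iris in place of df_corpus; cv_cm is a name invented here.

```r
library(caret)

set.seed(7)
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
# iris stands in for df_corpus, Species for Cos.code (assumption)
fit.lda <- train(Species ~ ., data = iris, method = "lda",
                 metric = "Accuracy", trControl = control)

# Resampled Accuracy and Kappa, the same figures shown when printing fit.lda
print(fit.lda$results)

# Confusion matrix aggregated over the hold-out folds of the resampling
cv_cm <- confusionMatrix(fit.lda)
print(cv_cm)
```

By default the cells of this table are averaged over resamples and expressed as percentages rather than raw counts, so it will not look identical to the matrix obtained from predict() on a fixed data set.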