R caret / Confusion matrix


I'd like to display the confusion matrix after calling train() from the caret library, but I have some doubts. Should train() be run on a training set only? (I'm not sure, because of the "control" parameter.) And should predict() be run on the test set? It seems odd to predict on the whole data set...

library(caret)

# df_corpus = document-term matrix + one column, Cos.code (the class: 203.2.2, 204.3.2, ...)
dataset <- df_corpus
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
seed <- 7

metric <- "Accuracy"
preProcess <- c("center", "scale")

# Linear Discriminant Analysis
set.seed(seed)
fit.lda <- train(Cos.code ~ ., data = dataset, method = "lda", metric = metric, preProcess = preProcess, trControl = control)
# With no newdata, predict() returns predictions for the training data
ldaClasses <- predict(fit.lda)
cm <- confusionMatrix(data = ldaClasses, reference = dataset$Cos.code)
# F1_score() is not part of caret; presumably a user-defined helper
F1_score(cm$table, "lda")

Thank you for your help


Solution

  • You can get the confusion matrix like this:

    confusionMatrix(predict(fit.lda, newdata = dataset), dataset$Cos.code)
    

    You can calculate the confusion matrix for your testing set in the same way: pass the test data frame to predict() and use its Cos.code column as the reference.

    But I believe your model already contains the information you want. Examine the output printed for these two objects (printing fit.lda reports the resampled Accuracy and Kappa):

    fit.lda
    
    fit.lda$finalModel
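
To make the train/test question concrete, here is a minimal sketch of one common workflow, not necessarily what the asker ended up using. It assumes the built-in iris data as a stand-in for df_corpus (with Species playing the role of Cos.code), an 80/20 split chosen only for illustration, and that the MASS package is installed for method = "lda". The names train_set, test_set, cm_test and cm_cv are hypothetical. The repeated cross-validation in trainControl() only resamples within the data passed to train(), so a separate held-out test set still gives an independent estimate.

```r
library(caret)  # method = "lda" also requires the MASS package to be installed

set.seed(7)

# Assumption: iris stands in for df_corpus; Species plays the role of Cos.code
dataset <- iris

# Stratified 80/20 split into training and test rows
idx <- createDataPartition(dataset$Species, p = 0.8, list = FALSE)
train_set <- dataset[idx, ]
test_set  <- dataset[-idx, ]

# Repeated 10-fold cross-validation, run on the training set only
control <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
fit.lda <- train(Species ~ ., data = train_set, method = "lda",
                 metric = "Accuracy", preProcess = c("center", "scale"),
                 trControl = control)

# Confusion matrix on the held-out test set
test_pred <- predict(fit.lda, newdata = test_set)
cm_test <- confusionMatrix(data = test_pred, reference = test_set$Species)
print(cm_test)

# Confusion matrix aggregated over the cross-validation resamples
cm_cv <- confusionMatrix(fit.lda)
print(cm_cv)
```

Calling confusionMatrix() directly on the train object summarises the held-out folds from resampling, so it is a less optimistic view than predicting back on the same rows the model was fitted to.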