r, r-caret, confusion-matrix, rpart

Error in confusionMatrix: the data cannot have more levels than the reference (caret)


The dataset can be found here: https://archive.ics.uci.edu/ml/datasets/Bank+Marketing#

set.seed(1234)
ind <- sample(2, nrow(bank), replace = TRUE, prob = c(0.7, 0.3))
train.data <- bank[ind == 1, ]
test.data <- bank[ind == 2, ]

While searching for a solution, I tried converting both arguments to factors in confusionMatrix(), but that did not solve the problem:

cartmodel <- rpart(y ~., data = train.data)
cartmodel
cart.pred = predict(cartmodel, test.data)
summary(cart.pred)
confusionMatrix(as.factor(cart.pred),as.factor(test.data$y))
confusionMatrix

What do I need to change? The dataset is the Bank Marketing data, so it has both numeric and factor attributes.

Update: I tried converting all attributes to factors, but I still get the error.


Solution

  • Using the CSV from UCI (you can also try this link):

    library(rpart)
    library(caret)
    # R >= 4.0 reads strings as character by default; keep them as factors
    bank = read.csv("../bank/bank-full.csv", sep = ";", stringsAsFactors = TRUE)
    
    set.seed(1234)
    ind <- sample(2, nrow(bank), replace = TRUE, prob = c(0.7, 0.3))
    train.data <- bank[ind == 1, ]
    test.data <- bank[ind == 2, ]
    

    For a classification tree, predict() returns class probabilities by default, not labels:

    cartmodel <- rpart(y ~., data = train.data)
    cart.pred = predict(cartmodel, test.data)
    head(cart.pred)
    
              no        yes
    5  0.9393461 0.06065387
    14 0.9393461 0.06065387
    16 0.9393461 0.06065387
    26 0.9393461 0.06065387
    28 0.9393461 0.06065387
    29 0.9393461 0.06065387
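
This is also why wrapping the result in as.factor() does not help: the numeric probability matrix is flattened, and every distinct probability becomes its own level. A minimal base R sketch of that effect, using a made-up matrix (the values in `probs` and the labels in `reference` are hypothetical, not taken from the bank data):

```r
# Hypothetical probability matrix, shaped like the predict() output above
probs <- matrix(c(0.94, 0.06,
                  0.94, 0.06,
                  0.35, 0.65),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("no", "yes")))

# Hypothetical reference labels, one per row of probs
reference <- factor(c("no", "no", "yes"), levels = c("no", "yes"))

# as.factor() treats the matrix as a plain numeric vector
flattened <- as.factor(probs)

length(flattened)    # one entry per matrix cell, not one per row
levels(flattened)    # distinct probabilities, not "no"/"yes"

# More levels than the reference: the situation confusionMatrix() rejects
nlevels(flattened) > nlevels(reference)
```

On the real data the tree has only a handful of leaves, so the flattened factor would likely have a few levels per class column, which is still more than the two levels of y.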
    

    To get class labels, pass type = "class":

    cart.pred = predict(cartmodel, test.data,type="class")
    confusionMatrix(cart.pred,test.data$y)
    
    Confusion Matrix and Statistics
    
              Reference
    Prediction    no   yes
           no  11710  1039
           yes   302   517
                                             
                   Accuracy : 0.9012         
                     95% CI : (0.896, 0.9061)
        No Information Rate : 0.8853         
        P-Value [Acc > NIR] : 1.831e-09      
                                             
                      Kappa : 0.3869
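
The same behaviour can be reproduced without downloading the bank data. The sketch below is only an illustration: it assumes the kyphosis data set that ships with rpart as a stand-in (it is not part of the original question), predicts on the training data for brevity, and builds the confusion table with base R's table() rather than caret. The names `fit`, `pred_class`, and `conf_table` are made up for this example.

```r
library(rpart)

# kyphosis ships with rpart; it stands in for the bank data here
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Default for a classification tree: a matrix of class probabilities
probs <- predict(fit, kyphosis)

# type = "class" returns a factor with the same levels as the response
pred_class <- predict(fit, kyphosis, type = "class")

# Base R cross-tabulation of predicted vs. observed classes
conf_table <- table(Prediction = pred_class, Reference = kyphosis$Kyphosis)
conf_table

# Accuracy is the share of cases on the diagonal
# (optimistic here, because the model is scored on its own training data)
accuracy <- sum(diag(conf_table)) / sum(conf_table)
accuracy
```

Because `pred_class` and `kyphosis$Kyphosis` are factors with identical levels, they should also be accepted by confusionMatrix() in the same way as the bank example above.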