r  classification  decision-tree  rpart  precision-recall

Compute precision, recall, and F1 for an rpart classification result


The data frame below is a sample of my full data set:

A    B    C    D     E     target
0.2  0.5  0.6  -0.5  -0.7  1
0.9  0.7  0.4  -0.3  -0.8  0
0.1  0.3  0.5  -0.9  -0.2  0
0.2  0.5  0.6  -0.5  -0.6  1

I want to fit a classification tree to it, so I used the code below:

data$target<-factor(data$target)

# Create Training Data
train.ind <- sample(nrow(data), 0.7*nrow(data))
trainData<-data[train.ind,]
testData<-data[-train.ind,]    

library("rpart")
tree <- rpart(target ~.,data=trainData)

ypred=predict(tree,testData)

library(caret)
# Print a confusion matrix
result <- confusionMatrix(ypred, testData$target)

Error: data and reference should be factors with the same levels.

precision <- result$byClass['Pos Pred Value']
recall <- result$byClass['Sensitivity']
f_measure <- 2 * ((precision * recall) / (precision + recall))
# OR
f_measure <- result$byClass['F1']

But it did not work. I need the precision, recall, and F1 values, but I don't know how to compute them from the result of the "rpart" package.


Solution

  • The function confusionMatrix takes two arguments, data and reference, which must be factors with the same levels. Your code does not satisfy this because predict, when given an rpart classification tree, by default returns a matrix of class-membership probabilities with one row per sample. You need to ask predict for predicted classes instead and make sure the result is a factor with the same levels as target (0 and 1).

    This should do the trick:

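    # type = "class" returns the predicted labels as a factor; type = "vector"
    # would return class indices (1, 2), which do not match the levels "0" and "1"
    ypred <- factor(predict(tree, testData[, -6], type = "class"),
                    levels = levels(testData$target))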
    library(caret)
    confusionMatrix(ypred, testData$target)
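
To see what predict returns for each type, here is a minimal sketch on the kyphosis data set that is bundled with rpart. It is not the question's data, and the names fit, prob, cls, and idx are only illustrative.

```r
library(rpart)  # recommended package, normally installed alongside R

# kyphosis ships with rpart; its response Kyphosis is a factor
# with the levels "absent" and "present".
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Default for a classification tree: a matrix of class probabilities,
# one column per level of the response.
prob <- predict(fit, kyphosis)

# type = "class": a factor of predicted labels, one per row.
cls <- predict(fit, kyphosis, type = "class")

# type = "vector": the index of the predicted class (1, 2), not its label.
idx <- predict(fit, kyphosis, type = "vector")

str(prob)
str(cls)
str(idx)
```

Only the type = "class" result can be compared directly against a factor of true labels.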
    

    Using factor(..., levels = levels(testData$target)) ensures that the levels are in the same order in both factors, which avoids the following warning:

    Warning message: In confusionMatrix.default(ypred, testData$target) : Levels are not in the same order for reference and data. Refactoring data to match.
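
Once the predictions and the reference are factors with the same levels, the precision, recall, and F1 values the question asks for can also be computed by hand from a plain table. The sketch below uses made-up labels purely for illustration and assumes that "1" is the class of interest. The names pred, ref, and positive are hypothetical.

```r
# Made-up predicted and true labels, for illustration only.
lv   <- c("0", "1")
pred <- factor(c(1, 0, 1, 1, 0, 0, 1, 0), levels = lv)
ref  <- factor(c(1, 0, 0, 1, 0, 1, 1, 0), levels = lv)

positive <- "1"  # assumption: "1" is the class to report metrics for

# Rows are predictions, columns are the true labels.
cm <- table(Predicted = pred, Reference = ref)

tp <- cm[positive, positive]       # predicted positive and truly positive
fp <- sum(cm[positive, ]) - tp     # predicted positive but truly negative
fn <- sum(cm[, positive]) - tp     # predicted negative but truly positive

precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

print(c(precision = precision, recall = recall, F1 = f1))
```

With caret, the same numbers should be available from the confusionMatrix result. Note that caret treats the first factor level as the positive class by default, so for this data you would likely pass positive = "1". Recent caret versions also expose the byClass entries 'Precision', 'Recall', and 'F1' alongside 'Pos Pred Value' and 'Sensitivity'.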