The data frame below is a sample of my full data set:
A B C D E target
0.2 0.5 0.6 -0.5 -0.7 1
0.9 0.7 0.4 -0.3 -0.8 0
0.1 0.3 0.5 -0.9 -0.2 0
0.2 0.5 0.6 -0.5 -0.6 1
I want to fit a classification tree to it, so I used the code below:
data$target<-factor(data$target)
# Create Training Data
train.ind <- sample(nrow(data), 0.7*nrow(data))
trainData<-data[train.ind,]
testData<-data[-train.ind,]
library("rpart")
tree <- rpart(target ~.,data=trainData)
ypred=predict(tree,testData)
library(caret)
#Print a confusion matrix
result <- confusionMatrix(ypred,testData$target)
Error: data and reference should be factors with the same levels.
precision <- result$byClass['Pos Pred Value']
recall <- result$byClass['Sensitivity']
f_measure <- 2 * ((precision * recall) / (precision + recall))
#OR
f_measure <-result$byClass['F1']
But it did not work. I need the precision, recall and F1 values, but I don't know how to compute them from the rpart result.
The function confusionMatrix takes two arguments (data and reference), which have to be factors with the same levels. That is not the case in your code because predict, when given an rpart object for a classification tree, returns by default a matrix of class-membership probabilities with one row per sample. You need to ask predict for predicted classes instead (type = "class") and make sure the result is a factor with the same levels as target (0 and 1). Note that type = "vector" returns integer class codes (1 and 2) for a classification tree, not the labels themselves.
This should do the trick:
ypred <- factor(predict(tree, testData[, -6], type = "class"),
                levels = levels(testData$target))
library(caret)
confusionMatrix(ypred, testData$target)
Using factor(..., levels = levels(testData$target)) ensures that the levels are in the same order in both factors, which avoids the following warning:
Warning message:
In confusionMatrix.default(ypred, testData$target) :
  Levels are not in the same order for reference and data. Refactoring data to match.
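To make the precision, recall and F1 values from the question concrete, here is a minimal base-R sketch that computes them directly from two factors with matching levels. The helper name prf and the example labels are made up for illustration; they are not part of rpart or caret, and the labels are not taken from the question's data.

```r
# Hypothetical helper (not from rpart or caret): precision, recall and F1
# for a single positive class. `pred` and `ref` are assumed to be factors
# with identical levels, as confusionMatrix also requires.
prf <- function(pred, ref, positive = "1") {
  stopifnot(is.factor(pred), is.factor(ref),
            identical(levels(pred), levels(ref)))
  tp <- sum(pred == positive & ref == positive)  # true positives
  fp <- sum(pred == positive & ref != positive)  # false positives
  fn <- sum(pred != positive & ref == positive)  # false negatives
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  f1        <- 2 * precision * recall / (precision + recall)
  c(precision = precision, recall = recall, F1 = f1)
}

# Made-up labels for illustration only
ref  <- factor(c(1, 0, 0, 1, 1, 0, 1, 0), levels = c(0, 1))
pred <- factor(c(1, 0, 1, 1, 0, 0, 1, 0), levels = c(0, 1))

scores <- prf(pred, ref, positive = "1")
print(round(scores, 3))
```

With real predictions you would pass ypred and testData$target instead of the made-up vectors. Keep in mind that caret's confusionMatrix treats the first factor level (0 here) as the positive class for two-class problems unless you pass positive = "1", so the byClass values in the question would otherwise describe class 0.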