I am trying to use decision tree classification on my dataset, which contains 2 features and 1 dependent variable. It looks like this:
Age Salary Purchased(Y/N)
26 43000 0
17 57000 0
19 76000 0
27 58000 0
27 84000 0
32 150000 1
25 33000 0
If I use
classifier = rpart(formula = Purchased ~ ., data = training_set)
and then predict on the test set, I get a result like
2 4 5 9
0.03296703 0.03296703 0.03296703 0.03296703
I don't want the probability; I want the most likely class. But when I use
y_pred = predict(classifier, newdata = test_set[-3], type = 'class')
I get
Error in predict.rpart(classifier, newdata = test_set[-3], type = "class") : Invalid prediction for "rpart" object
Can you help me with that?
Got the solution. I should have encoded the dependent variable as a factor:
dataset$Purchased = factor(dataset$Purchased, levels = c(0, 1))
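After adding this line everything works fine. With a numeric response, rpart fits a regression tree (method "anova"), and type = 'class' is only valid for a classification tree. With a factor response, rpart fits a classification tree (method "class").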
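For anyone who wants to reproduce this, here is a minimal sketch of the fix. It assumes the rpart package is installed, and it rebuilds the data frame from the seven rows shown in the question, using all of them for both fitting and prediction purely for illustration (the original training_set/test_set split is not shown in the post). The name reg_tree is made up for this example.

```r
library(rpart)

# Rows copied from the question; the train/test split is not reproduced here.
dataset <- data.frame(
  Age       = c(26, 17, 19, 27, 27, 32, 25),
  Salary    = c(43000, 57000, 76000, 58000, 84000, 150000, 33000),
  Purchased = c(0, 0, 0, 0, 0, 1, 0)
)

# Numeric response: rpart picks the regression method.
reg_tree <- rpart(formula = Purchased ~ ., data = dataset)
print(reg_tree$method)

# Factor response: rpart picks the classification method.
dataset$Purchased <- factor(dataset$Purchased, levels = c(0, 1))
classifier <- rpart(formula = Purchased ~ ., data = dataset)
print(classifier$method)

# type = "class" is now accepted and returns a factor of predicted labels.
y_pred <- predict(classifier, newdata = dataset[-3], type = "class")
print(y_pred)
```

Checking classifier$method is a quick way to see which kind of tree was fitted before calling predict. The factor conversion has to happen before the data is split into training and test sets (or be applied to training_set directly), otherwise the model is still trained on a numeric column.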