I built a decision tree model in R, and I want to add a new column to the predictions that contains yes when the predicted probability is greater than 50%.
N.B.: the target column in the dataset is boolean: 1 = heart disease and 0 = normal.
library(rpart)
tree<-rpart(target ~ .,method ='class', data=train)
print(summary(tree))
tree.preds<-predict(tree,test)
print(head(tree.preds))
tree.preds<-as.data.frame(tree.preds)
joiner<-function(x){
  if(x>=0.5)
    return('yes')
  else
    return('no')
}
tree.preds$disease<-sapply(tree.preds$yes,joiner)
print(head(tree.preds))
This error appears when I run it:
Error in `$<-.data.frame`(`*tmp*`, t, value = list()) :
replacement has 0 rows, data has 91
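The error occurs because tree.preds has no column named yes: the probability columns are named after the class levels (0 and 1), so tree.preds$yes is NULL and sapply returns an empty list. Select the column by position or by its actual name, and use ifelse instead of iterating with sapply: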
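library(rpart)
# Example data: iris with a 0/1 target (1 = versicolor)
dat = iris[,-5]
dat$target = as.numeric(iris$Species=="versicolor")
# Random train/test split (100 rows for training)
idx = sample(nrow(dat),100)
train = dat[idx,]
test = dat[-idx,]
tree = rpart(target ~ .,method ='class', data=train)
# predict() returns one probability column per class; column 2 is class 1
tree.preds = data.frame(predict(tree,test))
# Vectorized: label every row at once
tree.preds$Species = ifelse(tree.preds[,2]>0.5,"yes","no")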
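
Applied to the setup in the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: the heart disease data is not shown, so iris is reused as stand-in data; the seed is arbitrary; and the positive-class probability column is assumed to be named "1", since rpart names the columns after the levels of target. The disease column name and the >= 0.5 threshold are taken from the question.

```r
library(rpart)

# Stand-in data (assumption): iris with a 0/1 target in place of heart disease
set.seed(1)  # arbitrary seed, only to make the split reproducible
dat <- iris[, -5]
dat$target <- as.numeric(iris$Species == "versicolor")
idx <- sample(nrow(dat), 100)
train <- dat[idx, ]
test <- dat[-idx, ]

tree <- rpart(target ~ ., method = "class", data = train)

# type = "prob" returns a matrix with one column per class level
probs <- predict(tree, test, type = "prob")
print(colnames(probs))  # check the real column names before indexing

# Pick the positive class by name instead of by position
tree.preds <- as.data.frame(probs)
tree.preds$disease <- ifelse(probs[, "1"] >= 0.5, "yes", "no")
print(head(tree.preds))
```

Printing the column names first is a cheap check: asking a data frame for a column that does not exist with $ silently gives NULL, which is what led to the "replacement has 0 rows" error.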