r machine-learning decision-tree

How to resolve a "replacement has 0 rows" error when adding a yes/no column to decision tree predictions in R


I built a decision tree model in R, and I want to add a new column to the predictions that contains yes when the predicted probability is greater than 50%.

N.B.: the target column in the dataset is boolean (1 = heart disease, 0 = normal).

library(rpart)
tree<-rpart(target ~ .,method ='class', data=train)
print(summary(tree))
tree.preds<-predict(tree,test)
print(head(tree.preds))

tree.preds<-as.data.frame(tree.preds)
joiner<-function(x){
  if(x>=0.5)
    return('yes')
    
  else
    return('no')
      
}
tree.preds$disease<-sapply(tree.preds$yes,joiner)

print(head(tree.preds))

This error appears when I run it:

Error in `$<-.data.frame`(`*tmp*`, t, value = list()) : 
  replacement has 0 rows, data has 91

Solution

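  • The error arises because tree.preds has no column named yes. With a 0/1 target, the probability columns are named after the class levels, so tree.preds$yes is NULL and sapply returns an empty list. Select the probability column by position and use the vectorised ifelse in place of iterating with sapply: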

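    library(rpart)
    
    # Example data: iris with a binary 0/1 target (1 = versicolor)
    set.seed(1)  # added here so the random split is reproducible
    dat = iris[,-5]
    dat$target = as.numeric(iris$Species=="versicolor")
    idx = sample(nrow(dat),100)
    train = dat[idx,]
    test = dat[-idx,]
    
    tree = rpart(target ~ .,method ='class', data=train)
    # One probability column per class; column 2 is the probability of class 1
    tree.preds = data.frame(predict(tree,test))
    # Vectorised threshold: no loop or sapply needed
    tree.preds$Species = ifelse(tree.preds[,2]>0.5,"yes","no")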
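
The same idea can be applied to the question's own disease column. The following is a minimal sketch, not the asker's exact setup: it assumes the iris data as a stand-in for the heart disease dataset (the real data is not shown), and the names probs and the seed value are illustrative choices. It first checks the column names that predict returns, then selects the class-1 probability by name.

```r
library(rpart)

# Stand-in data (an assumption): iris with a 0/1 target instead of the heart disease set
set.seed(42)  # arbitrary seed, only for a reproducible split
dat <- iris[, -5]
dat$target <- as.numeric(iris$Species == "versicolor")
idx <- sample(nrow(dat), 100)
train <- dat[idx, ]
test <- dat[-idx, ]

tree <- rpart(target ~ ., method = "class", data = train)

# For a classification tree, predict() returns one probability column per class,
# named after the class levels (here "0" and "1")
probs <- predict(tree, test, type = "prob")
print(colnames(probs))

tree.preds <- as.data.frame(probs)

# There is no column called "yes", so this lookup yields NULL,
# which is what made the original assignment fail
print(is.null(tree.preds$yes))

# Select the probability of class 1 by name and threshold it in one vectorised step
tree.preds$disease <- ifelse(tree.preds[["1"]] >= 0.5, "yes", "no")
print(head(tree.preds))
```

If only the predicted label is needed rather than a custom threshold, predict(tree, test, type = "class") returns the predicted class directly, which may be simpler.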