J48 tree in R - train and test classification


I want to train and test a J48 decision tree in R. Here is my code:

library("RWeka")

data <- read.csv("try.csv")
resultJ48 <- J48(classificationTry~., data)

summary(resultJ48)

However, I want to split my data into 70% train and 30% test. How can I do that with the J48 algorithm?

Many thanks!


Solution

  • Use the sample.split() function from the caTools package. It is more lightweight than the caret package (which is a meta package, if I remember correctly):

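    library(caTools)
    library(RWeka)

    data <- read.csv("try.csv")
    # J48 needs a factor (nominal) class, so convert it once up front
    data$classAttribute <- as.factor(data$classAttribute)

    set.seed(123)  # make the random split reproducible
    # Passing the class column keeps the class proportions similar in both parts
    spl <- sample.split(data$classAttribute, SplitRatio = 0.7)

    dataTrain <- subset(data, spl == TRUE)
    dataTest <- subset(data, spl == FALSE)

    resultJ48 <- J48(classAttribute ~ ., data = dataTrain)
    dataTest.pred <- predict(resultJ48, newdata = dataTest)
    # Confusion matrix: rows are actual classes, columns are predictions
    table(dataTest$classAttribute, dataTest.pred)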
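
If you would rather not add caTools, the same 70/30 split can be sketched in base R with sample(), and the confusion table can be reduced to a single accuracy figure. This is only an illustration under assumptions: the built-in iris data set stands in for try.csv, Species stands in for the class column, and the seed value is arbitrary. Note that a plain sample() does not balance the classes between the two parts the way sample.split() does when it is given the class column.

```r
library(RWeka)

set.seed(42)   # arbitrary seed, only so the split is reproducible
data(iris)     # stand-in for try.csv; Species is already a factor

n <- nrow(iris)
# Draw 70% of the row indices at random for training
trainIdx <- sample(n, size = round(0.7 * n))

irisTrain <- iris[trainIdx, ]    # 70% of the rows
irisTest  <- iris[-trainIdx, ]   # the remaining 30%

# Fit on the training part only, then predict the held-out part
model <- J48(Species ~ ., data = irisTrain)
pred  <- predict(model, newdata = irisTest)

# Rows are actual classes, columns are predicted classes
confusion <- table(actual = irisTest$Species, predicted = pred)

# Accuracy = correctly classified test rows / all test rows
accuracy <- sum(diag(confusion)) / sum(confusion)

print(confusion)
print(accuracy)
```

The accuracy you get depends on the random split, so expect it to change with a different seed.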