
How to use RWeka package on a dataset?


So I generated a random dataset online and I need to apply the C4.5 algorithm on it.
I installed the RWeka package and all its dependencies but I do not know how to execute it.
Can somebody point me to some tutorials, anything apart from the RWeka documentation? Or share some sample C4.5 code in R so I can see how it works?
Thank you


Solution

  • I think it would be worth your time to check out the caret package. It standardizes the syntax for most machine learning packages in R, including RWeka.

    It also has a ton of really useful helper functions and a great tutorial on its website.

    Here's the syntax for predicting Species on the iris dataset using caret's 'J48' method, which calls RWeka's J48 (Weka's implementation of C4.5):

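    library(caret)   # method='J48' also needs the RWeka package installed

    # Stratified split on Species; with the default p = 0.5, half the rows go to training
    train_rows <- createDataPartition(iris$Species, list=FALSE)
    train_set <- iris[train_rows, ]
    test_set <- iris[-train_rows, ]

    # Fit the J48 (C4.5-like) tree, letting caret tune it by resampling
    fit.rweka <- train(Species ~ ., data=train_set, method='J48')

    # Predicted classes for the held-out rows
    pred <- predict(fit.rweka, newdata=test_set)
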

    Then, if you want to try a gradient boosting machine or some other algorithm, just change the method argument, e.g. method='gbm' (which needs the gbm package installed).
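
Since the question asks about running RWeka itself, here is a minimal sketch of calling its J48 function directly, without caret. It assumes RWeka and a working Java installation are already set up. The 70/30 split, the seed, and the control values (C = 0.25, M = 2, which I believe match Weka's defaults for J48) are illustrative choices, not something from the question.

```r
# Assumes RWeka (and the Java runtime it depends on) is installed.
library(RWeka)

set.seed(42)  # arbitrary seed so the split is reproducible

# Plain base-R split: about 70% of the rows for training (ratio is an arbitrary choice)
train_idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]

# J48 is Weka's implementation of C4.5.
# C = pruning confidence threshold, M = minimum number of instances per leaf.
fit <- J48(Species ~ ., data = train_set,
           control = Weka_control(C = 0.25, M = 2))

print(fit)  # text rendering of the fitted tree

# Predict classes for the held-out rows and compare with the true labels
pred <- predict(fit, newdata = test_set)
conf_mat <- table(predicted = pred, actual = test_set$Species)
accuracy <- mean(pred == test_set$Species)

print(conf_mat)
print(accuracy)
```

To run this on your own dataset, swap iris for your data frame and Species for your class column; the class column should be a factor. RWeka also provides evaluate_Weka_classifier() if you would rather get cross-validated statistics than a single hold-out estimate.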