
How to use ggRandomForests dependence plots for caret model in R?


I want to create a faceted set of partial dependence plots for my model using the gg_variable function from the ggRandomForests package. I have the following code, but it does not work.

How can I do this?

library("caret")
library("ggRandomForests")
library("randomForest")
data("iris")
iris$Species <- NULL
control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
in_train <- createDataPartition(iris$Sepal.Length, p = 0.66, list = FALSE)
train_st <- iris[in_train, ]
test_st <- iris[-in_train, ]
trf_sep <- train(Sepal.Length ~ .,
                 data = train_st, ntree = 800, method = "rf",
                 metric = "Rsquared", trControl = control, importance = TRUE)
gg_variable(trf_sep)  # Here is the problem

Solution

  • gg_variable requires the output of a randomForest model. It does not work with the object returned by caret::train. In that situation, you can use caret's train function to tune mtry, refit the random forest with the randomForest package using the tuned mtry, and then apply gg_variable to that fit, like this:

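    library("caret")
    library("ggRandomForests")
    library("randomForest")
    data("iris")
    iris$Species <- NULL
    control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
    in_train <- createDataPartition(iris$Sepal.Length, p = 0.66, list = FALSE)
    train_st <- iris[in_train, ]
    test_st <- iris[-in_train, ]
    # Use caret only to tune mtry by cross-validation
    trf_sep <- train(Sepal.Length ~ .,
                     data = train_st, ntree = 800, method = "rf",
                     metric = "Rsquared", trControl = control, importance = TRUE)
    # Refit with randomForest directly, using the mtry value caret selected
    # (named rf_fit so the object does not mask base::try)
    rf_fit <- randomForest(Sepal.Length ~ .,
                           data = train_st, ntree = 800,
                           mtry = trf_sep$bestTune$mtry)
    # gg_variable is applied to the randomForest object, not the caret one
    gg_dta <- gg_variable(rf_fit)
    # One panel per predictor
    plot(gg_dta, xvar = c("Sepal.Width", "Petal.Length", "Petal.Width"),
         panel = TRUE)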
    

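To see why the original call fails, it can help to inspect what caret::train actually returns. The following is a minimal sketch rather than part of the original answer: the names dat, ctrl and fit, the seed, and the smaller ntree are assumptions chosen only to keep the example short and reproducible.

```r
library(caret)
library(randomForest)

data(iris)
dat <- iris[, names(iris) != "Species"]  # numeric columns only

set.seed(42)  # assumption: seed added only for reproducibility
ctrl <- trainControl(method = "cv", number = 5)
fit <- train(Sepal.Length ~ ., data = dat, method = "rf",
             ntree = 200, trControl = ctrl)

# The caret wrapper and the underlying forest are different classes
print(class(fit))             # includes "train"
print(class(fit$finalModel))  # includes "randomForest"

# mtry value selected by cross-validation
best_mtry <- fit$bestTune$mtry
print(best_mtry)
```

The object returned by train has class "train", which is why gg_variable does not accept it directly. The tuned value in fit$bestTune$mtry can be passed as mtry when refitting with randomForest(), instead of hard-coding a number.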