
Setting ntree and mtry explicitly in Random Forest with caret


I am trying to explicitly pass the number of trees and mtry into the Random Forest algorithm with caret:

library(caret)
library(randomForest)
repGrid<-expand.grid(.mtry=c(4),.ntree=c(350))
controlRep <- trainControl(method="cv",number = 5)

rfClassifierRep <- train(label~ .,
                      data=overallDataset,
                      method="rf",
                      metric="Accuracy",
                      trControl=controlRep,
                      tuneGrid = repGrid)

I get this error:

Error: The tuning parameter grid should have columns mtry

I had first tried what seemed the more sensible way, passing both values directly to train:

rfClassifierRep <- train(label~ .,
                      data=overallDataset,
                      method="rf",
                      metric="Accuracy",
                      trControl=controlRep,
                      ntree=350,
                      mtry=4,
                      tuneGrid = repGrid)

But that resulted in an error stating that I had too many hyperparameters, which is why I then tried putting both values in a single-row grid.


Solution

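  • For method="rf", tuneGrid accepts only mtry (see the catalog of tuning parameters per model in the caret documentation). ntree is not a tuning parameter there, so it cannot be a column of the grid; you can only pass it directly to train, which forwards it to randomForest. Conversely, since mtry is supplied through tuneGrid, it must not also be passed as an argument to train.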

    All in all, the correct combination here is:

    repGrid <- expand.grid(.mtry=c(4))  # no ntree
    
    rfClassifierRep <- train(label~ .,
                          data=overallDataset,
                          method="rf",
                          metric="Accuracy",
                          trControl=controlRep,
                          ntree=350, 
                          # no mtry
                          tuneGrid = repGrid)
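
    For anyone who wants to try this without the original data, here is a minimal sketch of the same pattern. It assumes the built-in iris data as a stand-in for overallDataset and Species as a stand-in for label; mtry = 2 and the seed are arbitrary illustrative choices, not values from the question. It also calls modelLookup to show which parameters caret treats as tunable for "rf".

```r
library(caret)         # train(), trainControl(), modelLookup()
library(randomForest)  # backend used by method = "rf"

# List the parameters caret tunes for method = "rf"; mtry is the only one.
print(modelLookup("rf"))

set.seed(42)  # arbitrary seed so the CV folds and the forest are reproducible

# Assumption: iris replaces overallDataset, Species replaces label.
# iris has 4 predictors, so mtry must not exceed 4.
repGrid <- expand.grid(.mtry = c(2))               # grid holds mtry only
controlRep <- trainControl(method = "cv", number = 5)

rfClassifierRep <- train(Species ~ .,
                         data = iris,
                         method = "rf",
                         metric = "Accuracy",
                         trControl = controlRep,
                         ntree = 350,               # forwarded to randomForest()
                         tuneGrid = repGrid)

print(rfClassifierRep$bestTune)          # the mtry value taken from the grid
print(rfClassifierRep$finalModel$ntree)  # number of trees in the final forest
```

    Checking finalModel$ntree is a quick way to confirm that the value passed to train actually reached randomForest.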