r, for-loop, default, random-forest

Default value when calling a function in a for loop


Suppose I want to call a function (e.g. randomForest) inside a for loop, passing a different value of one argument on each iteration:

for (i in c(100, 200, 500)) {
  randomForest(Predictor ~ ., data = train, ntree = i)
}

One of the values I want to pass to randomForest is the argument's default value (suppose I don't know that the default for ntree in randomForest is 500).

How can I specify that in the for loop?

for (i in c(100, 200, default)) {
  randomForest(Predictor ~ ., data = train, ntree = i)
}

Solution

  • You could look up the value with formals(), which returns a list of a function's arguments together with their default values. That approach has its own problems, though, because not all functions handle defaults the same way.

    The first problem shows up in your example: formals(randomForest) only gives you x and ..., both without defaults. That is because randomForest is an S3 generic function: it dispatches to a method chosen by the class of the first argument, and each method has its own arguments. To get the default for ntree, you need to query the default method:

    formals(randomForest:::randomForest.default)$ntree
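
    The same generic-versus-method behaviour can be reproduced without installing anything. The sketch below uses base R's mean, chosen here only as a stand-in for randomForest, to show that the generic exposes just x and ... while the defaults live on the default method:

```r
# Illustration with base R's mean(), used as a stand-in for randomForest.
# The generic itself only declares `x` and `...`.
generic_args <- names(formals(mean))
print(generic_args)

# The defaults are stored on the method that the generic dispatches to.
# getS3method() retrieves a method even when it is not exported.
default_method <- getS3method("mean", "default")
method_defaults <- formals(default_method)

print(method_defaults$trim)   # default value of `trim`
print(method_defaults$na.rm)  # default value of `na.rm`
```

    For randomForest the analogous lookup would be getS3method("randomForest", "default"), assuming the package is installed.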
    

    Some more problems I can think of:

    • It may not even be clear what a missing or default value is. Ever seen the difference between somedataframe[1] and somedataframe[1,] or somedataframe[,1]? What is the default?
    • What about optional arguments that are assigned a value inside the function? Take a plot, for example: if you don't specify a title, it generates its own. But what is the "default" title?
    • For some functions it matters where an argument comes from and where it is evaluated. This is particularly relevant when dealing with environments: a call captured with match.call() and the same call with all arguments filled in from the defaults can give different results.
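
    Two of these cases can be made concrete with a small sketch. Both functions below (take_first and describe) are hypothetical, written only to show a default that is an unevaluated expression and a fallback that is chosen inside the function body via missing():

```r
# Hypothetical function: the default of `n` is an expression that depends
# on another argument, so formals() returns a call, not a number.
take_first <- function(x, n = ceiling(length(x) / 2)) {
  x[seq_len(n)]
}

default_n <- formals(take_first)$n
print(is.call(default_n))  # the stored default is unevaluated code
print(take_first(1:10))    # the default is only evaluated inside the call

# Hypothetical function: `title` has no default in the signature;
# the fallback is decided in the body, so formals() cannot reveal it.
describe <- function(x, title) {
  if (missing(title)) {
    title <- paste("Summary of", length(x), "values")
  }
  title
}

print("title" %in% names(formals(describe)))  # the argument exists ...
print(describe(1:3))                          # ... its fallback is computed at run time
```

    In both cases, copying the result of formals() into the loop would not reproduce what the function does when the argument is simply left out.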

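    So, all things considered, I think you're better off either placing the call that uses the default outside the loop, or using an if ... else inside the loop so that the argument is left out when you want the default.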
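
    A minimal sketch of the if ... else approach follows. It assumes NULL is used as a marker for "use the default", and fit_model is a hypothetical stand-in for randomForest that only reports which ntree it received, so the example runs without the package:

```r
# Hypothetical stand-in for the model function; it just returns ntree.
fit_model <- function(data, ntree = 500) {
  ntree
}

train <- data.frame(a = 1:3)  # placeholder data, for illustration only

# NULL marks "let the function use its own default" (an assumed convention).
settings <- list(100, 200, NULL)
results <- vector("list", length(settings))

for (k in seq_along(settings)) {
  nt <- settings[[k]]
  if (is.null(nt)) {
    results[[k]] <- fit_model(train)              # argument left out
  } else {
    results[[k]] <- fit_model(train, ntree = nt)  # argument supplied
  }
}
print(unlist(results))

# Variant without duplicating the call: build the argument list,
# add ntree only when a value was given, then use do.call().
results2 <- lapply(settings, function(nt) {
  call_args <- list(data = train)
  if (!is.null(nt)) call_args$ntree <- nt
  do.call(fit_model, call_args)
})
print(unlist(results2))
```

    Because the argument is omitted rather than looked up, this does not depend on how or where the function defines its default.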