
R: eval(predvars, data, env) "object not found" when passing a variable name as a function parameter


My reproducible example is below.

Please ignore the meaning of the calculations (there is none); the data is just an extract of my real dataset.

train <- structure(list(no2 = c(25.5, 31.2, 33.4, 29.9, 31.8),
                        vv_scal = c(1.3, 1.3, 0.8, 1.1, 0.9), 
                        temp = c(-0.7, -2, 1.5, 0.4, 1.1), 
                        prec = c(0, 11, 9, 3, 0), 
                        co = c(1.6, 2.9, 3.2, 2.6, 3)), 
                        row.names = c(NA, -5L), 
                        class = c("tbl_df", "tbl", "data.frame"))


test <- structure(list(no2 = c(41.6, 41.4, 46.6, 44.7, 43.2), 
                       vv_scal = c(1.2, 1.2, 1.2, 1, 1), 
                       temp = c(0.9, 1, 0.1, 1.6, 3.8), 
                       prec = c(0, 0, 0, 0, 0), 
                       co = c(4.3, 4.3, 4.9, 4.7, 4.5)), 
                       row.names = c(NA, -5L), 
                       class = c("tbl_df", "tbl", "data.frame"))
                       
                       

forest_ci <- function(B, train_df, test_df, var_rf){
  
  # Initialize a matrix to store the predicted values
  predictions <- matrix(nrow = B, ncol = nrow(test_df))
  
  # bootstrapping predictions
  for (b in 1:B) {
    
    # Fit a random forest model
    model <- randomForest::randomForest(var_rf~., data = train_df) # not working
    #model <- randomForest::randomForest(no2~., data = train_df)   # working
    
    # Store the predicted values from the resampled model
    predictions[b, ] <- predict(model, newdata = test_df)
    
  }
  
  predictions
  
}

predictions <- forest_ci(B=2, train_df=train, test_df=test, var_rf = no2)

I get the following error message:

Error in eval(predvars, data, env) : object 'no2' not found

I think the error has something to do with "non-standard evaluation" and "capturing expressions", as described here:

http://adv-r.had.co.nz/Computing-on-the-language.html

I followed the suggestions in several threads, including:

how do I pass a variable name to an argument in a function

Passing a variable name to a function in R

I have tried different combinations of substitute(), eval() and quote(), without much success.

I know the subject has already been covered here, but I have not found a working solution so far.

My objective is to pass the name of a variable as a function argument, so that it is used as the response in the regression (and prediction) fitted by the random forest model.

Thanks


Solution

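  • Try using ensym() and inject() from rlang. ensym() captures the unquoted argument as a symbol, and inject() splices that symbol into the formula with !! before the call is evaluated: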

    forest_ci <- function(B, train_df, test_df, var_rf){
      
      # Capture the unquoted column name (e.g. no2) as a symbol
      y <- rlang::ensym(var_rf)
      
      # Initialize a matrix to store the predicted values
      predictions <- matrix(nrow = B, ncol = nrow(test_df))
      
      # bootstrapping predictions
      for (b in 1:B) {
        
        # Fit a random forest model; !!y is replaced by the captured symbol,
        # so with var_rf = no2 the formula becomes no2 ~ .
        model <- rlang::inject(randomForest::randomForest(!!y ~ ., data = train_df))
        
        # Store the predicted values from the resampled model
        predictions[b, ] <- predict(model, newdata = test_df)
        
      }
      
      predictions
      
    }
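
    If you would rather not depend on rlang, a base R variant of the same idea may also work: capture the argument name with substitute() and build the formula with reformulate(). The following is only a minimal sketch under assumptions. The helper name response_formula is hypothetical, and lm() stands in for randomForest::randomForest() so that the example runs without extra packages. The data is a subset of the columns from the question.

    ```r
    # Subset of the training data from the question
    train <- data.frame(no2     = c(25.5, 31.2, 33.4, 29.9, 31.8),
                        vv_scal = c(1.3, 1.3, 0.8, 1.1, 0.9),
                        temp    = c(-0.7, -2, 1.5, 0.4, 1.1))

    # Hypothetical helper: turn an unquoted column name into `<name> ~ .`
    response_formula <- function(var_rf) {
      # substitute() returns the expression the caller typed, without evaluating it
      response <- deparse(substitute(var_rf))
      # reformulate() assembles a formula from character strings
      reformulate(".", response = response)
    }

    f <- response_formula(no2)
    print(f)  # no2 ~ .

    # lm() is used here only as a stand-in for the model-fitting call
    fit <- lm(f, data = train)
    print(names(coef(fit)))
    ```

    Inside forest_ci() the capture would have to happen in forest_ci() itself, because substitute() only sees the expression passed to the function it is called in. The resulting formula object could then presumably be passed as randomForest::randomForest(f, data = train_df).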