r, linear-regression

How to dynamically reference a dataset by name in a function call for linear regression


Let's say I have a function like this:

data("mtcars")
ncol(mtcars)

test <- function(string){
  fit <- lm(mpg ~ cyl, data = string)
  return(fit)
}

I'd like the value passed as "string" to be resolved to the dataset of that name and used as the data for the linear regression, like so:

test("mtcars")

However, I get an error:

Error in eval(predvars, data, env) : invalid 'envir' argument of type 'character'

I've tried using combinations of eval and parse, but to no avail. Any ideas?


Solution

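  • lm() expects its data argument to be a data frame (or a list or environment), not a character string, which is why the call fails. You can use get() to look up an object by its name and pass the result as data.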

    test <- function(string){
      fit <- lm(mpg ~ cyl, data = get(string))
      return(fit)
    }
    
    test("mtcars")
    
    # Call:
    # lm(formula = mpg ~ cyl, data = get(string))
    # 
    # Coefficients:
    # (Intercept)          cyl  
    #      37.885       -2.876 
    

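    You can add one more line to make the printed output cleaner: overwrite the data element of the stored call with the dataset's name. Notice how the Call part of the output changes from data = get(string) to data = mtcars. This also helps functions such as update() that re-evaluate the stored call, because string no longer exists once test() has returned.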

    test <- function(string){
      fit <- lm(mpg ~ cyl, data = get(string))
      fit$call$data <- as.name(string)
      return(fit)
    }
    
    test("mtcars")
    
    # Call:
    # lm(formula = mpg ~ cyl, data = mtcars)
    # 
    # Coefficients:
    # (Intercept)          cyl  
    #      37.885       -2.876
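
A more defensive variant of the same idea could be sketched as follows. Note that fit_by_name() and its envir argument are hypothetical names, not part of the original answer, and the sketch assumes the dataset is visible from the calling environment and has mpg and cyl columns. It checks that the name resolves to a data frame, then uses bquote() to splice the dataset's symbol into the lm() call before evaluating it, so the stored call refers to the dataset directly and fit$call does not need to be patched afterwards.

```r
# Hypothetical helper: fit_by_name() is an illustrative name,
# not something defined in the original answer.
fit_by_name <- function(name, envir = parent.frame()) {
  # Fail early if the argument is not a single character string.
  stopifnot(is.character(name), length(name) == 1L)

  # Look the object up by name, starting from the caller's environment.
  if (!exists(name, envir = envir)) {
    stop("no object named '", name, "' found")
  }
  dat <- get(name, envir = envir)
  if (!is.data.frame(dat)) {
    stop("'", name, "' is not a data frame")
  }

  # Splice the dataset's symbol into the call, then evaluate the call
  # in the caller's environment. The stored call then names the dataset
  # itself rather than get(string).
  eval(bquote(lm(mpg ~ cyl, data = .(as.name(name)))), envir)
}

fit <- fit_by_name("mtcars")
fit$call
```

With this approach fit$call should show data = mtcars, and a misspelled or non-data-frame name produces a readable error instead of the 'envir' message from the question.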