
How to run a regression when the formula is given as a string?


Consider the following data:

set.seed(42)
y <- runif(100)
df <- data.frame("Exp" = rexp(100), "Norm" = rnorm(100), "Wei" = rweibull(100, 1))

I want to fit a linear regression, but the right-hand side of the formula is given as a string:

form <- "Exp + Norm + Wei"

I thought I only had to use:

as.formula(lm(y~form, data = df))

However, it doesn't work. The error complains that the variable lengths differ (it seems form is still treated as a character vector of length 1, but I have no idea why).

Do you know how I can do this?


Solution

  • We can use paste to build the full formula as a string and pass it directly to lm, which coerces it to a formula

    lm(paste('y ~', form), data = df)
    

    -output

    #Call:
    #lm(formula = paste("y ~", form), data = df)
    
    #Coefficients:
    #(Intercept)          Exp         Norm          Wei  
    #   0.495861     0.026988     0.046689     0.003612
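For context, `y ~ form` fails because the formula refers to a variable literally named `form`, which is a character vector of length 1, rather than to the terms stored inside it. If an explicit formula object is preferred over passing a string, the following is a minimal sketch of two alternatives using base R's `as.formula` and `reformulate`. The names `f1`, `f2`, `fit1`, `fit2` and `term_labels` are illustrative only and do not appear in the original post; the split on `" + "` assumes the string is formatted exactly as in the question.

```r
# Same data as in the question
set.seed(42)
y <- runif(100)
df <- data.frame("Exp" = rexp(100), "Norm" = rnorm(100), "Wei" = rweibull(100, 1))
form <- "Exp + Norm + Wei"

# Option 1: convert the pasted string to a formula object explicitly
f1 <- as.formula(paste("y ~", form))
fit1 <- lm(f1, data = df)

# Option 2: split the string into term labels and let reformulate()
# assemble the formula (term_labels is a hypothetical helper variable)
term_labels <- strsplit(form, " + ", fixed = TRUE)[[1]]
f2 <- reformulate(term_labels, response = "y")
fit2 <- lm(f2, data = df)

# Both routes describe the same model, so the coefficients should agree
all.equal(coef(fit1), coef(fit2))
```

With an explicit formula object, the printed call shows the formula variable instead of the `paste(...)` expression, and `reformulate` is convenient when the predictor names are already available as a character vector, for example `names(df)`.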