
Passing data-variables to R formulas


Let's say I'd like to write anscombe %>% lm_tidy("x1", "y1") (actually, I'd like to write anscombe %>% lm_tidy(x1, y1), where x1 and y1 are columns of the data frame). Since the following function seems to work:

plot_gg <- function(df, x, y) {
  x <- enquo(x)
  y <- enquo(y)
  ggplot(df, aes(x = !!x, y = !!y)) + geom_point() +
    geom_smooth(formula = y ~ x, method="lm", se = FALSE)
}

I started writing the following function:

lm_tidy_1 <- function(df, x, y) {
  x <- enquo(x)
  y <- enquo(y)
  fm <- y ~ x            ##### I tried many stuff here!
  lm(fm, data=df)
}
## Error in model.frame.default(formula = fm, data = df, drop.unused.levels = TRUE) : 
##   object is not a matrix

One comment on the question "passing in column name as argument" states that the embrace operator {{...}} is shorthand for the quote-unquote pattern. Unfortunately, the two forms gave different error messages:

lm_tidy_2 <- function(df, x, y) {
  fm <- !!enquo(y) ~ !!enquo(x) # alternative: {{y}} ~ {{x}} with different errors!!
  lm(fm, data=df)
}
## Error:
## ! Quosures can only be unquoted within a quasiquotation context.

This seems to work (based on @jubas's answer, but we're stuck with string handling and paste):

lm_tidy_str <- function(df, x, y) {
  fm <- formula(paste({{y}}, "~", {{x}}))
  lm(fm, data=df)
}

Yet again, {{y}} != !!enquo(y). But it gets worse: the following function fails with the same quosure error as before:

lm_tidy_str_1 <- function(df, x, y) {
  x <- enquo(x)
  y <- enquo(y)
  fm <- formula(paste(!!y, "~", !!x))
  lm(fm, data=df)
}

  1. Is {{y}} != !!enquo(y)?
  2. How can I pass data-variables to lm?

EDIT: Sorry, there were leftovers from my many trials. I want to pass the data-variables (say x1 and y1) directly to the function that will use them as formula components (such as lm), not their string versions ("x1" and "y1"). I try to avoid strings as much as possible, and bare names are more streamlined from the user's perspective.


Solution

  • Wrap the formula in expr(), which lets you unquote the symbols with !!, then pass the result to lm().

    library(dplyr)
    lm_tidy <- function(df, x, y) {
      x <- sym(x)
      y <- sym(y)
      fm <- expr(!!y ~ !!x)
      lm(fm, data = df)
    }
    

    This function is equivalent:

    lm_tidy <- function(df, x, y) {
      fm <- expr(!!sym(y) ~ !!sym(x))
      lm(fm, data = df)
    }
    

    Then

    lm_tidy(mtcars, "cyl", "mpg")
    

    gives

    Call:
    lm(formula = fm, data = df)
    
    Coefficients:
    (Intercept)          cyl  
         37.885       -2.876  
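
The quosure error in the question arises because !! is only interpreted inside a quasiquotation function such as expr(). Likewise, {{ }} is only special inside functions that support tidy evaluation; elsewhere it is parsed as two nested pairs of ordinary braces. A minimal sketch of both points, assuming rlang is attached (the variable names are illustrative only):

```r
library(rlang)

# Outside a quasiquotation context, `{{ y }}` is just two nested `{` calls,
# so it simply returns the value of `y`.
y <- "y1"
identical({{ y }}, "y1")

# Inside expr(), `!!` splices each symbol into the formula call.
fm <- expr(!!sym("mpg") ~ !!sym("cyl"))
class(fm)             # an unevaluated call to `~`, not yet a formula

# Evaluating the call `mpg ~ cyl` produces a real formula object.
fm_eval <- eval(fm)
inherits(fm_eval, "formula")
```

This is why paste({{y}}, "~", {{x}}) works with strings in the question: the braces do nothing there, and paste() receives the plain character values.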
    

    EDIT per comment below:

    library(rlang)
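    lm_tidy_quo <- function(df, x, y){
        # Capture the bare column names as quosures
        y <- enquo(y)
        x <- enquo(x)
        # Convert the quosures to text and build the formula explicitly
        fm <- as.formula(paste(quo_text(y), "~", quo_text(x)))
        lm(fm, data = df)
    }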
    

    You can then pass symbols as arguments:

    lm_tidy_quo(mtcars, cyl, mpg)
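
If the goal is to avoid strings altogether, one possible variant captures the arguments with ensym() and assembles the formula with rlang's new_formula(), so no text is pasted. This is a sketch rather than part of the original answer; the name lm_tidy_sym is hypothetical, and it assumes each argument is a single column name (ensym() rejects more complex expressions such as log(x1)).

```r
library(rlang)

# Hypothetical helper, not from the original answer.
lm_tidy_sym <- function(df, x, y) {
  # ensym() captures a bare name (or a string) as a symbol
  x <- ensym(x)
  y <- ensym(y)
  # Build the formula y ~ x directly from the two symbols
  fm <- new_formula(lhs = y, rhs = x)
  lm(fm, data = df)
}

# Bare column names, as requested in the question
fit <- lm_tidy_sym(anscombe, x1, y1)
coef(fit)

# Strings are accepted too, because ensym() converts them to symbols
fit_str <- lm_tidy_sym(anscombe, "x1", "y1")
```

Because the formula is a real formula object here, the printed call still shows formula = fm; only the fitted terms carry the actual column names.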