
Tidy eval for `by` in `dplyr::*_join`


I am writing a function that joins two datasets with one of the dplyr::*_join functions, where the by columns are passed in as unquoted parameters. I have seen quite a few solutions to this problem, but all seem to be dated and/or deprecated:

  • Use rlang::quo_text or purrr::map_chr in this answer and here, which is superseded in the docs here

  • Use dplyr::ensym in this answer which is listed as "no longer for normal usage" in the docs here

How would I accomplish this using the new tidy eval methods as found here, which emphasize the use of {{}}? It seems like tidy eval has been updated since these older answers were given, which is why I am asking again. Please let me know if I should simply stick with one of these old answers.

Here is a toy example to demonstrate my question:

library(dplyr)

data("iris")

iris2 <- iris %>%
  select(Species) %>%
  filter(Species != "versicolor") %>%
  mutate(id = case_when(Species == "virginica" ~ 1, TRUE ~ 2)) %>%
  distinct()

### This is my desired output
joined_iris <- iris %>% inner_join(iris2, by = c("Species" = "Species"))

### This is an example function where I attempt to use tidy eval
# -- it does not work
join_iris <- function(data1, data2, join_col1, join_col2) {
  data_out <- data1 %>%
    inner_join(
      data2,
      by = c({{ join_col1 }} = {{ join_col2 }})
    )

  data_out
}

join_iris(iris, iris2, Species, Species)

Solution

  • Use join_by(), introduced in dplyr 1.1.0, and embrace the column arguments with {{ }}:

    join_iris <- function(data1, data2, col1, col2) {
      inner_join(
        data1,
        data2,
        by = join_by({{ col1 }} == {{ col2 }})
      )
    }
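
To check that the function above reproduces the desired output from the question, here is a minimal, self-contained sketch. It assumes dplyr 1.1.0 or later is installed. The `kind` column name in the second call is made up for illustration, to show the case where the two key columns are named differently.

```r
library(dplyr)  # assumes dplyr >= 1.1.0, which provides join_by()

# Lookup table from the question: one row per remaining species, plus an id
iris2 <- iris %>%
  select(Species) %>%
  filter(Species != "versicolor") %>%
  mutate(id = case_when(Species == "virginica" ~ 1, TRUE ~ 2)) %>%
  distinct()

# Function from the answer: column names are passed unquoted and embraced
join_iris <- function(data1, data2, col1, col2) {
  inner_join(
    data1,
    data2,
    by = join_by({{ col1 }} == {{ col2 }})
  )
}

# Same key name on both sides
result <- join_iris(iris, iris2, Species, Species)

# Desired output from the question, written with a character vector for `by`
expected <- inner_join(iris, iris2, by = c("Species" = "Species"))
all.equal(result, expected)

# Different key names on each side (`kind` is a hypothetical column name)
iris3 <- rename(iris2, kind = Species)
result_renamed <- join_iris(iris, iris3, Species, kind)
```

Because the arguments are embraced inside join_by(), callers pass bare column names, as with other dplyr verbs, and no conversion to strings is needed.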