How to write a function with tidy eval when using the "arrow" R package (arrow::open_dataset) and dplyr verbs?

What I'm trying to do

I'm trying to write a function that uses dplyr verbs and takes an arrow dataset (opened with arrow::open_dataset) as its first argument and a column of that dataset as its second. I need to pass the column name as a string, because the real use case is a Shiny app, so I'm using the .data[[.column]] syntax. The code below reproduces the error I'm getting. Any help or insight is appreciated.

Code to reproduce error

# install.packages(c("dplyr", "ggplot2", "arrow"))
library(dplyr)

arrow::write_parquet(x = ggplot2::mpg, sink = "sample_data.parquet")

dat <- arrow::open_dataset("sample_data.parquet")


get_metric <- function(.data, .metric) {
  .data %>%
    group_by(manufacturer, cyl) %>% 
    summarise(
      # .metric is a string; this is the line that fails on an arrow dataset
      new_col = sum(.data[[.metric]], na.rm = TRUE)
    ) %>% 
    ungroup()
}

get_metric(dat, "cty") %>% collect()

Additional code that works but doesn't use arrow as much so not ideal for speed

In this version I call collect() before the tidy eval step, so everything after it is essentially regular dplyr code on an in-memory tibble. It runs, but it is slower than the equivalent arrow code I had working before I extracted it into a function.

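get_metric2 <- function(.data, .metric) {
  .data %>%
    collect() %>%  # pull the data into memory first, so arrow no longer does the aggregation
    group_by(manufacturer, cyl) %>% 
    summarise(
      new_col = sum(.data[[.metric]], na.rm = TRUE)
    ) %>% 
    ungroup()
}

get_metric2(dat, "cty")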


  • Convert the string to a symbol with rlang::sym() and inject it with the !! operator.

    library(dplyr)
    arrow::write_parquet(x = ggplot2::mpg, sink = "sample_data.parquet")
    dat <- arrow::open_dataset("sample_data.parquet")
    get_metric <- function(.data, .metric) {
      .metric <- rlang::sym(.metric)  # string -> symbol
      .data %>%
        group_by(manufacturer, cyl) %>% 
        summarise(
          new_col = sum(!!.metric, na.rm = TRUE)  # inject the bare column name
        ) %>% 
        ungroup()
    }
    get_metric(dat, "cty") %>%
      collect()
    # # A tibble: 32 × 3
    #    manufacturer   cyl new_col
    #    <chr>        <int>   <int>
    #  1 audi             4     153
    #  2 audi             6     148
    #  3 audi             8      16
    #  4 chevrolet        8     191
    #  5 chevrolet        4      41
    #  6 chevrolet        6      53
    #  7 dodge            4      18
    #  8 dodge            6     225
    #  9 dodge            8     243
    # 10 ford             8     197
    # # ℹ 22 more rows
    # # ℹ Use `print(n = ...)` to see more rows
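
To see what sym() and !! are doing, here is a minimal sketch that needs only rlang and the built-in mtcars data frame, no arrow. The names metric, metric_sym and injected are just for illustration. It suggests that after injection the expression contains an ordinary bare column name, which is presumably why this form goes through where .data[[.metric]] did not.

```r
library(rlang)

metric <- "cyl"            # column name arriving as a string, e.g. from a Shiny input
metric_sym <- sym(metric)  # string -> symbol `cyl`

# expr() captures the call after !! has replaced the placeholder
# with the bare column name
injected <- expr(sum(!!metric_sym, na.rm = TRUE))
print(injected)
#> sum(cyl, na.rm = TRUE)

# Evaluating the injected expression against a data frame matches
# what you get by typing the column name directly
total <- eval_tidy(injected, data = mtcars)
stopifnot(total == sum(mtcars$cyl, na.rm = TRUE))
```

Inside summarise() the same injection happens before the backend translates the call, so the verb only ever sees sum(cty, na.rm = TRUE).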