
Joining two data sets using as_label / as_name instead of quo_name within dplyr's left_join


Following on from the answer to "join datasets using a quosure as the by argument", which suggests using quo_name to join tables using quosures, I would like to arrive at an identical result using as_name / as_label, because quo_name is currently in the questioning lifecycle stage:

These functions are in the questioning life cycle stage.

as_label() and as_name() should be used instead of quo_name(). as_label() transforms any R object to a string but should only be used to create a default name. Labelisation is not a well defined operation and no assumption should be made about the label. On the other hand, as_name() only works with (possibly quosured) symbols, but is a well defined and deterministic operation.
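To make the distinction in the quoted documentation concrete, here is a minimal sketch. The expressions are made up for illustration, and the exact text returned by as_label() should not be relied upon:

```r
library(rlang)

# as_name() accepts a symbol or a quosured symbol and returns its name
name_from_sym <- as_name(quote(col_ltr))
name_from_quo <- as_name(quo(col_ltr))

# as_label() accepts any expression and returns a readable label for it
label_from_call <- as_label(quote(col_nums * 2))

# as_name() refuses anything that is not a symbol or a string,
# so the call below raises an error, which is caught here
name_from_call <- tryCatch(
    as_name(quote(col_nums * 2)),
    error = function(e) NA_character_
)
```

For building a by argument, where the result must be an exact column name, as_name() looks like the better fit of the two.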

Example

library("tidyverse")

data_a <- tibble(col_ltr = letters, col_nums = seq_along(letters))
data_b <- tibble(col_ltr = letters, col_nums = seq_along(letters) * -1)


clean_and_join <-
    function(data_one,
             data_two,
             column_id_one,
             column_id_two,
             col_nums_one,
             col_nums_two) {

        clean_data_one <- filter(data_one, {{col_nums_one}} %% 2 == 0)
        clean_data_two <- filter(data_two, {{col_nums_two}} %% 2 != 0)

        by_cols <- set_names(as_label({{column_id_one}}), as_label({{column_id_two}}))

        left_join(
            x = clean_data_one,
            y = clean_data_two,
            by = by_cols
        )
    }

clean_and_join(data_one = data_a, data_two = data_b, column_id_one = col_ltr,
               column_id_two = col_ltr, col_nums_one = col_nums,
               col_nums_two = col_nums)

Error

Error in is_quosure(x) : object 'col_ltr' not found

Desired results

left_join(
    x = clean_data_one,
    y = clean_data_two,
    by = c("col_ltr" = "col_ltr") # Or by = "col_ltr" in case of identical name
)

Solution

  • One option is to convert each argument to a symbol with ensym and then to a string with as_string

    clean_and_join <-
        function(data_one,
                 data_two,
                 column_id_one,
                 column_id_two,
                 col_nums_one,
                 col_nums_two) {
    
            clean_data_one <- filter(data_one, {{col_nums_one}} %% 2 == 0)
            clean_data_two <- filter(data_two, {{col_nums_two}} %% 2 != 0)
    
            # For left_join(), names refer to columns of x (data_one)
            # and values to columns of y (data_two)
            by_cols <- set_names(rlang::as_string(rlang::ensym(column_id_two)),
                                 rlang::as_string(rlang::ensym(column_id_one)))
    
            left_join(
                x = clean_data_one,
                y = clean_data_two,
                by = by_cols
            )
        }
      
    

    Testing:

    clean_and_join(data_one = data_a, data_two = data_b, column_id_one = col_ltr,
                    column_id_two = col_ltr, col_nums_one = col_nums,
                    col_nums_two = col_nums)
    # A tibble: 13 x 3
    #   col_ltr col_nums.x col_nums.y
    #   <chr>        <int>      <dbl>
    # 1 b                2         NA
    # 2 d                4         NA
    # 3 f                6         NA
    # 4 h                8         NA
    # 5 j               10         NA
    # 6 l               12         NA
    # 7 n               14         NA
    # 8 p               16         NA
    # 9 r               18         NA
    #10 t               20         NA
    #11 v               22         NA
    #12 x               24         NA
    #13 z               26         NA
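Since the question asks specifically about as_name, the same by vector can probably also be built by capturing each argument with enquo() and passing the quosure to as_name(). The sketch below is a stripped-down illustration rather than a drop-in replacement: make_by is a hypothetical helper name, and the join is run on the unfiltered example data.

```r
library(dplyr)
library(rlang)

data_a <- tibble(col_ltr = letters, col_nums = seq_along(letters))
data_b <- tibble(col_ltr = letters, col_nums = seq_along(letters) * -1)

# Hypothetical helper: build a `by` vector from two unquoted column names
make_by <- function(column_id_one, column_id_two) {
    # enquo() captures each argument as a quosure;
    # as_name() turns a quosured symbol into a string.
    # Names refer to columns of x, values to columns of y.
    set_names(as_name(enquo(column_id_two)), as_name(enquo(column_id_one)))
}

by_cols <- make_by(col_ltr, col_ltr)
joined <- left_join(data_a, data_b, by = by_cols)
```

Unlike ensym(), which rejects non-symbol input at capture time, here it is as_name() that raises the error if a caller passes something other than a bare column name.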