Tags: r, tidyeval, nse, data-masking, quosure

How do I quote a newly created variable in a function to a helper function?


Question

Inside a function, what is the proper way to refer to a new variable whose name is built from a function parameter, so that the variable can be passed to another function?

Background

The ultimate goal is to create labels in a dataframe for a treemap with two levels of hierarchy, and I'm trying to write a reusable function for that. Here's a more basic example:

Example

library(scales)
library(tidyverse)

# Create dataframe
region <- rep(c("North", "South"), 3)
district <- sprintf("Dist-%d", 1:6)
sales <- seq(2000, 1500000000, length.out = 6)

df <- tibble(region, district, sales)
df
# A tibble: 6 × 3
  region district      sales
  <chr>  <chr>         <dbl>
1 North  Dist-1         2000
2 South  Dist-2    300001600
3 North  Dist-3    600001200
4 South  Dist-4    900000800
5 North  Dist-5   1200000400
6 South  Dist-6   1500000000

I created this helper function to format the currency. It will be used in the main function, and my issue is related to passing a new variable name from the main function to this helper:

# First function for formatting currency
mydollars <- scales::label_dollar(prefix = "$",
                          largest_with_cents = 5000,
                          scale_cut = c(0, " K" = 1e3, " M" = 1e6, " B" = 1e9, " T" = 1e12)
)
# Example function output
mydollars(df$sales)
[1] "$2 K"   "$300 M" "$600 M" "$900 M" "$1.2 B" "$1.5 B"

This is the main function that uses the helper above. I pass a dataframe to the function and create the second-level ".index" label. Then I group by the first category and aggregate the number column into a new column, to which I append a "2" suffix so I know it's the second number. My problem arises inside paste(), at mydollars("{{agg_number}}2"). If I replace that call with "Test String", the function works.

treemap_index1 <- function(df, category1, category2, agg_number){
  
  df_out <- df %>% 
    mutate("{{category2}}.index" := paste({{category2}}, mydollars({{agg_number}}), sep = "\n")) %>% 
    group_by({{category1}}) %>%
    mutate("{{agg_number}}2" := sum({{agg_number}}),
           "{{category1}}.index" := paste({{category1}}, 
                                          mydollars("{{agg_number}}2"), # Code breaks on this line
                                          sep = "\n")) %>%
    print()
  
  return(df_out)
  
}

treemap_index1(df, region, district, sales)

 rlang::last_error()
<error/dplyr:::mutate_error>
Error in `mutate()`:
! Problem while computing `region.index = paste(region, mydollars("{{agg_number}}2"), sep = "\n")`.
ℹ The error occurred in group 1: region = "North".
Caused by error in `x * scale`:
! non-numeric argument to binary operator
---
Backtrace:
  1. global treemap_index1(df, region, district, sales)
 10. scales (local) mydollars("{{agg_number}}2")
 11. scales::dollar(...)
 12. scales::number(...)
 13. scales:::scale_cut(...)
 14. base::cut(...)
Run `rlang::last_trace()` to see the full context.
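
As far as the backtrace shows, mydollars() receives the literal string "{{agg_number}}2" rather than the new column: the name interpolation is applied to the name on the left-hand side of `:=`, while a quoted string on the right-hand side stays an ordinary character value. A minimal sketch of that difference, using a hypothetical helper show_rhs() and toy data that are not part of the code above:

```r
library(dplyr)

# Hypothetical helper (illustration only): the name on the left of `:=`
# is interpolated, the string on the right is left untouched.
show_rhs <- function(df, var) {
  df %>% mutate("{{var}}2" := "{{var}}2")
}

out <- show_rhs(tibble(sales = c(1, 2)), sales)

names(out)   # the new column is named after the argument plus "2"
out$sales2   # its values are the unevaluated text, not numbers
```

Because the helper gets text instead of numbers, the multiplication inside scales fails with "non-numeric argument to binary operator".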

If I replace the offending code as shown below, the function works:

treemap_index2 <- function(df, category1, category2, agg_number){
  
  df_out <- df %>% 
    mutate("{{category2}}.index" := paste({{category2}}, mydollars({{agg_number}}), sep = "\n")) %>% 
    group_by({{category1}}) %>%
    mutate("{{agg_number}}2" := sum({{agg_number}}),
           "{{category1}}.index" := paste({{category1}}, 
                                          "Test String", # Temporarily replaced code
                                          sep = "\n")) %>%
    print()
  
  return(df_out)
  
}
treemap_index2(df, region, district, sales)

# A tibble: 6 × 6
# Groups:   region [2]
  region district      sales district.index       sales2 region.index        
  <chr>  <chr>         <dbl> <chr>                 <dbl> <chr>               
1 North  Dist-1         2000 "Dist-1\n$2 K"   1800003600 "North\nTest String"
2 South  Dist-2    300001600 "Dist-2\n$300 M" 2700002400 "South\nTest String"
3 North  Dist-3    600001200 "Dist-3\n$600 M" 1800003600 "North\nTest String"
4 South  Dist-4    900000800 "Dist-4\n$900 M" 2700002400 "South\nTest String"
5 North  Dist-5   1200000400 "Dist-5\n$1.2 B" 1800003600 "North\nTest String"
6 South  Dist-6   1500000000 "Dist-6\n$1.5 B" 2700002400 "South\nTest String"

Assistance appreciated...

I would appreciate guidance on how to properly pass the new variable name to the helper function. As I am new to data-masking, quosures, and non-standard evaluation, any other comments on how this could be done better are also welcome. Thank you.


Solution

  • Adapting the answer by Lionel Henry (@LionelHenry), one option would be to use rlang::englue to build the new column name as a string and the .data pronoun to look that column up, like so:

    library(scales)
    library(tidyverse)
    
    treemap_index1 <- function(df, category1, category2, agg_number) {
      df %>%
        mutate("{{category2}}.index" := paste({{ category2 }}, mydollars({{ agg_number }}), sep = "\n")) %>%
        group_by({{ category1 }}) %>%
        mutate(
          "{{agg_number}}2" := sum({{ agg_number }}),
          "{{category1}}.index" := paste(
            {{ category1 }},
            mydollars(.data[[rlang::englue("{{agg_number}}2")]]),
            sep = "\n"
          )
        )
    }
    
    treemap_index1(df, region, district, sales)
    #> # A tibble: 6 × 6
    #> # Groups:   region [2]
    #>   region district      sales district.index       sales2 region.index 
    #>   <chr>  <chr>         <dbl> <chr>                 <dbl> <chr>        
    #> 1 North  Dist-1         2000 "Dist-1\n$2 K"   1800003600 "North\n$2 B"
    #> 2 South  Dist-2    300001600 "Dist-2\n$300 M" 2700002400 "South\n$3 B"
    #> 3 North  Dist-3    600001200 "Dist-3\n$600 M" 1800003600 "North\n$2 B"
    #> 4 South  Dist-4    900000800 "Dist-4\n$900 M" 2700002400 "South\n$3 B"
    #> 5 North  Dist-5   1200000400 "Dist-5\n$1.2 B" 1800003600 "North\n$2 B"
    #> 6 South  Dist-6   1500000000 "Dist-6\n$1.5 B" 2700002400 "South\n$3 B"
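
A possible variation on the same idea, offered as a sketch rather than a definitive implementation: build the new column name once at the top of the function with rlang::englue, then reuse that string both for naming the column and for the .data lookup. The function name treemap_index3 and the variable total_name are made up for this illustration; mydollars and the example data are copied from the question.

```r
library(dplyr)
library(rlang)
library(scales)

# Currency formatter from the question
mydollars <- label_dollar(
  prefix = "$",
  largest_with_cents = 5000,
  scale_cut = c(0, " K" = 1e3, " M" = 1e6, " B" = 1e9, " T" = 1e12)
)

# Hypothetical variant of the function above
treemap_index3 <- function(df, category1, category2, agg_number) {
  # Build the new column name once, as a plain string
  total_name <- englue("{{ agg_number }}2")

  df %>%
    mutate("{{category2}}.index" := paste({{ category2 }}, mydollars({{ agg_number }}), sep = "\n")) %>%
    group_by({{ category1 }}) %>%
    mutate(
      # Single braces interpolate an ordinary string into the name
      "{total_name}" := sum({{ agg_number }}),
      # .data[[ ]] looks the column up by that same string
      "{{category1}}.index" := paste({{ category1 }}, mydollars(.data[[total_name]]), sep = "\n")
    )
}

# Example data from the question
df <- tibble(
  region = rep(c("North", "South"), 3),
  district = sprintf("Dist-%d", 1:6),
  sales = seq(2000, 1500000000, length.out = 6)
)

out <- treemap_index3(df, region, district, sales)
```

Keeping the name in one variable means the "2" suffix is written in a single place, so the column that is created and the column that is looked up cannot drift apart.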