r tidyeval

How do I create a function to mutate new columns with a variable name and "_pct"?


Using mtcars as an example, I would like to write a function that creates a count column and a percentage column, as below -

library(tidyverse)

mtcars %>% 
  group_by(cyl) %>% 
  summarise(count = n()) %>% 
  ungroup() %>% 
  mutate(cyl_pct = count/sum(count))

This produces the output -

# A tibble: 3 x 3
    cyl count cyl_pct
  <dbl> <int>   <dbl>
1     4    11   0.344
2     6     7   0.219
3     8    14   0.438

However, I would like to create a function where I can specify any column as the group_by column, and where the mutated column is named after that column with a _pct suffix. So if I pass disp, disp will be the group_by variable and the function will create a disp_pct column.


Solution

  • Assuming the column name is passed unquoted, capture it as a symbol with ensym. Unquote the symbol with !! inside group_by, and build the new column name by converting the symbol to a string (as_string) and pasting on the suffix '_pct'. In mutate, use := along with !! to assign the column name from the string created ('colnm').

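    library(stringr)
    library(dplyr)
    f1 <- function(dat, grp) {
      # capture the unquoted column name as a symbol
      grp <- ensym(grp)
      # build the new column name, e.g. "cyl_pct"
      colnm <- str_c(rlang::as_string(grp), '_pct')
      dat %>%
        group_by(!!grp) %>%
        summarise(count = n(), .groups = 'drop') %>%
        # := lets the left-hand side be an unquoted string
        mutate(!!colnm := count / sum(count))
    }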
    

    Testing:

    f1(mtcars, cyl)
    # A tibble: 3 x 3
    #    cyl count cyl_pct
    #  <dbl> <int>   <dbl>
    #1     4    11   0.344
    #2     6     7   0.219
    #3     8    14   0.438
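
As a possible alternative, the same idea can be sketched with the embrace operator `{{ }}` and glue-style names on the left-hand side of `:=`, which avoids the explicit ensym/as_string step. This is only a sketch: the function name f2 is hypothetical and not part of the original answer, and it assumes dplyr 1.0.0 or later with rlang 0.4.3 or later, where `"{{ grp }}_pct" :=` is supported.

```r
library(dplyr)

# f2 is a hypothetical name; it is intended to behave like f1 above.
# Assumes dplyr >= 1.0.0 and rlang >= 0.4.3 for the "{{ }}" name syntax.
f2 <- function(dat, grp) {
  dat %>%
    # {{ grp }} forwards the unquoted column to group_by
    group_by({{ grp }}) %>%
    summarise(count = n(), .groups = "drop") %>%
    # "{{ grp }}_pct" builds the new column name from the argument
    mutate("{{ grp }}_pct" := count / sum(count))
}

res <- f2(mtcars, cyl)
names(res)
```

Called as f2(mtcars, disp), this should group by disp and add a disp_pct column in the same way. The ensym version may still be preferable when the column name as a string is needed elsewhere in the function.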