Search code examples
rdplyrpurrrrlang

Use vector of columns in custom dplyr function


I'm trying to use the tidyverse to make a function that allows me to take a vector of column names (of class factor), count how many observations meet a certain condition (value == "yes"), mutate() and make new columns using this sum so I can summarise the data later.

I have written a function that can do this for a single column, but I want to be able to pass a vector of column names of any length using all_of() grammar to the function. When I try this, I get a new column with the same name as the vector (including my _count suffix) not the values inside the vector.

This is similar to TimTeaFan's Post but I would like to pass a vector where length > 1.

I think I might need to use the (. . .) option in dplyr and some more rlang, but I haven't quite figured out the right combination. The function doesn't work when I substitute (. . .) for objective thanks to NSE. I've also tried variations of rlang::as_name() and rlang::get_env(). There was an outdated workbook that used purrr::map() as well, but I haven't had any luck implementing it here.

I either got Error: Column x is not found. or Error: Promise has already been forced

Here is a reproducible example with data

dat <- tibble(category = rep(letters[1:10], 2),
              intake = factor(c(rep("no", 12), rep("yes", 8))), 
              outtake = factor(c(rep("yes", 11), rep("no", 9))),
              pretake = factor(c(rep(c("no", "yes"), 10))))

yessum <- function(.data, objective) {
  .data %>%
    dplyr::mutate("{{objective}}_count" := sum(
      ifelse(
        unlist(!!rlang::ensym(objective)) == "yes", 1, 0)))
}

dat %>%
  group_by(category) %>%
  yessum(intake)

I want to be able to pass a vector of certain column names to yessum and receive a set of new columns just like intake_new but named outtake_new and pretake_new.

Here's what currently happens when I try that:

vars <- c("intake", "outtake", "pretake")

dat %>%
  group_by(category) %>%
  yessum(vars)

Any and all help is welcome!


Solution

  • You don't necessarily need the function, as you can just mutate across the columns and get sums for each category.

    library(tidyverse)
    
    dat %>%
      group_by(category) %>%
      mutate(across(ends_with("take"), .fns = list(count = ~sum(. == "yes"))))
    

    Or if you have a long list, then you can use vars directly in the across statement:

    vars <- c("intake", "outtake", "pretake")
    
    dat %>%
      group_by(category) %>%
      mutate(across(vars, .fns = list(count = ~sum(. == "yes"))))
    

    Output

      category intake outtake pretake intake_count outtake_count pretake_count
       <chr>    <fct>  <fct>   <fct>          <int>         <int>         <int>
     1 a        no     yes     no                 0             2             0
     2 b        no     yes     yes                0             1             2
     3 c        no     yes     no                 1             1             0
     4 d        no     yes     yes                1             1             2
     5 e        no     yes     no                 1             1             0
     6 f        no     yes     yes                1             1             2
     7 g        no     yes     no                 1             1             0
     8 h        no     yes     yes                1             1             2
     9 i        no     yes     no                 1             1             0
    10 j        no     yes     yes                1             1             2
    11 a        no     yes     no                 0             2             0
    12 b        no     no      yes                0             1             2
    13 c        yes    no      no                 1             1             0
    14 d        yes    no      yes                1             1             2
    15 e        yes    no      no                 1             1             0
    16 f        yes    no      yes                1             1             2
    17 g        yes    no      no                 1             1             0
    18 h        yes    no      yes                1             1             2
    19 i        yes    no      no                 1             1             0
    20 j        yes    no      yes                1             1             2