
Programming with mutate to create new data column


There is a data.frame like so:

df <- data.frame("Config" = c("C1","C1","C2","C2"), "SN1" = 1:4, "SN2" = 5:8)

I'm trying to make df %<>% mutate more generic. Here is an example:

df %<>%
  mutate(
    Tag=paste(
      Config,
      as.character(SN1),
      as.character(SN2),
      sep="_"
    )
  )

I would like to pass a character vector c("Config", "SN1", "SN2") to the mutate call above, or to an alternative that does the same job: adding a new column Tag to the data.frame. Thank you for your help.


Solution

  • Like I mentioned in a comment, this isn't a question about the operator %<>% but about using non-standard evaluation (NSE) in a dplyr function. There's a pretty good vignette on this, but it's still pretty tricky to get the hang of NSE/tidy evaluation.

    Also as I mentioned, what you're doing in your example is essentially what tidyr::unite does (with remove = FALSE to keep the input columns), so if that's all you need, you don't actually need to write anything yourself. But it's a good simple example to use.
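
    For reference, a minimal sketch of the tidyr::unite call that corresponds to the mutate in the question. The data frame is rebuilt here so the snippet runs on its own, and the result name united is only illustrative. The new column may end up in a different position than with mutate.

    ```r
    library(tidyr)

    df <- data.frame(Config = c("C1", "C1", "C2", "C2"), SN1 = 1:4, SN2 = 5:8)

    # unite() pastes the listed columns into the new column Tag.
    # remove = FALSE keeps Config, SN1 and SN2 (the default drops them).
    united <- unite(df, Tag, Config, SN1, SN2, sep = "_", remove = FALSE)
    united$Tag
    ```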

    In this function custom_unite, the first argument is .data, the data frame being operated on (by convention, a pipe-friendly function takes the data frame as its first argument). Then ... captures any number of bare column names to be pasted together, new_col is the bare name of the column to create, and sep is passed along as-is to paste. (I inadvertently switched the order of arguments relative to tidyr::unite, which takes col, ... instead of ..., new_col.)

    You need to create quosures of your columns. For the single bare column new_col, you can use enquo, but for the flexible number of columns you use quos on ..., which you'll then splice with !!!.
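
    To see what the capturing step does on its own, here is a small sketch. The helper capture_cols is hypothetical and exists only for illustration. It returns the quosures without evaluating them, so the columns don't even need to exist yet.

    ```r
    library(rlang)

    # Hypothetical helper: capture the arguments as quosures, evaluate nothing
    capture_cols <- function(..., new_col) {
      list(cols = quos(...), new_col = enquo(new_col))
    }

    captured <- capture_cols(Config, SN1, SN2, new_col = Tag)

    length(captured$cols)       # one quosure per name passed through ...
    as_label(captured$new_col)  # the captured expression, as a string
    ```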

    To create the new column, you assign to the unquoted quosure with := instead of =, because R's syntax doesn't allow an expression like !!new_col_quo on the left-hand side of =.

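    library(tidyverse)
    
    custom_unite <- function(.data, ..., new_col, sep = "_") {
      # capture the bare column names passed through ... as a list of quosures
      cols <- quos(...)
      # capture the single bare name of the column to create
      new_col_quo <- enquo(new_col)
    
      .data %>%
        # !!! splices the columns into paste(); := allows unquoting on the left
        mutate(!!new_col_quo := paste(!!!cols, sep = sep))
    }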
    
    df %>%
      custom_unite(Config, SN1, SN2, new_col = Tag)
    #>   Config SN1 SN2    Tag
    #> 1     C1   1   5 C1_1_5
    #> 2     C1   2   6 C1_2_6
    #> 3     C2   3   7 C2_3_7
    #> 4     C2   4   8 C2_4_8
    

    Created on 2018-12-14 by the reprex package (v0.2.1)
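
    The question asks for a character vector rather than bare names. Below is a minimal sketch of one way to adapt the function, not a definitive implementation. The name unite_chr is hypothetical. It uses rlang::syms to turn the strings into symbols before splicing them with !!!. The last line shows a base R alternative that needs no tidy evaluation at all.

    ```r
    library(dplyr)
    library(rlang)

    df <- data.frame(Config = c("C1", "C1", "C2", "C2"), SN1 = 1:4, SN2 = 5:8)

    # Hypothetical variant of custom_unite that takes column names as strings
    unite_chr <- function(.data, cols, new_col = "Tag", sep = "_") {
      col_syms <- syms(cols)  # character vector -> list of symbols
      # a string can be unquoted on the left-hand side of :=
      mutate(.data, !!new_col := paste(!!!col_syms, sep = sep))
    }

    cols <- c("Config", "SN1", "SN2")
    out <- unite_chr(df, cols)

    # Base R alternative: call paste() on the selected columns directly
    tag_base <- do.call(paste, c(df[cols], sep = "_"))
    ```

    Both approaches should give the same Tag values as the bare-name version above.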