Tags: r, dplyr, tidyverse, tidyselect

How to use dynamic tidy-select expressions in dplyr::select()?


I need to select columns dynamically by combining several tidy-select conditions. Consider the following example:

library(tidyverse)
set.seed(42)

df <- tibble(
  grey_dog = runif(n = 69),
  white_bear = runif(n = 69),
  blue_oyster = runif(n = 69),
  white_lobster = runif(n = 69),
  green_dog = runif(n = 69)
)

df %>% 
  dplyr::select(
    (contains("dog") & contains("green")) | 
    (contains("white") & contains("bear"))
  )

Instead of writing the conditions out by hand, I have a named vector that holds the terms I want to base my selection on:

x <- c(green = "dog", white = "bear")

So I was hoping to paste together a string that can be used as a tidy-select expression:

s <- paste0("(", paste0("contains(", names(x),") & contains(", x, ")"), ")", collapse = " | ")
dplyr::select(df, s)

This fails with:

Error: Can't subset columns that don't exist.
x Column `(contains(green) & contains(dog)) | (contains(white) & contains(bear))` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.

Any ideas on how to accomplish this?


Solution

  • You can't just pass in a string: a string is not the same as an expression, and select() treats a character value as a column name. One way is to use purrr and rlang to build the expression and then inject it into select() with !!:

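    library(purrr)
    library(rlang)
    # Build one (contains(value) & contains(name)) call per element of x,
    # then join those calls with | into a single expression.
    query <- map2(
      map(x, ~expr(contains(!!.x))),
      map(names(x), ~expr(contains(!!.x))),
      ~expr((!!.x & !!.y))
    ) %>%
      reduce(~expr(!!.x | !!.y))
    # Inject the finished expression into select() with !!.
    dplyr::select(df, !!query)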
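
    The difference between a string and an expression can be checked directly. This is only an illustrative sketch; the names as_text, as_call, word, and with_value are made up for the example and are not part of the code above.

```r
library(rlang)

# A string that happens to contain code is still just a character vector,
# so tidyselect would look for a column with that literal name.
as_text <- "contains(\"dog\")"
is.character(as_text)

# expr() captures the code itself as a call object that can be evaluated later.
as_call <- expr(contains("dog"))
is_call(as_call)

# !! inserts the value of a variable into the captured call.
word <- "green"
with_value <- expr(contains(!!word))
identical(with_value, expr(contains("green")))
```

    All three checks should return TRUE, which is why the map()/reduce() pipeline produces something select() can evaluate while the pasted string does not.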
    

    If you really want to build the code as a string, you need to parse that string into an expression first with rlang::parse_expr(). You also have to add quotes inside the string so that it exactly matches the code you wrote by hand:

    s <- paste0("(", paste0("contains(\"", names(x),"\") & contains(\"", x, "\")"), ")", collapse = " | ")
    dplyr::select(df, !!rlang::parse_expr(s))
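
    If building expressions feels heavier than the task needs, a possible alternative is to work out the matching column names first and hand them to all_of(). This is a sketch of a different technique, not what the answer above does: it assumes literal, case-sensitive matching via grepl(fixed = TRUE), whereas contains() ignores case by default. The names hits and keep are made up for the example, and the toy df only mirrors the column names from the question.

```r
library(dplyr)

# Toy data with the same column names as in the question (values are irrelevant).
df <- tibble(
  grey_dog = 1, white_bear = 2, blue_oyster = 3,
  white_lobster = 4, green_dog = 5
)
x <- c(green = "dog", white = "bear")

# For each (name, value) pair, flag the columns whose name contains both words.
hits <- Map(
  function(a, b) {
    grepl(a, names(df), fixed = TRUE) & grepl(b, names(df), fixed = TRUE)
  },
  names(x), x
)

# Keep a column if any pair matched it.
keep <- names(df)[Reduce(`|`, hits)]

dplyr::select(df, all_of(keep))
```

    Here keep ends up holding white_bear and green_dog, in the order they appear in df, so no code has to be generated or parsed at all.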