
Group tibbles by variables given as string vectors in a nested tibble


In a nested tibble, I would like to group tibbles in a list-column (data in the example below) by variables given as string vectors (vars).

toydata <- tibble::tibble(
  vars = list(
    list("x"), 
    list(c("x", "y"))
    ),
  data = list(
    tibble::tibble(
      x = c(1,1,2,2),
      y = c(1,1,1,2)
    ),
    tibble::tibble(
      x = c(1,1,2,2),
      y = c(1,1,1,2)
    )
  )
)

This works:

purrr::map2(toydata$data, 
            toydata$vars, 
            ~ dplyr::group_by(.x, !!!rlang::syms(unlist(.y)))
)

But neither of these works:

toydata %>%
  dplyr::mutate(
    data = purrr::map2(toydata$data, 
                       toydata$vars, 
                       ~ dplyr::group_by(.x, !!!rlang::syms(unlist(.y)))
    )
  )

toydata %>%
  dplyr::mutate(
    data = purrr::map2(data, 
                       vars, 
                       ~ dplyr::group_by(.x, !!!rlang::syms(unlist(.y)))
                       )
  )

Where am I wrong?


Solution

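  • Both mutate() attempts fail because mutate() processes the !!! operator itself when it captures its arguments, before map2() ever calls the lambda, so .y does not exist yet at that point. Instead of splicing symbols, you can use tidy-select helpers in group_by() by wrapping them in across(). Here all_of() works (or any_of(), if some of the listed columns may be missing):

     library(magrittr)  # provides %>%

     toydata %>%
       dplyr::mutate(
         data = purrr::map2(
           data, vars,
           # .x is one nested tibble, .y the matching list of column names
           ~ dplyr::group_by(.x, dplyr::across(dplyr::all_of(unlist(.y))))
         )
       )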
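
To illustrate the difference between all_of() and any_of() mentioned above, here is a minimal sketch. The tibble df and the column name "z" are made up for this example; "z" is deliberately absent from df.

```r
# Hypothetical example data, not taken from the question
df <- tibble::tibble(x = c(1, 1, 2, 2), y = c(1, 1, 1, 2))
wanted <- c("x", "z")  # "z" does not exist in df

# any_of() skips names that are not columns, so only "x" is used for grouping
lenient <- dplyr::group_by(df, dplyr::across(dplyr::any_of(wanted)))
dplyr::group_vars(lenient)

# all_of() is strict: a missing column raises an error, caught here
strict <- tryCatch(
  dplyr::group_by(df, dplyr::across(dplyr::all_of(wanted))),
  error = function(e) "error"
)
strict
```

all_of() is probably the safer default when every name in vars is expected to exist, since a typo in a column name then fails loudly instead of silently changing the grouping.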