In dplyr, `group_vars()` does not return the variables passed to `group_by()`


In dplyr, the output of group_vars() does not match the variables I passed to group_by(). I expected it to return the same variables; see the code below. How should I understand this? Thanks!

library(tidyverse)

Here group_vars() returns cyl; I think it should return cyl and vs:

mtcars %>%
  group_by(cyl, vs) %>%
  summarise(cyl_n = n()) %>%
  group_vars()

Here group_vars() returns character(0); I think it should return cyl:

mtcars %>%
  group_by(cyl) %>%
  summarise(cyl_n = n()) %>%
  group_vars()

Solution

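  • group_vars() works as intended. The "issue" is that, by default, summarise() drops the last level of grouping when each group is reduced to a single row, as it is here (see ?summarise). Each call to summarise() therefore peels one variable off the grouping: grouping by cyl and vs becomes grouping by cyl, and grouping by cyl alone becomes no grouping at all, which is why you get character(0).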

    You can see that group_vars works fine by dropping the summarise from your code:

    library(dplyr, warn.conflicts = FALSE)
    
    mtcars %>%
      group_by(cyl, vs) %>%
      group_vars()
    #> [1] "cyl" "vs"
    
    mtcars %>%
      group_by(cyl) %>%
      group_vars()
    #> [1] "cyl"
    

    If you want to keep all levels of grouping after summarise(), set .groups = "keep":

    mtcars %>%
      group_by(cyl, vs) %>%
      summarise(cyl_n = n(), .groups = "keep") %>%
      group_vars()
    #> [1] "cyl" "vs"
    
    mtcars %>%
      group_by(cyl) %>%
      summarise(cyl_n = n(), .groups = "keep") %>%
      group_vars()
    #> [1] "cyl"
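
For comparison, here is a minimal sketch of the other two common values of the .groups argument, assuming a recent dplyr (1.0.0 or later, where .groups was introduced). The names by_cyl_vs, drop_last_vars and drop_vars are only illustrative. Setting "drop_last" explicitly reproduces the behaviour seen in the question, and "drop" removes the grouping entirely:

```r
library(dplyr, warn.conflicts = FALSE)

# Illustrative helper object: mtcars grouped by two variables.
by_cyl_vs <- mtcars %>%
  group_by(cyl, vs)

# "drop_last" is what summarise() does by default in this case:
# the last grouping variable (vs) is removed, cyl remains.
drop_last_vars <- by_cyl_vs %>%
  summarise(cyl_n = n(), .groups = "drop_last") %>%
  group_vars()
drop_last_vars
#> [1] "cyl"

# "drop" removes all grouping and returns an ungrouped tibble.
drop_vars <- by_cyl_vs %>%
  summarise(cyl_n = n(), .groups = "drop") %>%
  group_vars()
drop_vars
#> character(0)
```

Stating .groups explicitly also silences the message summarise() prints about the grouping of its output, so it can be worth setting even when the default is what you want.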