dplyr::group_by drops custom class


I am trying to implement a subclass of a tibble that comes with a custom printing method. I noticed that dplyr::group_by silently drops my custom class, so my S3 printing methods no longer dispatch.

I assume this behaviour is a feature and not a bug, so what is the canonical way of dealing with it? Overloading dplyr::group_by? Or am I overlooking something very fundamental here?

My expectation would be that the grouped my_tbl also displays my custom header:

library(tibble)

## Define a custom subclass of tbl
my_tbl <- function(x) {
  new_tibble(x, class = "my_tbl")
}

## Define a custom tbl_sum method
tbl_sum.my_tbl <- function(x) {
  c("My Fancy Header" = "Whooaaaa!")
}

## Header is printed as it should
(mt <- my_tbl(mtcars %>% dplyr::slice(1L)))
# # My Fancy Header: Whooaaaa!
#     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1    21     6   160   110   3.9  2.62  16.5     0     1     4     4


## However, not when we add a grouping structure
mt %>% dplyr::group_by(am)
# # A tibble: 1 × 11
# # Groups:   am [1]
#     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1    21     6   160   110   3.9  2.62  16.5     0     1     4     4


## Reason: group_by silently drops my custom class
class(mt) ##...vs...
# [1] "my_tbl"     "tbl_df"     "tbl"        "data.frame"

class(mt %>% dplyr::group_by(am)) 
# [1] "grouped_df" "tbl_df"     "tbl"        "data.frame"

Solution

  • You'll need to implement a group_by() method for your class. From the dplyr extension vignette:

    Note that group_by() and ungroup() don't use any of these generics and you'll need to provide methods for them directly, or rely on .by for per-operation grouping.
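
As a rough illustration of what such methods might look like, here is a minimal sketch that reuses the class from the question. It assumes that calling NextMethod() and then re-attaching the class is sufficient for this use case; restore_my_tbl() is a hypothetical helper introduced only for this example, not part of dplyr or tibble. Other verbs may still drop the class unless the generics described in the vignette are also implemented.

```r
library(tibble)

## Class and header method as in the question
my_tbl <- function(x) {
  new_tibble(x, class = "my_tbl")
}

tbl_sum.my_tbl <- function(x) {
  c("My Fancy Header" = "Whooaaaa!")
}

## Hypothetical helper: put "my_tbl" back at the front of the class vector
restore_my_tbl <- function(x) {
  class(x) <- c("my_tbl", setdiff(class(x), "my_tbl"))
  x
}

## Let dplyr do the grouping, then re-attach the class it dropped
group_by.my_tbl <- function(.data, ...) {
  restore_my_tbl(NextMethod())
}

## Same idea for ungroup(), which would otherwise return a plain tibble
ungroup.my_tbl <- function(x, ...) {
  restore_my_tbl(NextMethod())
}

mt <- my_tbl(dplyr::slice(mtcars, 1L))

grouped <- dplyr::group_by(mt, am)
ungrouped <- dplyr::ungroup(grouped)

inherits(grouped, "my_tbl")     # class survives grouping
inherits(grouped, "grouped_df") # grouping structure is still there
inherits(ungrouped, "my_tbl")   # class survives ungrouping
```

Defining the methods at the top level like this is enough for interactive use; inside a package they would need to be registered as S3 methods for the dplyr generics. Because tbl_sum.my_tbl returns only the custom header, the grouped object no longer prints the "Groups" line; combining the header with NextMethod() inside tbl_sum.my_tbl is one way to keep the default summary lines as well.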