
group_by and group_by_ in pipes


I am writing a function that can group and concatenate variables using the dplyr package:

basket<-function(dataframe, group, target)
{
  dataframe %>% 
    group_by_(group) %>% 
    summarise(new_target=paste(as.character(target), collapse="_"))

}

I am using the mtcars dataset for testing:

basket(mtcars, mtcars$am, mtcars$wt)

The desired output should be something like this:

am     wt
0      2.62_2.875_2.32...
1      3.215_3.19_3.44...

However, group_by_ fails to create groups based on "am". The result I get is a single concatenated string of all values of "wt":

[1] "2.62_2.875_2.32_3.215_3.44_3.46_3.57_3.19_3.15_3.44_3.44_4.07_3.73_3.78...

If I use group_by instead, I get this error:

stop(structure(list(message = "unknown variable to group by : group", 
call = resolve_vars(new_groups, tbl_vars(.data)), cppstack = structure(list(
    file = "", line = -1L, stack = "C++ stack not available on this system"), .Names = c("file", 
"line", "stack"), class = "Rcpp_stack_trace")), .Names = c("message",  ... 

Has anybody seen this problem before?


Solution

  • You'll need the SE (standard evaluation) versions of both group_by and summarise, and you need to pass the column names as quoted strings ("am", "wt"). Don't use dollar notation (mtcars$am) with dplyr when referring to variables in the data.frame at hand.

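    library(dplyr)

    # group and target are column names passed as strings, e.g. "am" and "wt"
    basket <- function(dataframe, group, target) {
      dataframe %>% 
        group_by_(group) %>% 
        # interp() substitutes the column named by `target` for x in the formula
        summarise_(new_target = lazyeval::interp(~paste(as.character(x), collapse="_"), 
                                                 x = as.name(target)))
    }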
    
    basket(mtcars, "am", "wt")
    
    # A tibble: 2 × 2
         am                                                                                           new_target
      <dbl>                                                                                                <chr>
    1     0 3.215_3.44_3.46_3.57_3.19_3.15_3.44_3.44_4.07_3.73_3.78_5.25_5.424_5.345_2.465_3.52_3.435_3.84_3.845
    2     1                                 2.62_2.875_2.32_2.2_1.615_1.835_1.935_2.14_1.513_3.17_2.77_3.57_2.78
    

    Also see vignette('nse').
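
The underscored verbs (group_by_, summarise_) were deprecated in dplyr 0.7, so on a current installation the same idea can be written without them. The following is a minimal sketch, not the original answer's method: it assumes dplyr >= 1.0.0 (for across() and all_of()), and the name basket_tidy is hypothetical, chosen only to keep it apart from basket above. Column names are still passed as strings.

```r
library(dplyr)

# basket_tidy is a hypothetical name used for illustration.
# Assumes dplyr >= 1.0.0; `group` and `target` are column names given as strings.
basket_tidy <- function(dataframe, group, target) {
  dataframe %>%
    # all_of() selects the grouping column by its string name
    group_by(across(all_of(group))) %>%
    # .data[[target]] looks up the target column by name within each group
    summarise(
      new_target = paste(as.character(.data[[target]]), collapse = "_"),
      .groups = "drop"
    )
}

result <- basket_tidy(mtcars, "am", "wt")
print(result)
```

This should return one row per value of am, with the wt values of that group joined by underscores, as in the output above. For background on this style, vignette("programming", package = "dplyr") may be a more current starting point than the nse vignette.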