dplyr group_by and summarise, but keep non-numeric variables


I have a dataset in long format, where I add up values for different groups. Some variables are factor variables and should be kept in the result.

library(dplyr)

mtcars$model <- as.factor(rownames(mtcars))
longmtcars <- rbind(mtcars, mtcars, mtcars)

longmtcars$vs <- ifelse(longmtcars$vs == 1, "Yes", "No")

result <- longmtcars %>%
    group_by(factor(model)) %>%
    summarise_if(is.numeric, sum)
result

# A tibble: 32 x 11
   `factor(model)`      mpg   cyl  disp    hp  drat    wt  qsec    am  gear  carb
   <fct>              <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 AMC Javelin         45.6    24  912    450  9.45 10.3   51.9     0     9     6
 2 Cadillac Fleetwood  31.2    24 1416    615  8.79 15.8   53.9     0     9    12
 3 Camaro Z28          39.9    24 1050    735 11.2  11.5   46.2     0     9    12
 4 Chrysler Imperial   44.1    24 1320    690  9.69 16.0   52.3     0     9    12
 5 Datsun 710          68.4    12  324    279 11.6   6.96  55.8     3    12     3

My current, non-scalable solution:

#ugly solution

vsvar <- longmtcars[1:32, "vs"]
result <- cbind(result, vsvar)
result

         factor(model)   mpg cyl   disp   hp  drat     wt  qsec am gear carb vsvar
1          AMC Javelin  45.6  24  912.0  450  9.45 10.305 51.90  0    9    6    No
2   Cadillac Fleetwood  31.2  24 1416.0  615  8.79 15.750 53.94  0    9   12    No
3           Camaro Z28  39.9  24 1050.0  735 11.19 11.520 46.23  0    9   12   Yes

This gives me the extra column, but it is really ugly, and I will use it in a Shiny app, where it will cause trouble, so the current approach is not an option. Is there an all-in-one solution? It could also be done with data.table, but I am not too familiar with it.


Solution

  • You could add that (those) variable(s) to the group_by clause:

    result <- longmtcars %>%
      mutate_if(is.character, factor) %>%
      group_by(model, vs) %>%
      summarise_if(is.numeric, sum)
    
    result
    #> # A tibble: 32 x 12
    #> # Groups:   model [32]
    #>    model              vs      mpg   cyl  disp    hp  drat    wt  qsec    am  gear  carb
    #>    <fct>              <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    #>  1 AMC Javelin        No     45.6    24  912    450  9.45 10.3   51.9     0     9     6
    #>  2 Cadillac Fleetwood No     31.2    24 1416    615  8.79 15.8   53.9     0     9    12
    #>  3 Camaro Z28         No     39.9    24 1050    735 11.2  11.5   46.2     0     9    12
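
The same idea can be sketched with the `across()` syntax, assuming dplyr 1.0.0 or later (the `_if` verbs used above still work but are superseded there). The names `cars`, `result_grouped` and `result_first` are illustrative and not part of the original answer. The second variant is an alternative for the case where the extra variable is constant within each group: it groups by `model` only and keeps the first value of every non-numeric column.

```r
library(dplyr)

# Rebuild the example data from the question (on a copy of mtcars).
cars <- mtcars
cars$model <- as.factor(rownames(cars))
longmtcars <- rbind(cars, cars, cars)
longmtcars$vs <- ifelse(longmtcars$vs == 1, "Yes", "No")

# Variant 1: put the variables to keep into group_by(), then sum
# every remaining numeric column.
result_grouped <- longmtcars %>%
  group_by(model, vs) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")

# Variant 2: group by model only; keep the first value of each
# non-numeric column and sum the numeric ones. This assumes the
# non-numeric columns do not vary within a model.
result_first <- longmtcars %>%
  group_by(model) %>%
  summarise(
    across(where(~ !is.numeric(.x)), first),
    across(where(is.numeric), sum),
    .groups = "drop"
  )

head(result_grouped, 3)
```

Both variants attach `vs` by group rather than by row position. The summarised result is sorted by `model`, while `longmtcars[1:32, "vs"]` follows the original row order, so the `cbind()` approach in the question pairs values positionally (compare the Camaro Z28 row in the two outputs above).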