Group by 2 data elements and then provide a count of values within an additional column for each group

Hi I have the below dataset and code.

 df <- 
  read.table(textConnection("Area   CustomerAge CustomersInGroup
    A 28 1
    A 40 3
    A 70 2
    A 19 4
    B 13 3
    B 12 2
    B 72 1
    B 90 7"), header=TRUE)


df2 <- df %>% 
  mutate(
    # Create categories
    Customer_Age_Group = dplyr::case_when(
      CustomerAge <= 18            ~ "0-18",
      CustomerAge > 18 & CustomerAge <= 60 ~ "19-60",
      CustomerAge > 60             ~ ">60"
    ))

df2 %>% group_by(Area, Customer_Age_Group,.drop = FALSE) %>% 
  count() %>% 
  ungroup() %>% 
  complete(Area, Customer_Age_Group, fill=list(n=0)) %>% 
  arrange(Area, parse_number(Customer_Age_Group))

Problem is that it provides a final count of applicable rows per group - I would like to amend this so instead the final count is of the values of 'CustomersInGroup' - e.g. for group A 19-60 this would sum to 8 (the values of the 3 rows) rather than the number of 3 representing the number of rows. Any help very much appreciated.

The final output should look like the below - totals of CustomersInGroup grouped by Area and then CustomerAge

Area	CustomerAge	Totals
A	0-18	0
A	19-60	8
A	>60	2
B	0-18	5
B	19-60	0
B	>60	8

Solution

You could use summarise with sum to get the totals of CustomersInGroup. You can use the following code:

library(dplyr)
library(tidyr)
library(readr)

df2 <- df %>% 
  mutate(
    # Create categories
    Customer_Age_Group = dplyr::case_when(
      CustomerAge <= 18            ~ "0-18",
      CustomerAge > 18 & CustomerAge <= 60 ~ "19-60",
      CustomerAge > 60             ~ ">60"
    ))

df2 %>% 
  group_by(Area, Customer_Age_Group, .drop = FALSE) %>% 
  summarise(Totals = sum(CustomersInGroup)) %>%
  ungroup() %>% 
  complete(Area, Customer_Age_Group, fill=list(Totals=0)) %>% 
  arrange(Area, parse_number(Customer_Age_Group))
#> `summarise()` has grouped output by 'Area'. You can override using the
#> `.groups` argument.
#> # A tibble: 6 × 3
#>   Area  Customer_Age_Group Totals
#>   <chr> <chr>               <int>
#> 1 A     0-18                    0
#> 2 A     19-60                   8
#> 3 A     >60                     2
#> 4 B     0-18                    5
#> 5 B     19-60                   0
#> 6 B     >60                     8

^{Created on 2022-10-09 with reprex v2.0.2}