I want to sum rows based on two different categorical columns. my data is like this :
df <- data.frame(country = c('US', 'US', 'US', 'UK', 'UK', 'UK', 'UK'),
team = c('A1', 'A2', 'B1', 'B1', 'B2', 'C1', 'C2'),
x1990 = c(4, 7, 8, 8, 8, 9, 12),
x2005 = c(3, 3, 4, 4, 6, 7, 7))
and I want to prepare it like this:
df_sum <- data.frame(country = c('US', 'US', 'UK','UK'),
team = c('A', 'B','B', 'C'),
x1990 = c(11, 8,16,21),
x2005 = c(6, 4,10, 14))
You may try using dplyr
like
library(dplyr)
df %>%
mutate(team = sub("^([[:alpha:]]*).*", "\\1", team)) %>%
group_by(country, team) %>%
summarize(across(where(is.numeric), sum)) %>%
arrange(desc(country))
country team x1990 x2005
<chr> <chr> <dbl> <dbl>
1 US A 11 6
2 US B 8 4
3 UK B 16 10
4 UK C 21 14