I have the following table (a tibble
)
Year | Name | n |
---|---|---|
1969 | Alice | 24 |
1969 | Bob | 25 |
1969 | Charlie | 33 |
1970 | Alice | 13 |
1970 | Bob | 20 |
1970 | Charlie | 10 |
1971 | Alice | 23 |
1971 | Bob | 10 |
1971 | Charlie | 16 |
etc.
I want to combine "Bob" and "Charlie" into a single row for each year by summing their value for n
and give them a new name "Derek" (it makes more sense in my actual data). So I want
Year | Name | n |
---|---|---|
1969 | Alice | 24 |
1969 | Derek | 58 |
1970 | Alice | 13 |
1970 | Derek | 30 |
1971 | Alice | 23 |
1971 | Derek | 26 |
I have tried figuring out how to do it with aggregate()
and case_when()
, but I can't seem to get it to work.
Any help welcome
We can first replace the names with Derek. Then summarise
to sum
the ns from the doubles, grouped .by
Name and Year.
library(dplyr)
df |>
mutate(Name = case_match(Name, c("Bob", "Charlie") ~ "Derek",
.default = Name)) |>
summarise(n = sum(n), .by = c(Year, Name))
Year Name n
1 1969 Alice 24
2 1969 Derek 58
3 1970 Alice 13
4 1970 Derek 30
5 1971 Alice 23
6 1971 Derek 26