I'm working with a population dataframe and I have information for different years and age groups divided by bins of five years. Once I filtered the information for the location I'm interested in I have this:
Location age group total90 total95 total00 total05 total10
A 0 to 4 10428 118902 76758 967938 205472
A 5 to 9 18530 238928 260331 277635 303180
A 10 to 14 180428 208902 226758 267938 305472
A 15 to 19 185003 332089 242267 261793 135472
Now what I want is to create new age groups to have something like this:
Location age group total90 total95 total00 total05 total10
A 5 to 14 198958 447830 487089 545573 608652
A other 195431 450991 319025 1229731 340944
where
age group "5 to 14" is the total of "5 to 9" + "10 to 14" for each year &
"other" is the total of "0 to 4" + "15 to 19" for each year
I tried selecting the columns with numbers so I could add the totals for each age group and create a row with the new age group but I can't add rows in an easy way and I'm complicating things more. I'm sure there is an easy way to solve this but I'm stuck.
I had to alter your dummy data slightly (juste removed some space to make reading from plain text easy)to make it work without further manipulation
df <- data.table::fread("Location age_group total90 total95 total00 total05 total10
A 0_to_4 10428 118902 76758 967938 205472
A 5_to_9 18530 238928 260331 277635 303180
A 10_to_14 180428 208902 226758 267938 305472
A 15_to_19 185003 332089 242267 261793 135472")
library(tidyverse)
df %>%
# alter the character variable age_group reducing problem to one ifelse clause
dplyr::mutate(age_group = ifelse(age_group == "5_to_9" | age_group == "10_to_14", "5_to_14", "other")) %>%
# build grouping (I included Location but possibly your need is diferent)
dplyr::group_by(Location, age_group) %>%
# sum in one call all not grouped columns (therefore you have to remove Location in case you do not want it in the grouping
dplyr::summarize(across(everything(), ~sum(.x))) %>%
# ungrouping prevents unwanted behaviour down stream
dplyr::ungroup()
# A tibble: 2 x 7
Location age_group total90 total95 total00 total05 total10
<chr> <chr> <int> <int> <int> <int> <int>
1 A 5_to_14 198958 447830 487089 545573 608652
2 A other 195431 450991 319025 1229731 340944