I have a dataframe that contains (among other things) three columns that have missing values every 5 rows. These missing values need to be replaced with the sum of the previous 4 values in their respective column.
For example, let's say my dataframe looked like this:
id category1 category2 category3
123 5 10 10
123 6 11 15
123 6 12 23
123 4 10 6
123 NA NA NA
567 24 17 15
Those NAs need to represent a "total" based on the sum of the previous 4 values in their column, and this needs to repeat throughout the entire dataframe because the NAs occur every 5 rows. For instance, the three NAs in the mock example above should be replaced with 21, 43, and 54. 5 rows later, the same process will need to be repeated. How can I achieve this?
Another possible solution:
library(dplyr)
df %>%
group_by(id) %>%
mutate(across(everything(), ~ if_else(is.na(.x), sum(.x, na.rm = T), .x))) %>%
ungroup
#> # A tibble: 6 × 4
#> id category1 category2 category3
#> <int> <int> <int> <int>
#> 1 123 5 10 10
#> 2 123 6 11 15
#> 3 123 6 12 23
#> 4 123 4 10 6
#> 5 123 21 43 54
#> 6 567 24 17 15