Search code examples
rdplyrmutate

How to find the sum of two consecutive rows of a column from a dataframe based on a condition of another columns(using,maybe, dplyr's mutate function)


I have the following data frame (say df1) in R:

enter image description here

and would like to find the sum of the sizes for each of the "continent", "country", and "sex" combinations in cases where the category column values are A and B. I would then like to change the category value to D, and the desired new data frame (say df2) is given below:

enter image description here

As I am new to R programming, I would really appreciate it if anyone can help me in converting the df1 to df2 and many thanks in advance.

PS. The code for my original data frame with category A, B, and C is given below:

    df1 <- data.frame("continent" = c('Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Asia', 'Asia', 'Asia', 'Asia', 'Asia', 'Asia'), 
"country" = c('Germany', 'Germany', 'Germany', 'Germany', 'Germany','Germany','Japan', 'Japan', 'Japan', 'Japan', 'Japan','Japan'), 
"sex" = c('male', 'male', 'male', 'female', 'female', 'female', 'male', 'male', 'male', 'female', 'female', 'female'), 
"category" = c('A', 'B', 'C','A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'), "size" = c(20, 35, 40, 32, 14, 53,18,31, 16,65,13,22))

Solution

  • It seems like the consecutiveness doesn't matter. You want to combine categories A and B into a new category D and then do a grouped sum:

    df1 %>%
      mutate(category = ifelse(category %in% c("A", "B"), "D", category)) %>%
      group_by(continent, country, sex, category) %>%
      summarize(size = sum(size), .groups = "drop")
    # # A tibble: 8 × 5
    #   continent country sex    category  size
    #   <chr>     <chr>   <chr>  <chr>    <dbl>
    # 1 Asia      Japan   female C           22
    # 2 Asia      Japan   female D           78
    # 3 Asia      Japan   male   C           16
    # 4 Asia      Japan   male   D           49
    # 5 Europe    Germany female C           53
    # 6 Europe    Germany female D           46
    # 7 Europe    Germany male   C           40
    # 8 Europe    Germany male   D           55