r aggregate cumsum

Condensing multiple rows per group into a single row, keeping the values from the row with the highest x, in R?


I have a data frame with multiple rows for each time and would like to group by time to create a condensed data frame with one row per time. Columns cumsum_A and cumsum_B are cumulative sums derived from other columns, so rather than being summed or averaged they should keep the values from the row with the highest x in each time group.

x   time  group  value  cumsum_A cumsum_B
1     0     A     0        0        0
2     0     B     0        0        0
3     0     A     0        0        0
4     1     A     0        0        0
5     1     B     1        0        1
6     1     B     0        0        1
7     2     B     1        0        2
8     2     A     1        1        2
9     2     A     1        2        2
10    2     A    -1        1        2
11    3     A     0        1        2
12    3     B     1        1        3

The ideal result would be the following:

x   time  group  value  cumsum_A cumsum_B
3     0     A     0        0        0
6     1     B     0        0        1
10    2     A    -1        1        2
12    3     B     1        1        3

Solution

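  • One option is to group by 'time' only and slice the row where 'x' is largest (which.max). Grouping by both 'time' and 'group' would return one row per time/group pair, which is more rows than the expected output above.

    library(dplyr)
    df1 %>%
       group_by(time) %>%          # one group per time value
       slice(which.max(x)) %>%     # keep the row with the highest x in each group
       ungroup()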
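
For anyone who wants to try this out, here is a minimal reproducible sketch. It rebuilds the question's table as `df1` (the name used in the answer) and, as an alternative to `slice(which.max(x))`, uses `slice_max()`, which assumes dplyr 1.0.0 or later. The base R line at the end is only a cross-check and is not part of the original answer.

```r
library(dplyr)

# Rebuild the example data from the question
df1 <- data.frame(
  x        = 1:12,
  time     = c(0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3),
  group    = c("A", "B", "A", "A", "B", "B", "B", "A", "A", "A", "A", "B"),
  value    = c(0, 0, 0, 0, 1, 0, 1, 1, 1, -1, 0, 1),
  cumsum_A = c(0, 0, 0, 0, 0, 0, 0, 1, 2, 1, 1, 1),
  cumsum_B = c(0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 3)
)

# Keep the row with the largest x within each time value.
# with_ties = FALSE guarantees exactly one row per group.
result <- df1 %>%
  group_by(time) %>%
  slice_max(x, n = 1, with_ties = FALSE) %>%
  ungroup()

# Base R cross-check: keep rows whose x equals the per-time maximum of x
base_result <- df1[df1$x == ave(df1$x, df1$time, FUN = max), ]

print(result)
```

Both `result` and `base_result` should contain the rows with x equal to 3, 6, 10 and 12, matching the desired output in the question. Unlike the dplyr version, the base R version would keep more than one row for a time value if the maximum x were tied.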