Tags: r, dataframe, dplyr, summarize

Sum rows with value larger than n into one in R


I have a data frame:

df <- data.frame(count=c(0,1,2,3,4,5,6), value=c(100,50,60,70,2,6,8))

  count value
1     0   100
2     1    50
3     2    60
4     3    70
5     4     2
6     5     6
7     6     8

How do I sum the values of all rows whose count is larger than "n" into one row? For example, if I choose n = 3, I want:

  count value
1     0   100
2     1    50
3     2    60
4     3    70
5    >3    16

Solution

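  • Here is a dplyr solution using replace(). Because replace() coerces count to character, the ">3" group is not guaranteed to sort last after summarise(), so an extra arrange() step is needed to move it to the bottom; without that requirement the pipeline would be quite concise.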

    library(dplyr)
    
    df %>% 
      group_by(count = replace(count, count > 3, ">3")) %>% 
      summarise(value = sum(value)) %>% 
      arrange(count == ">3")
    #> # A tibble: 5 x 2
    #>   count value
    #>   <chr> <dbl>
    #> 1 0       100
    #> 2 1        50
    #> 3 2        60
    #> 4 3        70
    #> 5 >3       16
    

    Created on 2021-08-26 by the reprex package (v0.3.0)
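
The threshold is hard-coded as 3 in the answer above. As a rough sketch of how it could be parameterised, here is a base R variant that takes n as an argument. The helper name collapse_above is hypothetical (it is not part of dplyr or the original answer), and the sketch assumes that counts at or below n are already unique, as in the example data, so only the rows above n are summed.

```r
# Hypothetical helper (not from the original answer): collapse every row
# with count > n into a single row labelled ">n" that sums their values.
# Assumes counts at or below n are already unique, as in the example data.
collapse_above <- function(df, n) {
  keep <- df[df$count <= n, , drop = FALSE]
  rest <- df[df$count > n, , drop = FALSE]

  # The ">n" label is text, so the whole count column becomes character.
  keep$count <- as.character(keep$count)
  if (nrow(rest) == 0) {
    return(keep)
  }

  collapsed <- data.frame(
    count = paste0(">", n),
    value = sum(rest$value),
    stringsAsFactors = FALSE
  )

  # Appending the collapsed row last avoids a separate sorting step.
  out <- rbind(keep, collapsed)
  rownames(out) <- NULL
  out
}

df <- data.frame(count = c(0, 1, 2, 3, 4, 5, 6),
                 value = c(100, 50, 60, 70, 2, 6, 8))
res <- collapse_above(df, 3)
print(res)  # last row has count ">3" and value 16
```

Because the collapsed row is bound on at the end, it stays last without any sorting. If counts at or below n can repeat, the dplyr group_by() approach above is the better fit, since it sums every group.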