How to sum rows based on multiple conditions and replace it in the dataframe?


R beginner here in need of some help. I have this dataframe:

dat <- data.frame(Name = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C"),
                  Score = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
                  Frequency = c(9, 11, 10, 5, 5, 3, 7, 10, 5, 5, 20, 3, 3, 2, 2))

I want to sum the frequencies of the rows with scores 2-3 and with scores 4-5, separately for each name, and relabel the scores as High (score 1), Medium (scores 2-3) or Low (scores 4-5). The result should have one row per name and category, holding the summed frequency.

Is there a more straightforward way to do this? Thanks a lot!


Solution

  • Here is a base R approach.

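    First, create a Category column from Score using cut. With right = FALSE the intervals are closed on the left, and include.lowest = TRUE also closes the last interval on the right, so the bins are [1, 2), [2, 4) and [4, 5]:

    dat$Category <- cut(dat$Score,
                        breaks = c(1, 2, 4, 5),               # bin edges: [1,2), [2,4), [4,5]
                        labels = c("High", "Medium", "Low"),  # one label per bin, in order
                        include.lowest = TRUE,                # keeps Score 5 in the last bin
                        right = FALSE)                        # left-closed intervals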
    

    Then you can aggregate based on both Name and Category to get the final result:

    aggregate(Frequency ~ Name + Category, data = dat, sum)
    

    Output

      Name Category Frequency
    1    A     High         9
    2    B     High         3
    3    C     High        20
    4    A   Medium        21
    5    B   Medium        17
    6    C   Medium         6
    7    A      Low        10
    8    B      Low        10
    9    C      Low         4
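
If you want to double-check the binning, or would rather have each name's rows sit together, here is a minimal sketch built on the same two steps. The names score_check and res are just illustrative, and sorting by Name is an assumption about the layout you want, since the target table is not shown here.

```r
# Rebuild the example data from the question.
dat <- data.frame(
  Name = rep(c("A", "B", "C"), each = 5),
  Score = rep(c(1, 2, 3, 4, 5), times = 3),
  Frequency = c(9, 11, 10, 5, 5, 3, 7, 10, 5, 5, 20, 3, 3, 2, 2)
)

# Same binning as in the answer: [1,2) -> High, [2,4) -> Medium, [4,5] -> Low.
dat$Category <- cut(dat$Score,
                    breaks = c(1, 2, 4, 5),
                    labels = c("High", "Medium", "Low"),
                    include.lowest = TRUE,
                    right = FALSE)

# Cross-tabulate Score against Category to confirm which bin each score fell into.
score_check <- table(dat$Score, dat$Category)
print(score_check)

# Aggregate as before, then sort by Name and Category (hypothetical layout choice).
# Category is a factor, so it sorts in level order: High, Medium, Low.
res <- aggregate(Frequency ~ Name + Category, data = dat, FUN = sum)
res <- res[order(res$Name, res$Category), ]
rownames(res) <- NULL
print(res)
```

The cross-tabulation is a cheap way to catch an off-by-one in breaks before aggregating: each score should have all of its rows under exactly one category.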