Search code examples
rdataframerows

Searching for duplicate row entries (with different values!) and adding them up


I have a Dataframe with 3 column entries in R. Number, Color, Quantity. I want to check my dataframe for rows that match their number and color, (Quantities may differ!). These rows need to be combined to one row with summed up quantities. An example table is:

Number Color Quantity
3475b Black 2
4349 White 4
970c00 Yellow 1
3475b Black 7
3641 Red 8
4349 Red 3

To sum it up: row 1 and 4 should be added up to a total quantity of 9. Row 2 and 6 have the same number but different color, so they stay apart. In this case the resulting table should have only 5 rows like:

Number Color Quantity
3475b Black 9
4349 White 4
970c00 Yellow 1
3641 Red 8
4349 Red 3

Solution

  • In base R, we can use aggregate

    aggregate(Quantity ~ ., df1, sum)
    

    -output

    #   Number  Color Quantity
    #1  3475b  Black        9
    #2   3641    Red        8
    #3   4349    Red        3
    #4   4349  White        4
    #5 970c00 Yellow        1
    

    data

    df1 <- structure(list(Number = c("3475b", "4349", "970c00", "3475b", 
    "3641", "4349"), Color = c("Black", "White", "Yellow", "Black", 
    "Red", "Red"), Quantity = c(2L, 4L, 1L, 7L, 8L, 3L)), class = "data.frame",
    row.names = c(NA, 
    -6L))