Search code examples
raggregateunique

Aggregate multiple rows based on common values


I have a dataset like this below

  W X Y Z
  A 2 3 4
  A 2 3 6
  B 1 2 3
  C 3 2 1
  B 1 3 4
  B 1 2 2

I am want to combine/collapse the values in column Z only if the values in column W, X, Y are similar.

The final dataset will be like this.

  W X Y Z
  A 2 3 4,6
  B 1 2 3,2
  C 3 2 1
  B 1 3 4

Not sure how to do this, any suggestions is much appreciated.


Solution

  • We can group by 'W', 'X', 'Y' and paste the values of 'Z' (toString is paste(..., collapse=", "))

    library(dplyr)
    df1 %>%
       group_by(W, X, Y) %>%
       summarise(Z = toString(unique(Z)))
    # A tibble: 4 x 4
    # Groups:   W, X [3]
    #  W         X     Y Z    
    #  <chr> <int> <int> <chr>
    #1 A         2     3 4, 6 
    #2 B         1     2 3, 2 
    #3 B         1     3 4    
    #4 C         3     2 1    
    

    Or with aggregate from base R

    aggregate(Z ~ ., unique(df1), toString)
    #  W X Y    Z
    #1 B 1 2 3, 2
    #2 C 3 2    1
    #3 B 1 3    4
    #4 A 2 3 4, 6
    

    data

    df1 <- structure(list(W = c("A", "A", "B", "C", "B", "B"), X = c(2L, 
    2L, 1L, 3L, 1L, 1L), Y = c(3L, 3L, 2L, 2L, 3L, 2L), Z = c(4L, 
    6L, 3L, 1L, 4L, 2L)), class = "data.frame", row.names = c(NA, 
    -6L))