
Unify multiple rows into one row in R


I have a dataset like this:

a1 <- c("a","a","a", "b", "c", "c", "d", "d", "d", "d")
b1 <- c(7, 7, 7,5, 4, 4, 3, 3, 3, 3)
c1 <- c("A","B", "C", "D", "E", "F", "B", "C", "EE", "F")
m1 <- data.frame(a1, b1, c1)

My expected result is a dataset like this:

a <- c("a","b", "c", "d")
b <- c(7, 5, 4, 3)
c <- c("ABC","D", "EF", "BCEEF")
m <- data.frame(a, b, c)

I tried this code, but it doesn't work:

m1 <- m1 %>%
 group_by(a1)

How can I fix it?


Solution

  • I personally like @Maël's answer best, but I want to share this solution using toString(). It is more or less the same, except that toString() separates the elements with ", " by default, so we strip that separator afterwards with gsub():

    library(dplyr) # >= 1.1.0 (needed for .by)
    
    m1 |> 
      summarise(c = gsub(", ", "", toString(c1)), .by=c(a1,b1)) 
    
      a1 b1     c
    1  a  7   ABC
    2  b  5     D
    3  c  4    EF
    4  d  3 BCEEF
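
For comparison, here is a minimal sketch that avoids the gsub() step by collapsing with paste(..., collapse = "") directly. It also shows why the code in the question appears to do nothing: group_by() only marks the groups, and it is the summarise() call that reduces each group to one row. The name res is only illustrative, and this is not necessarily what the other answer does.

```r
library(dplyr)

# Data from the question
m1 <- data.frame(
  a1 = c("a", "a", "a", "b", "c", "c", "d", "d", "d", "d"),
  b1 = c(7, 7, 7, 5, 4, 4, 3, 3, 3, 3),
  c1 = c("A", "B", "C", "D", "E", "F", "B", "C", "EE", "F")
)

# group_by() alone leaves all ten rows in place;
# summarise() collapses each (a1, b1) group to a single row.
res <- m1 |>
  group_by(a1, b1) |>
  summarise(c = paste(c1, collapse = ""), .groups = "drop")

res
```

With the data above, res has one row per (a1, b1) pair and the c column holds "ABC", "D", "EF" and "BCEEF". Grouping by both a1 and b1 keeps b1 in the result; grouping by a1 alone would drop it.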