
Data manipulation: reduce the number of rows (with R)


My dataset looks like this:

a1 <- c("a","a","a", "b", "c", "c", "d", "d", "d", "d")
b1 <- c(7, 7, 7,5, 4, 4, 3, 3, 3, 3)
c1 <- c("A","B", "C", "D", "E", "F", "B", "C", "EE", "F")
m1 <- data.frame(a1, b1, c1)

My expected result looks like this:

a <- c("a","b", "c", "d")
b <- c(7, 5, 4, 3)
c <- c("A; B; C","D", "E; F", "B; C; EE; F")
m <- data.frame(a, b, c)

My current code is below, but it doesn't work:

library(dplyr)
m2 <-m |> 
  summarise(c = gsub(", ", "", toString(c1)), .by=c(a1,b1)) 

How can I fix it? Thanks.


Solution

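  • You can try `aggregate()` from base R. The formula `c1 ~ .` groups by every other column, and `toString` joins the `c1` values in each group with ", ":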

    > aggregate(c1 ~ ., m1, toString)
      a1 b1          c1
    1  d  3 B, C, EE, F
    2  c  4        E, F
    3  b  5           D
    4  a  7     A, B, C
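
    As for the code in the question, two things look off: it pipes `m` (the expected result) rather than `m1`, and `gsub(", ", "", ...)` removes the separators instead of replacing them with "; ". Below is a minimal sketch of both a dplyr and a base R version that use "; " as the separator. It assumes dplyr 1.1.0 or later (for `.by`) and R 4.1 or later (for `|>`); the names `m2` and `m3` are just illustrative.

```r
library(dplyr)

# Input data from the question
a1 <- c("a", "a", "a", "b", "c", "c", "d", "d", "d", "d")
b1 <- c(7, 7, 7, 5, 4, 4, 3, 3, 3, 3)
c1 <- c("A", "B", "C", "D", "E", "F", "B", "C", "EE", "F")
m1 <- data.frame(a1, b1, c1)

# dplyr: start from m1 (not m) and join c1 within each (a1, b1) group
m2 <- m1 |>
  summarise(c1 = paste(c1, collapse = "; "), .by = c(a1, b1))

# base R: same idea; extra arguments to aggregate() are passed on to paste()
m3 <- aggregate(c1 ~ ., m1, paste, collapse = "; ")
```

    Both should give one row per `a1`/`b1` pair. The row order may differ: `summarise()` with `.by` keeps groups in order of first appearance, while `aggregate()` sorts by the grouping columns.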