
How to sum values of columns if they have some row values in common in R?


I have a very large data frame. Let df below represent it:

df <- as.data.frame(rbind(c("a",1,1,1), c("a",1,1,1), c("a",1,1,1), c("b",2,2,2), c("b",2,2,2), c("b",2,2,2)))
     [,1] [,2] [,3] [,4]
[1,] "a"  "1"  "1"  "1" 
[2,] "a"  "1"  "1"  "1" 
[3,] "a"  "1"  "1"  "1" 
[4,] "b"  "2"  "2"  "2" 
[5,] "b"  "2"  "2"  "2" 
[6,] "b"  "2"  "2"  "2"

I want to create a data frame like the one below from it, with the rows that share a value in the first column summed together:

     [,1] [,2] [,3] [,4]
 [1,] "a"  "3"  "3"  "3" 
 [2,] "b"  "6"  "6"  "6"

I have seen several similar posts here, but the answers, although very useful, need a vector of all possible values in the first column, and so on. My problem is that my dataset has about 3000 rows.

How can I get this result in R?


Solution

  • We could use group_by and summarise after using type.convert(as.is=TRUE), which turns the character columns V2 to V4 into integers so that they can be summed:

    library(dplyr)
    df %>% 
        type.convert(as.is=TRUE) %>% 
        group_by(V1) %>% 
        summarise(across(V2:V4, sum))
    
    
      V1       V2    V3    V4
      <chr> <int> <int> <int>
    1 a         3     3     3
    2 b         6     6     6
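
  • As a possible alternative without dplyr, the same grouping and summing can be sketched in base R with aggregate. This is only a minimal sketch, not part of the original answer: it assumes R 4.0 or later (so the columns of df start out as character, not factor) and the default column names V1 to V4 that as.data.frame assigns. The name result is just an illustrative variable.

```r
# Rebuild the example data from the question; every column starts as character.
df <- as.data.frame(rbind(c("a", 1, 1, 1), c("a", 1, 1, 1), c("a", 1, 1, 1),
                          c("b", 2, 2, 2), c("b", 2, 2, 2), c("b", 2, 2, 2)))

# Convert each column to its natural type: V1 stays character,
# V2, V3 and V4 become integer.
df <- type.convert(df, as.is = TRUE)

# Sum all remaining columns (the "." on the left) within each value of V1.
# No vector of possible group values is needed; the groups are found from the data.
result <- aggregate(. ~ V1, data = df, FUN = sum)
print(result)
#   V1 V2 V3 V4
# 1  a  3  3  3
# 2  b  6  6  6
```

    Because the formula uses "." for the summed columns, the same call should work unchanged when the data frame has more numeric columns or more groups.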