r, dataframe, dplyr, aggregate

How to aggregate different columns with different functions


I have a dataset that looks like this:

df

ID size product   x    y 
A   1     abc    0.3   5
B   1     abc    0.8   7
C   1     abc    0.5   2
D   3     def    0.6   1

I want to aggregate it by size, taking the mean of x and the sum of y.

The code to aggregate both columns by sum looks like this:

df1<-aggregate(list(x=df$x, y=df$y), by=list(df$size), FUN="sum")

How can I change that code to get a dataset like this one?

df2

size     x     y 
 1      0.53   14
 3      0.6    1

Thanks in advance


Solution

  • I would use the summarise() function from dplyr (part of the tidyverse) for data frames like this. It lets you apply a different summary function to each column within each group. This is what I'd expect the solution to look like:

    
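    library(dplyr)

    # Group by size, then summarise each column with its own function
    df %>%
      group_by(size) %>%
      summarise(
        x = mean(x),  # mean of x within each size
        y = sum(y)    # sum of y within each size
      )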
    
    

    A base R solution would require a different approach.
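
For reference, here is a minimal sketch of one possible base R approach, not the only way to do it. It assumes the data frame from the question (rebuilt here by hand so the snippet runs on its own), aggregates each column separately with the formula interface of aggregate(), and then joins the two results with merge(). The intermediate names x_mean and y_sum are only illustrative.

```r
# Rebuild the example data from the question
df <- data.frame(
  ID      = c("A", "B", "C", "D"),
  size    = c(1, 1, 1, 3),
  product = c("abc", "abc", "abc", "def"),
  x       = c(0.3, 0.8, 0.5, 0.6),
  y       = c(5, 7, 2, 1)
)

# Aggregate each column with its own function, grouped by size
x_mean <- aggregate(x ~ size, data = df, FUN = mean)
y_sum  <- aggregate(y ~ size, data = df, FUN = sum)

# Join the two summaries on the grouping column
df2 <- merge(x_mean, y_sum, by = "size")
df2
```

With the example data, df2 has one row per size: x is about 0.53 and y is 14 for size 1, and x is 0.6 and y is 1 for size 3. With several grouping columns, both the formulas and the by argument of merge() would need to list all of them.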