Search code examples
sqlranalytic-functions

Analog of SQL analytic functions for R


is there an analog to SQL analytic so that one could do aggregation without collapsing rows? For example, I want to do a sum for each group without GROUP BY, in SQL I can do this:

select group, x, sum(x) over(partition by group) group_sum
from mytable

I would like to be able to do something similar in R:

df <- data.frame(group=c('a', 'a', 'b', 'b'), x=c(1, 3, 10, 30))
df %>% mutate(group_sum = window_aggr(group_by=group, func=sum))

group x group_sum
a     1    4
a     3    4
b    10   10
b    30   40

where window_aggr is just a made-up function.

So is there a way to implement this in a single pipeline, without doing an actual aggregation and a join?

Thanks!

Best regards, Nikolai


Solution

  • You can use ave which will in this case calculate the sum of df$x for the groups df$group.

    df$group_sum <- ave(df$x, df$group, FUN=sum)
    
    df
    #  group  x group_sum
    #1     a  1         4
    #2     a  3         4
    #3     b 10        40
    #4     b 30        40
    

    Or using base pipes:

    df |> transform(group_sum = ave(x, group, FUN=sum))
    #  group  x group_sum
    #1     a  1         4
    #2     a  3         4
    #3     b 10        40
    #4     b 30        40
    

    Or using dplyr

    library(dplyr)
    df %>% mutate(group_sum = ave(x, group, FUN=sum))
    #  group  x group_sum
    #1     a  1         4
    #2     a  3         4
    #3     b 10        40
    #4     b 30        40