Tags: r, aggregate, grouping, mean, summary

Divide column of data by mean of the group


If I have a data frame, such as:

group=rep(1:4,each=10)
data=c(seq(1,10,1),seq(5,50,5),seq(20,11,-1),seq(0.3,3,0.3))
DF=data.frame(group,data)

Now I would like to divide each data element by the mean of its group. For example, I can compute the group means and then divide by them by hand:

group=rep(1:4,each=10)
data=c(seq(1,10,1),seq(5,50,5),seq(20,11,-1),seq(0.3,3,0.3))
DF=data.frame(group,data)
aggregate(DF,by=list(DF$group),FUN=mean)

#  Group.1 group  data
#1       1     1  5.50
#2       2     2 27.50
#3       3     3 15.50
#4       4     4  1.65

data1=c(seq(1,10,1)/5.5,seq(5,50,5)/27.5,seq(20,11,-1)/15.5,seq(0.3,3,0.3)/1.65)
DF1=data.frame(group, data1)

However, this is a bit convoluted and would not work easily on a large dataset. I feel like there is a function from the apply family that could be used here, but I cannot find a nice way to do it.


Solution

  • Here's the usual set of options (thanks to @G.Grothendieck for simplification of ave):

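    # base R: ave() applies FUN within each group and returns a vector
    # the same length as the input
    DF$newdata = ave(DF$data, DF$group, FUN = function(x) x/mean(x))
    # or, since FUN defaults to mean, divide by the per-row group mean
    DF$newdata = DF$data / ave(DF$data, DF$group)

    # dplyr: returns a new grouped tibble; assign the result to keep it
    library(dplyr)
    DF %>% group_by(group) %>% mutate(newdata = data/mean(data))

    # data.table: setDT() converts DF in place and := adds the column by reference
    library(data.table)
    setDT(DF)[, newdata := data/mean(data), by=group]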
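
As a quick sanity check, the base R option can be compared against the hand-computed `data1` from the question. This is only a minimal sketch: it reuses the question's `group`, `data`, `DF` and `data1`, while `group_means` is a name introduced here for illustration.

```r
# Rebuild the example data from the question
group <- rep(1:4, each = 10)
data <- c(seq(1, 10, 1), seq(5, 50, 5), seq(20, 11, -1), seq(0.3, 3, 0.3))
DF <- data.frame(group, data)

# One group mean per row; ave() uses FUN = mean when FUN is not given
group_means <- ave(DF$data, DF$group)
DF$newdata <- DF$data / group_means

# Hand-computed version from the question, using the aggregate() output
data1 <- c(seq(1, 10, 1) / 5.5, seq(5, 50, 5) / 27.5,
           seq(20, 11, -1) / 15.5, seq(0.3, 3, 0.3) / 1.65)

# Both approaches should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(DF$newdata, data1)))

# After scaling, every group should have a mean of 1
scaled_means <- as.vector(tapply(DF$newdata, DF$group, mean))
stopifnot(isTRUE(all.equal(scaled_means, rep(1, 4))))
```

The comparison uses `all.equal()` rather than `==` because the divisions produce floating-point values that may differ in the last few bits.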