Median by group in R


I have the following data frame and would like to add a dummy variable that indicates whether a value is above its group's median.

df <- data.frame(group = rep(c("A", "B", "c"), 3), value1 = 1:9)
m <- aggregate(. ~ group, data = df, FUN = median)
names(m)[2] <- "median"
df <- merge(df, m, by = "group", all.x = TRUE)
df$median_0_1 <- ifelse(df$median < df$value1, 1, 0)

Is there a more elegant way to do this?

Also, can I adjust this so the dummy marks values above or below the group's third quartile?

And is this approach robust, i.e. will it work reliably?

Thanks a lot.


Solution

  • Elegance lies in the eye of the beholder, but how do you like this?

    df <- within(df, {
      median <- ave(value1, group, FUN=median)
      median_0_1 <- ifelse(median < value1, 1, 0)
      # third quartile = 75th percentile, hence probs=.75
      quantile3 <- ave(value1, group, FUN=function(x) quantile(x, probs=.75))
      quantile_0_1 <- ifelse(quantile3 < value1, 1, 0)
    })
    df
    #   group value1 quantile_0_1 quantile3 median_0_1 median
    # 1     A      1            0       5.5          0      4
    # 2     B      2            0       6.5          0      5
    # 3     c      3            0       7.5          0      6
    # 4     A      4            0       5.5          0      4
    # 5     B      5            0       6.5          0      5
    # 6     c      6            0       7.5          0      6
    # 7     A      7            1       5.5          1      4
    # 8     B      8            1       6.5          1      5
    # 9     c      9            1       7.5          1      6
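
On the robustness question, one thing to watch for is missing values: base `quantile()` stops with an error on `NA` input unless `na.rm=TRUE`, while `median()` silently returns `NA` for that group. A minimal sketch of how the same `ave()` idea could be wrapped to handle both the median and the third quartile follows. The helper name `above_group_quantile` and its arguments are made up for illustration and are not part of base R or of the code above.

```r
# Hypothetical helper (illustrative only): flags values that lie strictly
# above the p-th quantile of their own group.
above_group_quantile <- function(x, g, p = 0.5, na.rm = TRUE) {
  # ave() returns the per-group cutoff aligned with the original row order
  cutoff <- ave(x, g, FUN = function(v) quantile(v, probs = p, na.rm = na.rm))
  as.integer(x > cutoff)  # an NA in x stays NA in the result
}

df <- data.frame(group = rep(c("A", "B", "c"), 3), value1 = 1:9)
df$median_0_1 <- above_group_quantile(df$value1, df$group, p = 0.5)   # above median
df$q3_0_1     <- above_group_quantile(df$value1, df$group, p = 0.75)  # above third quartile
df
```

Because nothing is merged, the rows keep their original order, and the comparison is strict, so a value exactly equal to the cutoff gets 0. Use `x < cutoff` instead if you want to flag values below the quartile.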