
Display weighted mean by group in the data.frame


Questions about by and weighted.mean already exist, but none of them helped me solve my problem. I am new to R and more used to data mining languages than to programming.

I have a data frame that holds, for each individual (one observation per row), the income, education level and sample weight. I want to calculate the weighted mean of income by education level, and I want the result attached to each individual in a new column of my original data frame, like this:

obs income education weight incomegroup
1.   1000      A       10    --> display weighted mean of income for education level A
2.   2000      B        1    --> display weighted mean of income for education level B
3.   1500      B        5    --> display weighted mean of income for education level B
4.   2000      A        2    --> display weighted mean of income for education level A

I tried:

data$incomegroup=by(data$education, function(x) weighted.mean(data$income, data$weight))    

It does not work. A weighted mean is calculated and appears in the column "incomegroup", but I cannot tell whether it is for the whole data set or for a single group; either way, it is not computed by group. I also read about the plyr package and the aggregate function, but they do not seem to do what I want.

The ave function (stats package) gives exactly what I am looking for, but only for a simple mean:

data$incomegroup=ave(data$income,data$education,FUN = mean)

It cannot be used with weights.

Thanking you in advance for your help!


Solution

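  • With dplyr, group_by followed by mutate adds the group-level weighted mean to every row directly, so there is no need to summarise by group first and then left_join the result back onto the original data frame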

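    library(dplyr)
    # Sample data from the question (called `data` there, `df` here)
    df <- data.frame(
       obs = 1:4,
       income = c(1000L, 2000L, 1500L, 2000L),
       education = factor(c("A", "B", "B", "A")),
       weight = c(10L, 1L, 5L, 2L)
    )
    # Weighted mean of income within each education level, repeated on every row
    df %>%
       group_by(education) %>% 
       mutate(weighted_income = weighted.mean(income, weight))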
    #    obs income education weight weighted_income
    #  <int>  <int>    <fctr>  <int>           <dbl>
    #1     1   1000         A     10        1166.667
    #2     2   2000         B      1        1583.333
    #3     3   1500         B      5        1583.333
    #4     4   2000         A      2        1166.667
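
If you would rather stay in base R, the ave approach from the question can probably be adapted. ave only passes a single vector to FUN, so one workaround is to pass the row indices and use them inside FUN to look up both income and weight. This is a minimal sketch, not the answer's method; it assumes the data frame is called data with the column names from the question, and the sample values are copied from the question's table.

```r
# Sample data copied from the question.
data <- data.frame(
  obs = 1:4,
  income = c(1000, 2000, 1500, 2000),
  education = c("A", "B", "B", "A"),
  weight = c(10, 1, 5, 2)
)

# ave() hands FUN one vector per group, so give it the row indices
# and use them to pick the matching incomes and weights.
data$incomegroup <- ave(
  seq_len(nrow(data)),
  data$education,
  FUN = function(i) weighted.mean(data$income[i], data$weight[i])
)

print(data)
```

On the sample data this should give about 1166.667 for the rows with education A, that is (1000*10 + 2000*2) / 12, and about 1583.333 for the rows with education B, that is (2000*1 + 1500*5) / 6, which matches the dplyr output above.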