I want to take the weighted mean of a group using the aggregate function in r.
Here is what my data looks like:
set.seed(1980)
Group_1 <- sample(letters[1:4], 50, TRUE)
Group_2 <- sample(letters[8:13], 50, TRUE)
Weight <- sample(seq(1,50), 50, TRUE)
Value <- sample(seq(1,50), 50, TRUE)
DF <- data.frame(Group_1, Group_2, Weight, Value)
head(DF)
I want to take the weighted mean of the Value
column, using the Weight
column, for each pairwise group.
Such that, the aggregate function would look like:
aggregate(Value ~ Group_1 + Group_2, data = df, mean)
How would I take the weighted mean using the aggregate function?
Instead of mean
, use weighted.mean
. However, aggregate
, may not be an option here because aggregate
loop over only the 'Value' column and it doesn't have access to the 'Weight' for each group
library(dplyr)
DF %>%
group_by(Group_1, Group_2) %>%
summarise(wt_mean = weighted.mean(Value, Weight), .groups = 'drop')
-output
# A tibble: 21 x 3
# Groups: Group_1 [4]
# Group_1 Group_2 wt_mean
# <chr> <chr> <dbl>
# 1 a h 24.7
# 2 a i 15
# 3 a j 21.1
# 4 a k 23.6
# 5 a m 14.1
# 6 b i 40
# 7 b j 12.7
# 8 b k 6.88
# 9 b l 30.6
10 b m 5
# … with 11 more rows
If we want to use base R
, then by
should work
by(DF, DF[c('Group_1', 'Group_2')], function(x) weighted.mean(x$Value, x$Weight))