Sorry for asking what might be a very basic question, but I am stuck in a conundrum and cannot seem to get out of it.
I have a data set that looks like this:
Medicine Biology Business sex weights
0 1 0 1 0.5
0 0 1 0 1
1 0 0 1 05
0 1 0 0 0.33
0 0 1 0 0.33
1 0 0 1 1
0 1 0 0 0.33
0 0 1 1 1
1 0 0 1 1
The first three columns are fields of study, and the fourth variable is sex. The real data obviously has many more observations. What I want is the mean of each field of study (medicine, biology, business) by the variable sex (so one mean for men and one for women). To do so, I have used the following code:
barplot_sex <- aggregate(x = df_dummies[, 1:19], by = list(df$sex),
FUN = function(x) mean(x))
This works perfectly and gives me what I need. My problem is that I now need a weighted mean, but I cannot use
FUN= function(x) weighted.mean(x, weights)
as there are many more observations than fields of study.
The only alternative I managed to do was to edit(boxplot) and change the values manually, but then R doesn't save the changes. Plus, I am sure there must be a trivial way to do exactly what I need.
Any help would be greatly appreciated.
Bests, Gabriele
Using by:
by(dat, dat$sex, function(x) sapply(x[, 1:3], weighted.mean, x[, "weights"]))
# dat$sex: 0
# Medicine Biology Business
# 0.0000000 0.3316583 0.6683417
# ---------------------------------------------------------------------------------------
# dat$sex: 1
# Medicine Biology Business
# 0.82352941 0.05882353 0.11764706
Data:
dat <- structure(list(Medicine = c(0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L
), Biology = c(1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L), Business = c(0L,
1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L), sex = c(1L, 0L, 1L, 0L, 0L,
1L, 0L, 1L, 1L), weights = c(0.5, 1, 5, 0.33, 0.33, 1, 0.33,
1, 1)), class = "data.frame", row.names = c(NA, -9L))
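If the result is meant to feed a bar plot, a matrix may be more convenient than the list-like object that by returns. The following is only a sketch of one alternative, using split and sapply from base R on the same toy data; the names fields, wm, g1 and manual are made up for illustration and are not part of the question.

```r
# Same toy data as above, rebuilt with data.frame() so this block runs on its own
dat <- data.frame(
  Medicine = c(0, 0, 1, 0, 0, 1, 0, 0, 1),
  Biology  = c(1, 0, 0, 1, 0, 0, 1, 0, 0),
  Business = c(0, 1, 0, 0, 1, 0, 0, 1, 0),
  sex      = c(1, 0, 1, 0, 0, 1, 0, 1, 1),
  weights  = c(0.5, 1, 5, 0.33, 0.33, 1, 0.33, 1, 1)
)

# Columns to average (hypothetical helper name)
fields <- c("Medicine", "Biology", "Business")

# Split the rows by sex, then compute the weighted mean of every field
# using only the weights that belong to that group.
# The result is a matrix: one row per field, one column per level of sex.
wm <- sapply(split(dat, dat$sex), function(d) {
  sapply(d[fields], weighted.mean, w = d$weights)
})
print(wm)

# Sanity check by hand for Medicine in the group sex == 1:
# sum(weight * value) / sum(weight)
g1 <- dat[dat$sex == 1, ]
manual <- sum(g1$Medicine * g1$weights) / sum(g1$weights)
stopifnot(isTRUE(all.equal(manual, unname(wm["Medicine", "1"]))))
```

Because the weights are taken from the same subset d as the values, their lengths always match, which is what the plain weighted.mean(x, weights) call inside aggregate could not guarantee. The matrix can then be passed to barplot(wm, beside = TRUE, legend.text = TRUE) to get grouped bars per sex.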