I have a dataframe that looks like:
Group1 | Group2 | Score1 | Score2
-------|--------|--------|--------
A | 1 | 10 | 11
A | 2 | 13 | 14
B | 1 | 4 | 7
B | 2 | 9 | 12
A | 1 | 5 | 11
B | 2 | 9 | 13
A | 2 | 7 | 9
B | 1 | 7 | 10
What I need is to create a new variable, which gives me the median value for each subgroup combination, something like
Group1 | Group2 | Score1 | Score2 | Median
-------|--------|--------|--------|-------
A | 1 | 10 | 11 | 10.5
A | 2 | 13 | 14 | 11.5
B | 1 | 4 | 7 | 7.0
B | 2 | 9 | 12 | 10.5
A | 1 | 5 | 11 | 10.5
B | 2 | 9 | 13 | 10.5
A | 2 | 7 | 9 | 11.5
B | 1 | 7 | 10 | 7.0
(My median computation by hand could be wrong, but I think the point has been made).
I tried using the aggregate function like:
MedianAggregate <- aggregate(df[,45:47],
by = list(df$provider, df$field),
FUN = median)
What I get, instead of what I want, is something that looks like:
Group1 | Group2 | Score1 | Score2
-------|--------|--------|--------
A | 1 | 7.5 | 11
A | 2 | 10 | 11.5
B | 1 | 5.5 | 8.5
B | 2 | 9 | 12.5
basically what I need is to get the same output I would get if I would apply the median function to three rows in a matrix, rather than a dataframe, for each combination of subgroup.
Any help would be greatly appreciated.
Here's a possible solution, if I understood correctly:
library(reshape2)
library(dplyr)
df2 <- melt(df, id.vars = c("Group1","Group2"))
df2 %>% group_by(Group1, Group2) %>% summarise(median = median(value)) %>% merge(df, by = c("Group1","Group2"))