So this is my data frame. Country1 represent the people that live in Germany and Country 2 represent the country that they used to live 5 years before moving to Country1 .
Country1 | Country2 | Weight | obs |
---|---|---|---|
Germany | Germany | 4 | 1 |
Germany | Germany | 119 | 2 |
France | Germany | 3 | 3 |
France | Germany | 2 | 4 |
Italy | France | 1 | 5 |
Basically what I want is to make a summary of the columns weights for each combination and the multiply by the observation (represented by the column obs. For example, in the first row I have the combination Germany to Germany so what I want is to sum the weights of the column Weight (119+4=123) and then multiply the result of this sum (123* 1=123) to the respective observation of the column Obs (1) (in the first row). For the second row would be the same the summary of the weight for Germany would be (119+4=123)and this result have to be multiplied by the observation of this row in this case (123* 2=246). In the third row the sum of weights would be (3+2=5) and then multiply this result by the observations for this row (5* 3=15) and so on.
The output that I want is represented by the column x and it would be something like this.
Country1 | Country2 | Weight | obs | x |
---|---|---|---|---|
Germany | Germany | 4 | 1 | 123 |
Germany | Germany | 119 | 2 | 246 |
France | Germany | 3 | 3 | 15 |
France | Germany | 2 | 4 | 20 |
Italy | France | 1 | 5 | 5 |
Also the formula that im trying to apply is this one.
Try this:
library(dplyr)
#Code
new <- df %>% group_by(Country1) %>%
mutate(x=sum(Weight)*obs)
Output:
# A tibble: 5 x 5
# Groups: Country1 [3]
Country1 Country2 Weight obs x
<chr> <chr> <int> <int> <int>
1 Germany Germany 4 1 123
2 Germany Germany 119 2 246
3 France Germany 3 3 15
4 France Germany 2 4 20
5 Italy France 1 5 5
Some data used:
#Data
df <- structure(list(Country1 = c("Germany", "Germany", "France", "France",
"Italy"), Country2 = c("Germany", "Germany", "Germany", "Germany",
"France"), Weight = c(4L, 119L, 3L, 2L, 1L), obs = 1:5), class = "data.frame", row.names = c(NA,
-5L))