I have two datasets with different numbers of rows but the same variables. I want to sum the "value" variable across the two datasets by "Date".
Dataset 1:
Date | value |
---|---|
1/1/2000 | 1 |
2/1/2000 | 1 |
3/1/2000 | 2 |
4/1/2000 | 3 |
5/1/2000 | 4 |
6/1/2000 | 5 |
7/1/2000 | 2 |
Dataset 2:
Date | value |
---|---|
2/1/2000 | 5 |
3/1/2000 | 7 |
5/1/2000 | 2 |
7/1/2000 | 9 |
Expected outcome:
Date | value |
---|---|
1/1/2000 | 1 |
2/1/2000 | 6 |
3/1/2000 | 9 |
4/1/2000 | 3 |
5/1/2000 | 6 |
6/1/2000 | 5 |
7/1/2000 | 11 |
Update - My solution
Personally, I prefer to `rbind` the two data frames and then use `group_by` and `summarise` to sum `value` for each `Date`.
The code is as below:
# The two datasets are named 'a' and 'b' respectively.
library(tidyverse)
combined <- rbind(a, b)  # stack the rows of both data frames
combined %>%
  group_by(Date) %>%
  summarise(value = sum(value))
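For anyone who wants to run this end to end, here is a minimal sketch using the data from the question. It assumes `Date` is stored as a character column, and the name `result` is only for illustration.

```r
library(dplyr)

# Data copied from the question; Date is kept as character here (an assumption).
a <- data.frame(
  Date  = c("1/1/2000", "2/1/2000", "3/1/2000", "4/1/2000",
            "5/1/2000", "6/1/2000", "7/1/2000"),
  value = c(1, 1, 2, 3, 4, 5, 2)
)
b <- data.frame(
  Date  = c("2/1/2000", "3/1/2000", "5/1/2000", "7/1/2000"),
  value = c(5, 7, 2, 9)
)

# Stack the rows, then sum value within each Date.
result <- rbind(a, b) %>%
  group_by(Date) %>%
  summarise(value = sum(value))

print(result)
```

Note that character dates are sorted alphabetically, so with more data "10/1/2000" would come before "2/1/2000". Converting the column with `as.Date()` first avoids that, but the right `format` string depends on whether the dates are day/month or month/day.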
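The safest option would be a `powerjoin`. A full join keeps the dates that appear in only one dataset (an inner join would drop them), and a row-wise conflict function sums the two `value` columns while ignoring missing values:
library(powerjoin)
power_full_join(
  df1, df2,
  by = "Date",
  conflict = rw ~ sum(.x, .y, na.rm = TRUE)
)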
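If you would rather not add a dependency, roughly the same join can be sketched in base R with `merge()`. This is a different technique from powerjoin, shown only for comparison; it uses the data from the question, and the names `merged` and `result` are illustrative.

```r
# Data copied from the question.
df1 <- data.frame(
  Date  = c("1/1/2000", "2/1/2000", "3/1/2000", "4/1/2000",
            "5/1/2000", "6/1/2000", "7/1/2000"),
  value = c(1, 1, 2, 3, 4, 5, 2)
)
df2 <- data.frame(
  Date  = c("2/1/2000", "3/1/2000", "5/1/2000", "7/1/2000"),
  value = c(5, 7, 2, 9)
)

# all = TRUE keeps dates found in either data frame (a full join).
# The clashing value columns come back as value.x and value.y.
merged <- merge(df1, df2, by = "Date", all = TRUE)

# Sum across the two columns; na.rm = TRUE counts a missing date as 0.
merged$value <- unname(rowSums(merged[, c("value.x", "value.y")], na.rm = TRUE))
result <- merged[, c("Date", "value")]

print(result)
```

The key point is `all = TRUE`: without it, `merge()` does an inner join and the dates that exist only in `df1` (1/1, 4/1 and 6/1) disappear from the result.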
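But here, since every date in `df2` also appears in `df1`, a simple `match` should suffice as well, as long as the dates without a match (which come back as `NA`) are counted as 0:
idx <- match(df1$Date, df2$Date)  # NA where a date is absent from df2
df1$value <- df1$value + ifelse(is.na(idx), 0, df2$value[idx])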
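One caveat with the `match` route: it only walks over the rows of `df1`, so any date that exists only in `df2` would be silently ignored. A small sketch of a guard for that, using the data from the question (the `stopifnot()` check and the helper names `idx` and `add` are my own additions):

```r
# Data copied from the question.
df1 <- data.frame(
  Date  = c("1/1/2000", "2/1/2000", "3/1/2000", "4/1/2000",
            "5/1/2000", "6/1/2000", "7/1/2000"),
  value = c(1, 1, 2, 3, 4, 5, 2)
)
df2 <- data.frame(
  Date  = c("2/1/2000", "3/1/2000", "5/1/2000", "7/1/2000"),
  value = c(5, 7, 2, 9)
)

# Guard: the match approach is only valid if df2 has no dates missing from df1.
stopifnot(all(df2$Date %in% df1$Date))

# Position of each df1 date in df2; NA where df2 has no such date.
idx <- match(df1$Date, df2$Date)

# Look up the df2 values and treat unmatched dates as 0.
add <- df2$value[idx]
add[is.na(add)] <- 0

df1$value <- df1$value + add
print(df1)
```

If the guard fails, a full join or the `rbind` plus `group_by` approach is the better fit, since both keep dates from either side.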