How to sum two unequal-length vectors by Date in R?


The two datasets have different numbers of rows but the same variables. I want to sum the "value" column of the two datasets by "Date".

Dataset 1:

Date value
1/1/2000 1
2/1/2000 1
3/1/2000 2
4/1/2000 3
5/1/2000 4
6/1/2000 5
7/1/2000 2

Dataset 2:

Date value
2/1/2000 5
3/1/2000 7
5/1/2000 2
7/1/2000 9

Expected outcome:

Date value
1/1/2000 1
2/1/2000 6
3/1/2000 9
4/1/2000 3
5/1/2000 6
6/1/2000 5
7/1/2000 11

Update - My solution

Personally, I prefer to rbind the two data frames and then use group_by and summarise to sum value for each Date.

The code is below:

# The two datasets are named 'a' and 'b' respectively.

library(tidyverse)

# Stack the rows of both datasets, then sum value within each Date.
combined <- rbind(a, b)

combined %>%
  group_by(Date) %>%
  summarise(value = sum(value))
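
For anyone who wants to try the stack-and-sum idea without installing anything, here is a minimal sketch on the sample data from the question. It swaps group_by/summarise for base R's aggregate, which is a substitution on my part rather than the code above, and the names combined and result are only illustrative.

```r
# Sample data copied from the question; 'a' and 'b' follow the naming above.
a <- data.frame(
  Date  = c("1/1/2000", "2/1/2000", "3/1/2000", "4/1/2000",
            "5/1/2000", "6/1/2000", "7/1/2000"),
  value = c(1, 1, 2, 3, 4, 5, 2)
)
b <- data.frame(
  Date  = c("2/1/2000", "3/1/2000", "5/1/2000", "7/1/2000"),
  value = c(5, 7, 2, 9)
)

# Stack the rows of both datasets into one long data frame.
combined <- rbind(a, b)

# Sum value within each Date (base R stand-in for group_by + summarise).
result <- aggregate(value ~ Date, data = combined, FUN = sum)
print(result)
```

Date is kept as text here, so the groups are ordered as strings. With a wider range of dates it would probably be worth converting the column with as.Date first; the question does not say whether the format is day/month or month/day, so that step is left out.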


Solution

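  • The safest option would be a powerjoin. A left join keeps every date in df1, and the conflict formula adds the two value columns, counting dates that are missing from df2 as 0:

    library(powerjoin)
    power_left_join(
      df1, df2,
      by = "Date",
      conflict = ~ .x + ifelse(is.na(.y), 0, .y)
    )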
    

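    But here, a simple match should suffice as well, as long as dates that are missing from df2 are counted as 0 rather than NA:

    add <- df2$value[match(df1$Date, df2$Date)]
    df1$value <- df1$value + ifelse(is.na(add), 0, add)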
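
    A rough, self-contained sketch of the match approach on the question's sample data follows. match returns NA for every df1 date that has no counterpart in df2, so those positions are set to 0 before adding. The helper names idx and add are only illustrative, and the sketch assumes that every date in df2 also appears in df1 and that dates are not repeated within df2.

```r
# Sample data copied from the question.
df1 <- data.frame(
  Date  = c("1/1/2000", "2/1/2000", "3/1/2000", "4/1/2000",
            "5/1/2000", "6/1/2000", "7/1/2000"),
  value = c(1, 1, 2, 3, 4, 5, 2)
)
df2 <- data.frame(
  Date  = c("2/1/2000", "3/1/2000", "5/1/2000", "7/1/2000"),
  value = c(5, 7, 2, 9)
)

# Position of each df1 date within df2; NA where df2 has no such date.
idx <- match(df1$Date, df2$Date)

# Values to add, with 0 substituted for dates that are missing from df2.
add <- df2$value[idx]
add[is.na(add)] <- 0

df1$value <- df1$value + add
print(df1)
```

    If df2 could contain dates that df1 lacks, those rows would be silently dropped by this approach, and stacking the two data frames and summing by Date would be the safer route.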