r, date, merge, intervals, lubridate

Join data objects by date but with different intervals


I have run into this issue and really have no idea how to solve it. I have two data.frames, both with date columns. The first one, which is a big object, contains measurements every 3 seconds, while the second contains measurements every 10 minutes. I want to add the measurement variable of object 2 to object 1 (something like a left_join or merge) by the date variable. My data look like this (df1):

date_time measurement1
yyyy-mm-dd HH:MM:03 val1
yyyy-mm-dd HH:MM:06 val2

df2:

date_time measurement2
yyyy-mm-dd HH:10:00 val1
yyyy-mm-dd HH:20:00 val2

I hope that is enough info; otherwise, please comment. I have explored foverlaps and fuzzyjoin, but without success.

Thank you in advance

Here is what I have in a bit more detail (df1):

date_time measurement1
05/06/2018 0:00:03 73
05/06/2018 0:00:06 73.5
05/06/2018 0:00:09 48.5
05/06/2018 0:00:12 50.7
05/06/2018 0:00:15 80
05/06/2018 0:00:18 81

The data continue like this, one row every 3 seconds, for a number of months.

df2:

date_time measurement2
05/06/2018 0:00:00 110
05/06/2018 0:10:00 120
05/06/2018 0:20:00 180

What I want is this:

df:

date_time measurement1 measurement2
05/06/2018 0:00:03 73 110
05/06/2018 0:00:06 73.5 110
05/06/2018 0:00:09 48.5 110
05/06/2018 0:00:12 50.7 110
05/06/2018 0:00:15 80 110
05/06/2018 0:00:18 81 110

I hope it is clearer now. By the way, there might be an issue with the tables: I am using the format recommended by Stack Overflow and I can see the tables rendered in the preview, but the formatting is lost when I submit.

Thank you


Solution

  • At one observation every 3 seconds, every minute has 20 observations, so every 10-minute interval has 200. If your data are complete (no missing rows, so that nrow(df1) equals 200 * nrow(df2)), it is enough to stretch out the 10-minute observations of the second data frame accordingly, i.e. repeat every 10-minute value 200 times alongside the 3-second values.

    Try the following and tell me what you get:

    df1$measurement2 <- rep(df2$measurement2, each = 200)
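If the data have gaps, or the row counts do not line up exactly, the rep() approach above will misalign or fail. A possible alternative, shown here only as a minimal sketch, is to look up for each df1 row the most recent df2 timestamp with base R's findInterval(). The toy data below are made up to mirror the question, the day/month order in the format string and the UTC time zone are assumptions, and the choice that a df1 row falling exactly on a df2 timestamp takes that timestamp's value (rather than the previous one, as rep() would give for the 0:10:00 row) is also an assumption about what is wanted.

```r
# Toy data mirroring the question (values are illustrative only).
# Assumption: dates are day/month/year; swap %d and %m if they are not.
fmt <- "%d/%m/%Y %H:%M:%S"

df1 <- data.frame(
  date_time = as.POSIXct(
    c("05/06/2018 0:00:03", "05/06/2018 0:00:06", "05/06/2018 0:09:57",
      "05/06/2018 0:10:00", "05/06/2018 0:20:03"),
    format = fmt, tz = "UTC"
  ),
  measurement1 = c(73, 73.5, 48.5, 50.7, 80)
)

df2 <- data.frame(
  date_time = as.POSIXct(
    c("05/06/2018 0:00:00", "05/06/2018 0:10:00", "05/06/2018 0:20:00"),
    format = fmt, tz = "UTC"
  ),
  measurement2 = c(110, 120, 180)
)

# findInterval() needs the lookup vector sorted in increasing order.
df2 <- df2[order(df2$date_time), ]

# For each df1 row: index of the latest df2 timestamp that is <= the row's time.
idx <- findInterval(as.numeric(df1$date_time), as.numeric(df2$date_time))

# An index of 0 means the row precedes every df2 timestamp; leave those as NA.
idx[idx == 0] <- NA

df1$measurement2 <- df2$measurement2[idx]
df1
```

Because the lookup works on the timestamps themselves, it does not require exactly 200 rows per interval, and it tolerates missing 3-second rows in df1.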