I have run into this issue and have no idea how to solve it. I have two data.frames, both with a date-time column. The first, which is a large object, contains a measurement every 3 seconds, while the second contains a measurement every 10 minutes. I want to add the measurement variable of the second object to the first (something like a left_join or merge) by the date-time variable. My data looks like this (df1):
date_time | measurement1 |
---|---|
yyyy-mm-dd HH:MM:03 | val1 |
yyyy-mm-dd HH:MM:06 | val2 |
df2:
date_time | measurement2 |
---|---|
yyyy-mm-dd HH:10:00 | val1 |
yyyy-mm-dd HH:20:00 | val2 |
I hope that is enough information; otherwise please comment. I have explored foverlaps and fuzzyjoin, but without success.
Thank you in advance
Here is what I have in a bit more detail (df1):
date_time | measurement1 |
---|---|
05/06/2018 0:00:03 | 73 |
05/06/2018 0:00:06 | 73.5 |
05/06/2018 0:00:09 | 48.5 |
05/06/2018 0:00:12 | 50.7 |
05/06/2018 0:00:15 | 80 |
05/06/2018 0:00:18 | 81 |
The data continue like this, every 3 seconds, for a number of months.
df2:
date_time | measurement2 |
---|---|
05/06/2018 0:00:00 | 110 |
05/06/2018 0:10:00 | 120 |
05/06/2018 0:20:00 | 180 |
What I want is this:
df:
date_time | measurement1 | measurement2 |
---|---|---|
05/06/2018 0:00:03 | 73 | 110 |
05/06/2018 0:00:06 | 73.5 | 110 |
05/06/2018 0:00:09 | 48.5 | 110 |
05/06/2018 0:00:12 | 50.7 | 110 |
05/06/2018 0:00:15 | 80 | 110 |
05/06/2018 0:00:18 | 81 | 110 |
I hope this is clearer now. By the way, there might be an issue with the tables: I am using the format Stack Overflow recommends and the tables render in the preview, but the formatting is lost when I submit.
Thank you
If observations occur every 3 seconds, every minute has 20 observations, so every 10-minute interval has 200. If your data is complete, it suffices to stretch out the 10-minute-interval observations of your second data frame accordingly, i.e. repeat every 10-minute-interval value 200 times alongside the 3-second-interval values. Note that this only works if df1 has exactly 200 rows for every row of df2; otherwise the assignment errors or the values end up misaligned.
Try the following and tell me what you get:
df1$measurement2 <- rep(df2$measurement2, each = 200)
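
If the data turns out not to be complete (gaps, or a series that does not start exactly on a 10-minute boundary), a lookup by time may be safer than repeating values by position. Below is a minimal base R sketch using `findInterval()`. It assumes both `date_time` columns have already been converted to `POSIXct`; the example data are built by hand to mirror the tables in the question, and the `UTC` time zone and the helper name `start` are my own assumptions.

```r
# Example data mirroring the question; date_time is assumed to be POSIXct already.
start <- as.POSIXct("2018-06-05 00:00:00", tz = "UTC")
df1 <- data.frame(
  date_time    = start + seq(3, 18, by = 3),   # every 3 seconds
  measurement1 = c(73, 73.5, 48.5, 50.7, 80, 81)
)
df2 <- data.frame(
  date_time    = start + c(0, 600, 1200),      # every 10 minutes
  measurement2 = c(110, 120, 180)
)

# findInterval() requires the lookup vector to be sorted, so order df2 by time.
df2 <- df2[order(df2$date_time), ]

# For each df1 timestamp, find the index of the latest df2 timestamp that is
# less than or equal to it (0 means df1 starts before the first df2 row).
idx <- findInterval(as.numeric(df1$date_time), as.numeric(df2$date_time))
idx[idx == 0] <- NA

# Pull the matching 10-minute value into df1; unmatched rows become NA.
df1$measurement2 <- df2$measurement2[idx]
df1
```

With this approach a 3-second row whose timestamp falls exactly on a 10-minute mark takes the value of that mark, and rows before the first df2 timestamp get `NA` rather than a shifted value.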