Problem: need to add values from one dataframe to another depending on the time window in which each row occurs.
I have one dataframe with a list of singular events like this:
Ind Date Time Event
1 FAU 15/11/2016 06:40:43 A
2 POR 15/11/2016 12:26:51 V
3 POR 15/11/2016 14:52:53 B
4 MAM 20/11/2016 08:12:19 G
5 SUR 03/12/2016 13:51:18 A
6 SUR 14/12/2016 07:47:06 V
And a second data frame with ongoing, continuous events linked like this:
Date Time Event
1 15/11/2016 06:56:48 1
2 15/11/2016 06:59:40 2
3 15/11/2016 07:27:36 3
4 15/11/2016 07:29:10 4
5 15/11/2016 07:34:51 5
6 15/11/2016 07:35:10 6
7 15/11/2016 07:37:19 7
8 15/11/2016 07:39:55 8
9 15/11/2016 07:51:59 9
10 15/11/2016 08:00:13 10
11 15/11/2016 08:08:01 11
12 15/11/2016 08:13:21 12
13 15/11/2016 08:16:21 13
14 15/11/2016 12:14:48 14
15 15/11/2016 12:16:58 15
16 15/11/2016 12:51:22 16
17 15/11/2016 12:52:09 17
18 15/11/2016 13:26:29 18
19 15/11/2016 13:26:55 19
20 15/11/2016 13:34:14 20
21 15/11/2016 13:50:41 21
22 15/11/2016 13:53:25 22
23 15/11/2016 14:15:17 23
24 15/11/2016 14:54:49 24
Question: how can I combine these so that for the singular events we can see during which continuous events they occurred, for example, something like this:
Ind Date Time Eventx Eventy
1 FAU 15/11/2017 06:40:43 A 1
2 POR 15/11/2017 12:26:51 V 15
3 POR 15/11/2017 14:52:53 B 23
Many thanks
I can provide you with a data.table
solution. The only issue is that I had to move the start of the first event in the second dataframe to an earlier date, since it was after the starting time of the first event of the first dataframe.
You'll need the additional packages data.table
and lubridate
.
library(data.table)
library(lubridate)
dt1 <- data.table(df1)
dt2 <- data.table(df2)
dt1[, Date.Time := as.POSIXct(strptime(paste(Date, Time, sep = " "), "%d/%m/%Y %H:%M:%S"))]
dt2[, Date.Time := as.POSIXct(strptime(paste(Date, Time, sep = " "), "%d/%m/%Y %H:%M:%S"))]
# Create the start and end time columns in the second data.table
dt2[, `:=`(Start.Time = Date.Time
, End.Time = shift(Date.Time, n = 1L, fill = NA, type = "lead"))]
# Change the start date to an earlier one
dt2[Event == 1,`:=`(Start.Time = Start.Time - days(1)) ]
# Merge on multiple conditions and the selection of the relevant columns
dt2[dt1, on=.(Start.Time < Date.Time
, End.Time > Date.Time)
, nomatch = 0L][,.(Ind
, Date
, Time
, Eventx = i.Event
, Eventy = Event)]
# Output of the last merge
Ind Date Time Eventx Eventy
1: FAU 15/11/2016 06:56:48 A 1
2: POR 15/11/2016 12:16:58 V 15
3: POR 15/11/2016 14:15:17 B 23