I have two datasets:
dataset_a
time_stamp user group value
2021-06-20 12:48:24.521 A video 1
2021-06-15 12:50:24.521 A video 1
2021-06-10 12:48:24.521 A video 1
dataset_b
time_stamp user group label
2021-06-20 09:40:24.521 A video BA
2021-06-19 13:30:24.521 A video BB
2021-06-13 12:48:24.521 A video BC
2021-06-09 12:55:24.521 A video BD
I want to create a dataset where if dataset b is within 1 day of timestamp of dataset a by timestamp, user, and group then it is a match. Has anyone done something like this before where it is something like left join on dataset_b.timestamp between dataset_a.timestamp and date_add(dataset_a.timestamp,-1)
. I'd like to have the flexibility where in the future I can test out -7 days as well so it is easily modifiable.
Expected output below:
dataset_a
time_stamp user group value timestamp_b label
2021-06-20 12:48:24.521 A video 0.5 2021-06-20 09:40:24.521 BA
2021-06-20 12:48:24.521 A video 0.5 2021-06-19 13:30:24.521 BB
2021-06-15 12:50:24.521 A video 1 NULL NULL
2021-06-10 12:48:24.521 A video 1 2021-06-09 12:55:24.521 BD
The JOIN
condition does not have to be only equality operator so:
SELECT *
FROM dataset_a
LEFT JOIN dataset_b
ON dataset_b.user = dataset_a.user
AND dataset_b.group = dataset_a.group
AND dataset_b.time_stamp BETWEEN dataset_a.time_stamp - INTERVAL '1 day'
AND dataset_a.time_stamp ;
is a valid join.