I have two tables:
TableA
ID Gender BeginDate
034446 F 2016-01-15T00:00:00
034446 F 2020-02-17T00:00:00
035689 F 2016-01-14T00:00:00
035679 F 2016-01-18T00:00:00
045687 F 2020-05-21T00:00:00
TableB
ID Gender Date
34446 F 2016-01-14
35689 F 2016-01-14
35679 F 2016-01-18
I'm trying to figure out how many records(and which ones)from TableB do not match up with Table A as well as how many do. Both tables have duplicate ID's which is why I need to also use the Date field to match up records. The date field from TableB might be a day or two off between tables. For example, the first row of Table B should match with the first row of Table A, not the second row. There needs to be a statement saying TableA date field is either equal to or between a day or two from the date field of TableB. I have attempted to write a query below but have the dates set equal to eachother.
SELECT a.ID, CONCAT('0',CAST(b.ID AS STRING)), EXTRACT(DATE FROM a.BeginDate) AS date
FROM `dev.tableA` a
LEFT OUTER JOIN `dev.TableB` b
ON a.ID = CONCAT('0',CAST(b.ID AS STRING))
AND EXTRACT(DATE FROM a.BeginDate) = b.Date
If I follow you correctly, you can use exists
:
select b.*
from tableb b
where not exists (
select 1
from tablea a
where a.id = b.id
and a.begindate >= date_sub(b.date, interval 2 day)
and a.begindate <= date_add(b.date, interval 2 day)
)
This brings records of b
for which no match exists in a
with the same id
and a date that falls within +/- two days. You can add the casting on id
, if that's really necessary, and adjust the boundaries as needed.
If you want an overall record count, you can just replace b.*
with count(*)
.