I'm trying to join two tables that have check date in both. The results should show only if the check date in the first table is equal to the check date in the second table. OR if the check date in both tables is within 10 days of each other.
I have two tables:
t1 table: company_id, expected_checkdate
t2 table: company_id, actual_checkdate
Condtion 4: if acutal_checkdate = expected_checkdate then don't check for others that are within 10 days
LEFT OUTER JOIN t2 ON t1.actual_checkdate = t2.expected_checkdate
OR t1.actual_checkdate
BETWEEN DATE_ADD(t2.expected_checkdate, INTERVAL -10 DAY)
AND DATE_ADD(t2.expected_checkdate, INTERVAL 10 DAY)
AND t1.company_id = t2.company_id
The problem is when I run this for one month. I'm seeing a lot of duplicates because there can be 2 actual_checkdate entries or 2 expected_checkdate in one month.
|---------------------|------------------|------------------|
| company_id | actual_checkdate |expected_checkdate|
|---------------------|------------------|------------------|
| 12 | 2018-01-05 | 2018-01-05 |
|---------------------|------------------|------------------|
| 12 | 2018-01-19 | 2018-01-19 |
|---------------------|------------------|------------------|
| 12 | 2018-01-05 | 2018-01-19 | -- incorrect
|---------------------|------------------|------------------|
| 12 | 2018-01-19 | 2018-01-05 | -- incorrect
|---------------------|------------------|------------------|
| 13 | 2018-01-12 | 2018-01-20 |
|---------------------|------------------|------------------|
| 14 | 2018-01-26 | 2018-01-36 |
|---------------------|------------------|------------------|
The first two and last two rows are correct. The third and fourth rows shouldn't display because they are duplicates of the first and second row. Please help me with the join above.
You could rewrite your JOIN
condition as:
FROM t1
JOIN t2 ON t1.company_id = t2.company_id AND
(t1.actual_checkdate = t2.expected_checkdate OR
t1.actual_checkdate BETWEEN
DATE_SUB(t2.expected_checkdate, INTERVAL 10 DAY)
AND DATE_ADD(t2.expected_checkdate, INTERVAL 10 DAY)
AND NOT EXISTS (SELECT * FROM t1 WHERE t1.actual_checkdate = t2.expected_checkdate AND t1.company_id = t2.company_id)
)
The NOT EXISTS clause will prevent matching on a nearby date if an actual match exists.