I have two tables Response
and Distributions
:
Table structure with expected output
In this case, there are multiple responses but 1 distribution. We need to tie just one latest response before the assignment date to the distribution, basically
MAX(COALESCE(RESPONSE_DATE, CREATED_DATE)) <= ASSIGNMENT_DATE
The SQL query I tried:
SELECT
resp.CONTACT_ID, resp.RESPONSE_ID, resp.RESPONSE_DATE,
resp.CREATED_DATE, d.ASSIGNMENT_DATE AS DISTRIBUTION_DATE
FROM
Response resp
LEFT JOIN
Distribution d ON resp.CONTACT_ID = d.CONTACT_ID
-- 12 Hour Grace Period For assignments created before response
AND DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE)) <= d.ASSIGNMENT_DATE
This query returns DISTRIBUTION_DATE
as 2020-10-28 for first two rows when.
This condition
DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE)) <= ASSIGNMENT_DATE
is satisfied (see SQL query returning wrong output table in the attached screenshot).
However, I want DISTRIBUTION_DATE
as "2020-10-28" only for the second row as shown in the expected output. The reason is the latest response just before the assignment date will get distributed and I don't care about initial X responses (We should tie one distribution to only one latest response)
I tried to use
MAX(DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE))) <= ASSIGNMENT_DATE
in the JOIN ON
condition but that doesn't work in SQL.
Please let me know how to structure the query and get the expected output.
NOTE: The join from Response
to Distribution
has to be on CONTACT_ID
, there is no explicit JOIN_KEY
and it can lead to 1:M joins as we have same CONTACT_ID
(that's the reason a lot of filtering is done in join ON condition), ideal scenario is to have RESPONSE_ID
on Distribution
table as well but that's not how the data is structured.
TIA
If I understand what you require, you'll need to run a query like this to get the expected results:
SELECT CONTACT_ID, RESPONSE_ID, RESPONSE_DATE, CREATED_DATE, CASE WHEN RANKING = 1 THEN DISTRIBUTION_DATE ELSE NULL END AS DISTRIBUTION_DATE
FROM (
SELECT
resp.CONTACT_ID, resp.RESPONSE_ID, resp.RESPONSE_DATE,
resp.CREATED_DATE, d.ASSIGNMENT_DATE AS DISTRIBUTION_DATE,
DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE)) AS GRACE_DATE,
RANK() OVER (
PARTITION BY resp.CONTACT_ID
ORDER BY
CASE
WHEN DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE)) <= d.ASSIGNMENT_DATE
THEN DATEADD(hour, -12, COALESCE(resp.RESPONSE_DATE, resp.CREATED_DATE))
ELSE
'19000101'
END
DESC ) AS RANKING
FROM
Response resp
LEFT JOIN
[Distribution] d ON resp.CONTACT_ID = d.CONTACT_ID
) DerivedTable
ORDER BY CONTACT_ID, RESPONSE_ID
Basically, I use a rank to workout the closest date based on the assignment date, and then know that the ranking where it is 1 should be the one to show the distribution date.