I have inherited a piece of SQL that I am working on for a reporting system. The system is centered around Purchase Orders, and when they are Created, Transmitted, Acknowledged, and Confirmed. These states are a progressive state. The piece of SQL that I am reviewing is shown below and is from the WHERE clause:
OR (CreateJob.endtime is NULL and DATEDIFF(hour, buy_date, getdate()) > Vendor.expected_create_hours)
OR (TransmitJob.endtime is NULL and DATEDIFF(hour, CreateJob.endtime, getdate()) > Vendor.expected_transmit_hours)
OR (AcknowledgeJob.endtime is NULL and DATEDIFF(hour, TransmitJob.endtime, getdate()) > Vendor.expected_acknowledge_hours)
OR (ConfirmJob.endtime is NULL and DATEDIFF(hour, AcknowledgeJob.endtime, getdate()) > Vendor.expected_confirm_hours)
What I have found is that it is possible for a CreateJob to not have an end time because the job failed. However a job can execute more then one Purchase Order, so there are times when a given PO actually was created, but the job does not receive an end time because a sibling later failed. This creates a scenario where the states still progress for a PO, but it still shows up on this problem report because the CreateJob.endtime is NULL.
So first there are some obvious bugs with the system and architecture above, but that is a separate issue from what I am tackling at the moment.
It seems that I could fix the report though, by basically adding the progressive checks as AND statements. So the first check for the CreateJob changes too:
( (CreateJob.endtime is NULL and DATEDIFF(hour, buy_date, getdate()) > Vendor.expected_create_hours)
AND (TransmitJob.endtime is NULL and DATEDIFF(hour, CreateJob.endtime, getdate()) > Vendor.expected_transmit_hours)
AND (AcknowledgeJob.endtime is NULL and DATEDIFF(hour, TransmitJob.endtime, getdate()) > Vendor.expected_acknowledge_hours)
AND (ConfirmJob.endtime is NULL and DATEDIFF(hour, AcknowledgeJob.endtime, getdate()) > Vendor.expected_confirm_hours))
Which is ugly and begins to obfuscate things into oblivion. Is there some way I can do the equivalent of the following in SQL? Basically some kind of macro or #define system?
CreateFailed = (CreateJob.endtime is NULL and DATEDIFF(hour, buy_date, getdate()) > Vendor.expected_create_hours)
TransmitFailed = (TransmitJob.endtime is NULL and DATEDIFF(hour, CreateJob.endtime, getdate()) > Vendor.expected_transmit_hours)
AcknowledgeFailed = (AcknowledgeJob.endtime is NULL and DATEDIFF(hour, TransmitJob.endtime, getdate()) > Vendor.expected_acknowledge_hours)
ConfirmFailed = (ConfirmJob.endtime is NULL and DATEDIFF(hour, AcknowledgeJob.endtime, getdate()) > Vendor.expected_confirm_hours)
OR (CreateFailed AND TransmitFailed AND AcknowledgeFailed AND ConfirmFailed)
OR (TransmitFailed AND AcknowledgeFailed AND ConfirmFailed)
OR (AcknowledgeFailed AND ConfirmFailed)
OR (ConfirmFailed)
If there is a common term or name for what I am trying to do maybe someone could add it to the title?
[is] there is a common term or name for what I am trying to do...?
Abstraction?
You could hide each of your CreateFailed = ..., TransmitFailed = ...
, etc logic into its own VIEW
or CTE, then your main query could merely see if a Job exists in one of these tables e.g. (pseudo SQL):
SELECT Jobs.job_ID, ...
FROM Jobs
WHERE EXISTS (
SELECT *
FROM CreatedFailedJobs AS T1
WHERE Jobs.job_ID = T1.job_ID
)
OR EXISTS (
SELECT *
FROM TransmitFailedJobs AS T1
WHERE Jobs.job_ID = T1.job_ID
)
OR EXISTS (
SELECT *
FROM AcknowledgeFailedJobs AS T1
WHERE Jobs.job_ID = T1.job_ID
)
OR EXISTS (
SELECT *
FROM ConfirmFailedJobs AS T1
WHERE Jobs.job_ID = T1.job_ID
);
Here's the CTE idea expanded (pseudo SQL):
WITH CreateFailedJobs (job_ID, ...)
AS
(
SELECT Jobs.job_ID, ...
FROM ...
WHERE CreateJob.endtime is NULL
AND DATEDIFF(hour, .SomeTable.buy_date, getdate())
> Vendor.expected_create_hours
),
TransmitFailedJobs (job_ID, ...)
AS
(
SELECT Jobs.job_ID, ...
FROM ...
WHERE TransmitJob.endtime is NULL
AND DATEDIFF(hour, CreateJob.endtime, getdate())
> Vendor.expected_transmit_hours
),
AcknowledgeFailedJobs (job_ID, ...)
AS
(
SELECT Jobs.job_ID, ...
FROM ...
WHERE AcknowledgeJob.endtime is NULL
AND DATEDIFF(hour, TransmitJob.endtime, getdate())
> Vendor.expected_acknowledge_hours
),
ConfirmFailedJobs (job_ID, ...)
AS
(
SELECT Jobs.job_ID, ...
FROM ...
WHERE ConfirmJob.endtime is NULL
AND DATEDIFF(hour, AcknowledgeJob.endtime, getdate())
> Vendor.expected_confirm_hours)
)
SELECT Jobs.job_ID, ...
FROM Jobs
WHERE EXISTS (
SELECT *
FROM CreatedFailedJobs AS T1
WHERE Jobs.job_ID = T1.job_ID
)
OR EXISTS (
SELECT *
FROM TransmitFailedJobs AS T1
WHERE Jobs.job_ID = T1.job_ID
)
OR EXISTS (
SELECT *
FROM AcknowledgeFailedJobs AS T1
WHERE Jobs.job_ID = T1.job_ID
)
OR EXISTS (
SELECT *
FROM ConfirmFailedJobs AS T1
WHERE Jobs.job_ID = T1.job_ID
);
I'm not sure that this would be good for performance but readability is the aim and anyhow this report can be run offline.