I need to query rows out of a single table based on a comparison of multiple columns and multiple dates and group the results by a specific column.
The table has this basic structure:
engine | read_date | cyl1_read | cyl2_read | cyl1_last_maint_date | cyl2_last_maint_date |
---|---|---|---|---|---|
1 | 06/31/2023 | 25 | 21 | 07/15/2023 | 05/25/2023 |
1 | 11/01/2023 | 27 | 19 | 07/15/2023 | 05/25/2023 |
2 | 04/11/2023 | 37 | 18 | 05/15/2023 | 05/25/2023 |
2 | 08/27/2023 | 19 | 12 | 06/01/2023 |
For each engine, I need to query the cyl1_read
and cyl2_read
values that were read after the most recent corresponding cyl last maintenance date, if the corresponding cyl last maint date exist, otherwise get all the cyl read values.
So, for this example the result would be:
engine | read_date | cyl1_read | cyl2_read |
---|---|---|---|
1 | 06/31/2023 | 21 | |
1 | 11/01/2023 | 27 | 19 |
2 | 08/27/2023 | 19 | 12 |
Since the cyl1_last_maint_date
for engine 1 was on 7/15/2023, the cyl1_read
that was read on 06/31/2023 would not be included, but the cyl2_read
value that was read on 6/31/2023 would be included because the cyl2_last_maint_date
(05/25/2023) was before the read date. The engine 2 values captured on 04/11/2023 would not be included because they were read before the last maint dates for both cyl1 and cyl2. The last engine 2 values read on 08/27/2023 would be included because there is no last maint date for cyl1 and the cyl2 last maint date is before the cyl2 last read date.
How can I accomplish this SQL query? I am running SQL Server 2016
I think this is easiest if you take a 2-step process
It is possible to do this in one step, but I think it's much cleaner and clearer to do it in two.
Assuming the data is in the table #EngineData, The following SQL fragment only includes non-future reads
SELECT engine, read_date,
CASE WHEN cyl1_last_maint_date <= read_date OR cyl1_last_maint_date IS NULL THEN cyl1_read ELSE NULL END AS cyl1_last_read,
CASE WHEN cyl2_last_maint_date <= read_date OR cyl2_last_maint_date IS NULL THEN cyl2_read ELSE NULL END AS cyl2_last_read
FROM #EngineData;
Results
engine read_date cyl1_last_read cyl2_last_read
----------- ---------- -------------- --------------
1 2023-06-30 NULL 21
1 2023-11-01 27 19
2 2023-04-11 NULL NULL
2 2023-08-27 19 12
Then we can simply select from this where the last read for either cyl1 or cyl2 have a value (e.g., both not NULL). The initial step is in the CTE below, and the WHERE clause then removes the rows with no reads.
WITH PrevOnly_EngineData AS
(SELECT engine, read_date,
CASE WHEN cyl1_last_maint_date <= read_date OR cyl1_last_maint_date IS NULL THEN cyl1_read ELSE NULL END AS cyl1_last_read,
CASE WHEN cyl2_last_maint_date <= read_date OR cyl2_last_maint_date IS NULL THEN cyl2_read ELSE NULL END AS cyl2_last_read
FROM #EngineData
)
SELECT *
FROM PrevOnly_EngineData
WHERE cyl1_last_read IS NOT NULL
OR cyl2_last_read IS NOT NULL
ORDER BY engine, read_date;
Results
engine read_date cyl1_last_read cyl2_last_read
----------- ---------- -------------- --------------
1 2023-06-30 NULL 21
1 2023-11-01 27 19
2 2023-08-27 19 12
You can see these on the following db<>fiddle. (Note - I replaced date 2023-06-31 with 2023-06-30).