I have two columns where in the first row (inspection), there are some violations that are spotted and noted... can be 15+ different violations.
Then on every subsequent inspection, there can be some violations that are completed (those that are resolved can NOT show again as unresolved). Number can vary and can even be that inspection didn't see any of remaining violations resolved so for that one it would be an empty row but still has to be counted.
Inspection | Date | Violation |
---|---|---|
1 | 6/1/2024 | 100 |
1 | 6/1/2024 | 101 |
1 | 6/1/2024 | 102 |
1 | 6/1/2024 | 103 |
2 | 6/2/2024 | 100 |
2 | 6/2/2024 | 101 |
3 | 6/7/2024 | NULL |
4 | 6/8/2024 | 103 |
5 | 6/9/2024 | 102 |
What I need as a result are unresolved violations after each inspection ...
I need this in T-SQL. I know how to use LAG
, LEAD
, FIRST_VALUE
etc. Saying that as I tried to resolve it via window funcs. I have created func to get difference between the values (like violations remaining - violations in inspection) but really what I can't figure is how to move past 2nd inspection
Those are aggregates of the tables beneath and if you have some kind of advice I can go and change queries that bring data in this format. Only thing is that for an inspection I do not have recorded answers for each of violations spotted on first but only one that is cleared/resolved
Thanks in advance
SELECT aggdata.Inspection
, RemainingViolations =
CASE
WHEN aggdata.Inspection = 1 THEN aggdata.AggregatedViolations
ELSE dbo.udf_GetDifferenceInValues(FIRST_VALUE(aggdata.AggregatedViolations) OVER (ORDER BY aggdata.Inspection), aggdata.AggregatedViolations)
END
FROM (
SELECT SimpleData.Inspection
, AggregatedViolations = STRING_AGG(SimpleData.Violation, ',')
FROM (VALUES (1, '6/1/2024', 100),
(1, '6/1/2024', 101),
(1, '6/1/2024', 102),
(1, '6/1/2024', 103),
(2, '6/1/2024', 100),
(2, '6/1/2024', 101),
(3, '6/1/2024', NULL),
(4, '6/1/2024', 103),
(5, '6/1/2024', 102)
) AS
SimpleData (Inspection, DateInspection, Violation)
GROUP BY SimpleData.Inspection
) aggdata
ALTER FUNCTION udf_GetDifferenceInValues
(
-- Add the parameters for the function here
@String1 VARCHAR(4000)
, @String2 VARCHAR(4000)
)
RETURNS VARCHAR(4000)
AS
BEGIN
-- Declare the return variable here
DECLARE @Result VARCHAR(4000)
-- Add the T-SQL statements to compute the return value here
SELECT @Result = STRING_AGG(Value, ',')
FROM dbo.dbSplit(@String1, ',')
WHERE Value NOT IN (
SELECT Value
FROM dbo.dbSplit(@String2, ',')
)
-- Return the result of the function
RETURN @Result
END
/* *** Test Data *** */
CREATE TABLE #t
(
Inspection int NOT NULL
,[Date] date NOT NULL
,Violation int NULL
);
INSERT INTO #t
VALUES (1, '20240601', 100)
,(1, '20240601', 101)
,(1, '20240601', 102)
,(1, '20240601', 103)
,(2, '20240602', 100)
,(2, '20240602', 101)
,(3, '20240607', NULL)
,(4, '20240608', 103)
,(5, '20240609', 102);
/* *** End Test Data *** */
WITH vCounts
AS
(
SELECT X.Inspection, T.Violation AS AViolation, T2.Violation
,COUNT(T2.Violation) OVER (PARTITION BY T.Violation ORDER BY X.Inspection ROWS UNBOUNDED PRECEDING) AS VCount
FROM #t T
JOIN
(
SELECT DISTINCT Inspection
FROM #t
) X (Inspection)
ON T.Inspection = 1
LEFT JOIN #t T2
ON X.Inspection = T2.Inspection
AND T.Violation = T2.Violation
)
,VGroups
AS
(
SELECT Inspection
,STRING_AGG(Violation, ',') WITHIN GROUP (ORDER BY AViolation) AS Violations
,STRING_AGG(IIF(Violation IS NULL AND VCount = 1, AViolation, NULL), ',') WITHIN GROUP (ORDER BY AViolation) AS ActiveViolations
FROM VCounts
GROUP BY Inspection
)
SELECT Inspection, Violations
,IIF(Inspection = 1, Violations, ActiveViolations) AS ActiveViolations
FROM VGroups
ORDER BY Inspection;