I'm writing a query designed to check records that exist in 6 different tables for a specific value and, if it's null, indicate that the value is missing (meaning NULL).
However, what I'm struggling with is in the case that the record is missing more than one of the values being checked. Since there is a chance that any amount up to the max of the 5 checked values is missing, I need to be able to output the appropriate amount of flags based on each record and what it's missing. Everything I've seen around can get close in various ways but either feels counterintuitive, clunky, or more complex than it feels like it should be.
Here's a baseline for what I've got so far:
-- INIT database
CREATE TABLE Table0 ([UniqID] INT);
CREATE TABLE Table1 ([UniqID] INT, [Value1] VARCHAR(50));
CREATE TABLE Table2 ([UniqID] INT, [Value2] VARCHAR(50));
CREATE TABLE Table3 ([UniqID] INT, [Value3] VARCHAR(50));
CREATE TABLE Table4 ([UniqID] INT, [Value4] VARCHAR(50));
CREATE TABLE Table5 ([UniqID] INT, [Value5] VARCHAR(50));
-- ADD data
INSERT INTO Table0([UniqID]) VALUES ('1001');
INSERT INTO Table0([UniqID]) VALUES ('1002');
INSERT INTO Table0([UniqID]) VALUES ('1003');
INSERT INTO Table0([UniqID]) VALUES ('1004');
INSERT INTO Table1([UniqID], [Value1]) VALUES ('1001', NULL);
INSERT INTO Table1([UniqID], [Value1]) VALUES ('1002', NULL);
INSERT INTO Table1([UniqID], [Value1]) VALUES ('1003', 'Record3');
INSERT INTO Table1([UniqID], [Value1]) VALUES ('1004', NULL);
INSERT INTO Table2([UniqID], [Value2]) VALUES ('1001', NULL);
INSERT INTO Table2([UniqID], [Value2]) VALUES ('1002', 'Record2');
INSERT INTO Table2([UniqID], [Value2]) VALUES ('1003', 'Record3');
INSERT INTO Table2([UniqID], [Value2]) VALUES ('1004', 'Record4');
INSERT INTO Table3([UniqID], [Value3]) VALUES ('1001', NULL);
INSERT INTO Table3([UniqID], [Value3]) VALUES ('1002', 'Record2');
INSERT INTO Table3([UniqID], [Value3]) VALUES ('1003', 'Record3');
INSERT INTO Table3([UniqID], [Value3]) VALUES ('1004', 'Record4');
INSERT INTO Table4([UniqID], [Value4]) VALUES ('1001', 'Record1');
INSERT INTO Table4([UniqID], [Value4]) VALUES ('1002', NULL);
INSERT INTO Table4([UniqID], [Value4]) VALUES ('1003', 'Record3');
INSERT INTO Table4([UniqID], [Value4]) VALUES ('1004', NULL);
INSERT INTO Table5([UniqID], [Value5]) VALUES ('1001', 'Record1');
INSERT INTO Table5([UniqID], [Value5]) VALUES ('1002', 'Record2');
INSERT INTO Table5([UniqID], [Value5]) VALUES ('1003', NULL);
INSERT INTO Table5([UniqID], [Value5]) VALUES ('1004', NULL);
-- Main query
SELECT
Table0.[UniqID],
STRING_AGG(CASE WHEN Table1.[Value1] IS NULL THEN 'Flag 1' ELSE '' END
CASE WHEN Table2.[Value2] IS NULL THEN 'Flag 2' ELSE '' END
CASE WHEN Table3.[Value3] IS NULL THEN 'Flag 3' ELSE '' END
CASE WHEN Table4.[Value4] IS NULL THEN 'Flag 4' ELSE '' END
CASE WHEN Table5.[Value5] IS NULL THEN 'Flag 5' ELSE '' END, ', ') AS 'Reason(s) for Inclusion'
FROM
Table0
-- JOIN section
LEFT JOIN
Table1 ON Table0.[UniqID] = Table1.[UniqID]
LEFT JOIN
Table2 ON Table0.[UniqID] = Table2.[UniqID]
LEFT JOIN
Table3 ON Table0.[UniqID] = Table3.[UniqID]
LEFT JOIN
Table4 ON Table0.[UniqID] = Table4.[UniqID]
LEFT JOIN
Table5 ON Table0.[UniqID] = Table5.[UniqID]
-- Filtering to pull where NULLs may exist
WHERE
Table1.[Value1] IS NULL
OR Table2.[Value2] IS NULL
OR Table3.[Value3] IS NULL
OR Table4.[Value4] IS NULL
OR Table5.[Value5] IS NULL
The desired output should be something like this:
UniqID Reason(s) for Inclusion
--------------------------------
1001 Flag 1, Flag 2, Flag3
1002 Flag 1, Flag 4
1003 Flag 5
1004 Flag 1, Flag 4, Flag 5
I've tried using CASE
expressions on their own, both in and outside of a STRING_AGG()
, trying to use a CONCAT()
with several independent CASE
expressions, simply tying the results together like below using both independent CASE
expressions and ISNULL()
:
ISNULL(Table1.[Value1],Flag1) + ', ' + ISNULL(Table2.[Value2],Flag2)...
If there's more info I can provide, I'm happy to try as a Stack Overflow newbie!
You're almost there. I just tweaked the way you determine the 'Reasons for inclusion' column.
Instead of the STRING_AGG
, I just used +
to combine the strings.
Note also that each value starts with a comma + space e.g., , Flag 1
. This means a comma + space will only be added if we're also adding a 'Flag'; however, it also means the resulting string will always start with comma + space. Therefore I use the STUFF
function to get rid of those two characters.
SELECT Table0.[UniqID],
STUFF(
CASE WHEN Table1.[Value1] IS NULL THEN ', Flag 1' ELSE '' END
+ CASE WHEN Table2.[Value2] IS NULL THEN ', Flag 2' ELSE '' END
+ CASE WHEN Table3.[Value3] IS NULL THEN ', Flag 3' ELSE '' END
+ CASE WHEN Table4.[Value4] IS NULL THEN ', Flag 4' ELSE '' END
+ CASE WHEN Table5.[Value5] IS NULL THEN ', Flag 5' ELSE '' END,
1, 2, '')
AS 'Reason(s) for Inclusion'
Results
UniqID Reason(s) for Inclusion
1001 Flag 1, Flag 2, Flag 3
1002 Flag 1, Flag 4
1003 Flag 5
1004 Flag 1, Flag 4, Flag 5