I'm using Snowflake and I'm hoping someone can help me understand how to count the number of unique names across columns for each row and overlook null values. Here is a sample of the data. So you can see below I'm looking to count the number of distinct values across the variables Name 1, Name 2, name 3, Name 4.
ID | Type | Name 1 | Name 2 | Name 3 | Name 4 | Expected result
1 | animal | cat | Dog | null | Dog | 2
2 | animal | fish | cat | cat | cat | 2
3 | animal | fish | cat | dog | rat | 4
You could use a unpivot approach:
SELECT ID, Type, COUNT(DISTINCT Name) AS cnt
FROM
(
SELECT ID, Type, Name1 AS Name FROM yourTable UNION ALL
SELECT ID, Type, Name2 FROM yourTable UNION ALL
SELECT ID, Type, Name3 FROM yourTable UNION ALL
SELECT ID, Type, Name4 FROM yourTable
) t
GROUP BY ID, Type;
This approach works by unpivoting the name data to a format where one record has just one ID
and one name. Then, we aggregate and take the distinct count. The COUNT()
function works well here, because by default it ignores NULL
values, which is the behavior you want.