Search code examples
snowflake-cloud-data-platform

How to find and count unique values across columns?


I'm using Snowflake and I'm hoping someone can help me understand how to count the number of unique names across columns for each row and overlook null values. Here is a sample of the data. So you can see below I'm looking to count the number of distinct values across the variables Name 1, Name 2, name 3, Name 4.

ID | Type    | Name 1 | Name 2 | Name 3 | Name 4 | Expected result
1  | animal  | cat    | Dog    | null   | Dog    | 2   
2  | animal  | fish   | cat    | cat    | cat    | 2
3  | animal  | fish   | cat    | dog    | rat    | 4

Solution

  • You could use a unpivot approach:

    SELECT ID, Type, COUNT(DISTINCT Name) AS cnt
    FROM
    (
        SELECT ID, Type, Name1 AS Name FROM yourTable UNION ALL
        SELECT ID, Type, Name2         FROM yourTable UNION ALL
        SELECT ID, Type, Name3         FROM yourTable UNION ALL
        SELECT ID, Type, Name4         FROM yourTable
    ) t
    GROUP BY ID, Type;
    

    Demo

    This approach works by unpivoting the name data to a format where one record has just one ID and one name. Then, we aggregate and take the distinct count. The COUNT() function works well here, because by default it ignores NULL values, which is the behavior you want.