Search code examples
sqlsql-servergraph-theoryrecursive-query

How can I combine Group Identifiers into Single Group?


I have a dataset as follows:

;WITH CTE AS
( SELECT *
  FROM (VALUES
(1, 10, 20, 30)
(2, 10, 21, 31)
(3, 11, 21, 31)
(4, 12, 22, 32)
(5, 13, 23, 33)
(6, 14, 24, 33)
(7, 14, 25, 34)
(8, 15, 26, 36)
) AS MyValues(ID, GroupID1, GroupID2, GroupID3)
)
SELECT *
FROM CTE

How can I generate the following collapsing the individual groups into a single group?

ID SingleGroupID
1 1
2 1
3 1
4 2
5 3
6 3
7 3
8 4

Solution

  • This is a typical graph-walking problem, where you want to build islands of ids that have at least one group in common.

    We can start by unpivoting the groups to build a list of nodes, and generate edges with a self-join (edges connect ids that share the same group). We can then recursively traverse the edges, while keeping track of the path we followed until there. The last step is to aggregate by id.

    So:

    with 
        nodes as (
            select t.id, v.grp
            from mytable t
            cross apply ( values (t.GroupID1), (t.GroupID2), (t.GroupID3) ) v(grp)
        ),
        edges as (
            select distinct n1.id as id1, n2.id as id2
            from nodes n1
            inner join nodes n2 on n1.grp = n2.grp
        ),
        rec as (
            select id1, id2, cast(id1 as nvarchar(max)) as visited from edges
            union all
            select r.id1, e.id2, concat(r.visited, ',', e.id2)
            from rec r
            inner join edges e on e.id1 = r.id2
            where concat(',', r.visited, ',') not like concat('%,', e.id2, ',%')
        ),
        fin as (
            select id1, min(value) min_id
            from rec r
            cross apply string_split(r.visited, ',')
            group by id1
        )
    select id1 as id, dense_rank() over(order by min_id) grp
    from fin f
    
    id grp
    1 1
    2 1
    3 1
    4 2
    5 3
    6 3
    7 3
    8 4

    fiddle