I am able to generate a recursion query thanks to this post here: SQL Server recursion with multiple tables
But now I'm trying to get what a full listing of all the parent/child relationships in for multiple levels. Here is the example:
create table groups_groups (
group_id int not null,
child_id int not null
)
create table groups(group_id int not null)
insert into groups_groups values (78,80)
insert into groups_groups values (80,79)
insert into groups values (78)
insert into groups values (70)
insert into groups values (80)
;
with cte as (
select
0 as gglevel,
0 as parentid,
groups.group_id ,
convert(varchar(max), concat(',', 0,':',groups.group_id)) as visited
from groups
union all
select
gglevel+1 as gglevel,
gg.group_id as parentid,
gg.child_id,
convert(varchar(max), concat(cte.visited,',',gg.group_id,':',gg.child_id))
as visited
from cte
join groups_groups gg
on gg.group_id = cte.group_id
where visited not like concat('%,',gg.group_id,':',gg.child_id, ',%')
)
select * from cte
order by group_id,gglevel
gglevel | parentid | group_id | visited |
---|---|---|---|
0 | 0 | 70 | ,0:70 |
0 | 0 | 78 | ,0:78 |
1 | 80 | 79 | ,0:80,80:79 |
2 | 80 | 79 | ,0:78,78:80,80:79 |
0 | 0 | 80 | ,0:80 |
1 | 78 | 80 | ,0:78,78:80 |
In the above, the visited column is there because in my full dataset, we have many loops and places where the parent equals the child and without, would have infinite loops.
This mostly works, except in the 4th row we see parent to group of 80-79 again. What I would hope to see is one row where 80 gives 79 (correct) and the level 2 row should be 78 gives 79 (it does because of 80 in the middle).
Is there a way to have levels 2 (and higher since we can have up to 10-20 levels) show all the possible combinations, all of the ways that I could be granted 79?
And is there a better way to prevent the infinite loop than this string concat/where clause?
Thank you!
A few thoughts:
visited
list. On the other hand, the "parent_id:" portion can probably be dropped.Applying these changes to your posted query, we get:
with cte as (
-- Anchor part
select
0 as gglevel,
groups.group_id as starting_id, -- *** Changed ***
groups.group_id,
convert(varchar(max),
concat(',' ,groups.group_id, ',')) as visited -- *** Changed ***
from groups
union all
-- Recursive part
select
gglevel+1 as gglevel,
cte.starting_id, -- *** Changed ***
gg.child_id,
convert(varchar(max), concat(cte.visited, gg.child_id, ',')) -- *** Changed ***
as visited
from cte
join groups_groups gg
on gg.group_id = cte.group_id
where visited not like concat('%,', gg.child_id, ',%') -- *** Changed ***
)
select * from cte
order by
starting_id,
gglevel,
group_id;
Results (with current sample data):
gglevel | starting_id | group_id | visited |
---|---|---|---|
0 | 70 | 70 | ,70, |
0 | 78 | 78 | ,78, |
1 | 78 | 80 | ,78,80, |
2 | 78 | 79 | ,78,80,79, |
0 | 80 | 80 | ,80, |
1 | 80 | 79 | ,80,79, |
To include 0-node starting values, you can add the following data:
insert into groups_groups values (0,78);
insert into groups_groups values (0,80);
insert into groups_groups values (0,79);
insert into groups values (0);
Alternately, you could kluge-up the CTE with additional anchor queries to fill in the missing 0-node data.
with cte as (
-- Anchor part (1 of 3) - 0:0
select
0 as gglevel,
0 as starting_id, -- *** Changed ***
0 as group_id,
convert(varchar(max), ',0,') as visited -- *** Changed ***
union all
-- Anchor part (2 of 3) - 0:group (not defined in
select
1 as gglevel,
0 as starting_id, -- *** Changed ***
groups.group_id,
convert(varchar(max), concat(',0,' ,groups.group_id, ',')) as visited -- *** Changed ***
from groups
-- Anchor part (3 of 3) - group:group
union all
select
0 as gglevel,
groups.group_id as starting_id, -- *** Changed ***
groups.group_id,
convert(varchar(max), concat(',' ,groups.group_id, ',')) as visited -- *** Changed ***
from groups
union all
-- Recursive part
select
gglevel+1 as gglevel,
cte.starting_id, -- *** Changed ***
gg.child_id,
convert(varchar(max), concat(cte.visited, gg.child_id, ',')) -- *** Changed ***
as visited
from cte
join groups_groups gg
on gg.group_id = cte.group_id
where visited not like concat('%,', gg.child_id, ',%') -- *** Changed ***
)
select * from cte
order by
starting_id,
gglevel,
group_id;
Results (with 0-node starting values):
gglevel | starting_id | group_id | visited |
---|---|---|---|
0 | 0 | 0 | ,0, |
1 | 0 | 78 | ,0,78, |
1 | 0 | 79 | ,0,79, |
1 | 0 | 80 | ,0,80, |
2 | 0 | 79 | ,0,80,79, |
2 | 0 | 80 | ,0,78,80, |
3 | 0 | 79 | ,0,78,80,79, |
0 | 70 | 70 | ,70, |
0 | 78 | 78 | ,78, |
1 | 78 | 80 | ,78,80, |
2 | 78 | 79 | ,78,80,79, |
0 | 80 | 80 | ,80, |
1 | 80 | 79 | ,80,79, |
The above reflects multiple paths to get from the 0-node to several other nodes. (This could also be the case with non-zero starting nodes with different data.) It also shows the start = end cases. You can add filtering logic (deduping, shortest path, etc.) to trim the results down to suit your preferences.
See this db<>fiddle for a demo.