If I have a data source that contains pairs of IDS like the following:
ID_1 | ID_2 |
---|---|
0001 | 0002 |
NULL | 0002 |
0002 | 0002 |
0002 | 0003 |
0003 | 0002 |
0004 | 0003 |
this table can contain ID_1 and ID_2 that are different values, the same values or null values and they can be in the table with values both ways round (0002,0003) and (0003,0002)
How do I create a query that would output all combinations of pairs that are directly or indirectly linked?
i.e. 0001 links to 0002 directly but also to 0003 via 0002 and then to 0004 via 0003
You could try a recursive query using least
and greatest
functions to avoid infinite recursion. In your sample data, an id_x is connected to id_y and it's possible to have id_y connected to id_x, i.e. (1, 2) and (2, 1); using least
and greatest
(respectively) will give us only one pair of the two pairs (1, 2).
with recursive cte(id_1, id_2) as
(
select least(id_1, id_2) id_1, greatest(id_1, id_2) id_2
from table_name
where id_1 is not null and id_2 is not null
union all
select cte.id_1, t.id_2
from (
select least(id_1, id_2) id_1, greatest(id_1, id_2) id_2
from table_name
) t join cte
on t.id_1 = cte.id_2
where t.id_1 <> t.id_2 -- to avoid infinite recursion where id_1 = id_2
)
select distinct id_1, id_2 from cte
order by id_1
see a demo on MySQL