Search code examples
sqldatabaseteradatarecursive-query

Finding all links between IDs in SQL


If I have a data source that contains pairs of IDS like the following:

ID_1 ID_2
0001 0002
NULL 0002
0002 0002
0002 0003
0003 0002
0004 0003

this table can contain ID_1 and ID_2 that are different values, the same values or null values and they can be in the table with values both ways round (0002,0003) and (0003,0002)

How do I create a query that would output all combinations of pairs that are directly or indirectly linked?

i.e. 0001 links to 0002 directly but also to 0003 via 0002 and then to 0004 via 0003


Solution

  • You could try a recursive query using least and greatest functions to avoid infinite recursion. In your sample data, an id_x is connected to id_y and it's possible to have id_y connected to id_x, i.e. (1, 2) and (2, 1); using least and greatest (respectively) will give us only one pair of the two pairs (1, 2).

    with recursive cte(id_1, id_2) as
    (
      select least(id_1, id_2) id_1, greatest(id_1, id_2) id_2 
      from table_name
      where id_1 is not null and id_2 is not null
      union all
      select cte.id_1, t.id_2
      from (
      select least(id_1, id_2) id_1, greatest(id_1, id_2) id_2
      from table_name
      ) t join cte
      on t.id_1 = cte.id_2
      where t.id_1 <> t.id_2 -- to avoid infinite recursion where id_1 = id_2
    )
    select distinct id_1, id_2 from cte
    order by id_1
    

    see a demo on MySQL