Search code examples
oraclegraphbidirectional

Smallest ID from an Undirected graph


Working with Oracle 11g, I'm trying to find the smallest ID in a undirected graph of ID pairs.

Using the following pairs:

create table unique_pairs ( ID1 INT, ID2 INT );
insert all 
    into unique_pairs (ID1,ID2) VALUES ( 1, 2 )  
    into unique_pairs (ID1,ID2) VALUES ( 1, 3 )
    into unique_pairs (ID1,ID2) VALUES ( 4, 2 )
    into unique_pairs (ID1,ID2) VALUES ( 4, 5 )
    into unique_pairs (ID1,ID2) VALUES ( 7, 6 )
    into unique_pairs (ID1,ID2) VALUES ( 8, 10 )
    into unique_pairs (ID1,ID2) VALUES ( 10, 7 )
select * from dual;
commit;

I'm using a union of two connect by queries to try and travel in two directions of the map:

select distinct a.ID1, a.ID2, min(a.ultimate_parent_id ) as ultimate_parent_id
from
(SELECT distinct ID1, ID2,
    connect_by_root(ID2) ultimate_parent_id
  FROM unique_pairs
CONNECT BY ID2 = prior ID1 AND ID2 != PRIOR ID2 
union
SELECT distinct ID1, ID2,
    connect_by_root(ID1) ultimate_parent_id
  FROM unique_pairs
CONNECT BY ID1 = prior ID2 AND ID1 != PRIOR ID1) a
group by a.ID1, a.ID2
order by 3;

I get this:

   ID1        ID2 ULTIMATE_PARENT_ID

     1          2                  1
     1          3                  1
     4          2                  2
     4          5                  4
    10          7                  6
     7          6                  6
     8         10                  6

and should be this:

   ID1        ID2 ULTIMATE_PARENT_ID

     1          2                  1
     1          3                  1
     4          2                  1
     4          5                  1
    10          7                  6
     7          6                  6
     8         10                  6 

My problem I think relates to the fact that the connect by function is implying a hierarchical structure when my data is non directional. Any help would be great, thank you!


Solution

  • Collapsar has a very lengthy description, but here's a very brief solution that matches your requirements. I'm too far removed from my graph theory days to explain it as he does, but the simple explanation is that since your unique_pairs represents the edges in an undirected graph it can be represented in a directed graph as the union of ID1, ID2 with ID2, ID1 hence my first common table expression. With that directed graph you can compute all connected pairs of nodes as in my second common table expression. Finally you can join your original unique_pairs table to my connected pairs CTE on ID1 and take the minimum root node as your ULTIMATE_PARENT_ID:

    with dg1 as (
      select id1, id2 from unique_pairs
      union all
      select id2, id1 from unique_pairs
    ), cp as (
    select id1
         , CONNECT_BY_ROOT id1 root
      from dg1
    connect by nocycle prior id1 = id2 
    )
    select dg1.*
         , min(root) ULTIMATE_PARENT_ID
      from unique_pairs dg1
      join cp
        on cp.id1 = dg1.id1
      group by dg1.id1, dg1.id2
    order by 1,2
    

    See the SQL Fiddle

    | ID1 | ID2 | ULTIMATE_PARENT_ID |
    |-----|-----|--------------------|
    |   1 |   2 |                  1 |
    |   1 |   3 |                  1 |
    |   4 |   2 |                  1 |
    |   4 |   5 |                  1 |
    |   7 |   6 |                  6 |
    |   8 |  10 |                  6 |
    |  10 |   7 |                  6 |