Working with Oracle 11g, I'm trying to find the smallest ID in a undirected graph of ID pairs.
Using the following pairs:
create table unique_pairs ( ID1 INT, ID2 INT );
insert all
into unique_pairs (ID1,ID2) VALUES ( 1, 2 )
into unique_pairs (ID1,ID2) VALUES ( 1, 3 )
into unique_pairs (ID1,ID2) VALUES ( 4, 2 )
into unique_pairs (ID1,ID2) VALUES ( 4, 5 )
into unique_pairs (ID1,ID2) VALUES ( 7, 6 )
into unique_pairs (ID1,ID2) VALUES ( 8, 10 )
into unique_pairs (ID1,ID2) VALUES ( 10, 7 )
select * from dual;
commit;
I'm using a union of two connect by queries to try and travel in two directions of the map:
select distinct a.ID1, a.ID2, min(a.ultimate_parent_id ) as ultimate_parent_id
from
(SELECT distinct ID1, ID2,
connect_by_root(ID2) ultimate_parent_id
FROM unique_pairs
CONNECT BY ID2 = prior ID1 AND ID2 != PRIOR ID2
union
SELECT distinct ID1, ID2,
connect_by_root(ID1) ultimate_parent_id
FROM unique_pairs
CONNECT BY ID1 = prior ID2 AND ID1 != PRIOR ID1) a
group by a.ID1, a.ID2
order by 3;
I get this:
ID1 ID2 ULTIMATE_PARENT_ID
1 2 1
1 3 1
4 2 2
4 5 4
10 7 6
7 6 6
8 10 6
and should be this:
ID1 ID2 ULTIMATE_PARENT_ID
1 2 1
1 3 1
4 2 1
4 5 1
10 7 6
7 6 6
8 10 6
My problem I think relates to the fact that the connect by function is implying a hierarchical structure when my data is non directional. Any help would be great, thank you!
Collapsar has a very lengthy description, but here's a very brief solution that matches your requirements. I'm too far removed from my graph theory days to explain it as he does, but the simple explanation is that since your unique_pairs represents the edges in an undirected graph it can be represented in a directed graph as the union of ID1, ID2 with ID2, ID1 hence my first common table expression. With that directed graph you can compute all connected pairs of nodes as in my second common table expression. Finally you can join your original unique_pairs table to my connected pairs CTE on ID1 and take the minimum root node as your ULTIMATE_PARENT_ID:
with dg1 as (
select id1, id2 from unique_pairs
union all
select id2, id1 from unique_pairs
), cp as (
select id1
, CONNECT_BY_ROOT id1 root
from dg1
connect by nocycle prior id1 = id2
)
select dg1.*
, min(root) ULTIMATE_PARENT_ID
from unique_pairs dg1
join cp
on cp.id1 = dg1.id1
group by dg1.id1, dg1.id2
order by 1,2
See the SQL Fiddle
| ID1 | ID2 | ULTIMATE_PARENT_ID |
|-----|-----|--------------------|
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 4 | 2 | 1 |
| 4 | 5 | 1 |
| 7 | 6 | 6 |
| 8 | 10 | 6 |
| 10 | 7 | 6 |