I have a table representing sets of records like
set_id | record_id
a | 1
a | 2
a | 3
b | 2
b | 4
b | 5
c | 6
c | 7
d | 9
d | 11
e | 10
f | 11
f | 12
I want to yield output like this
output
{1, 2, 3, 4, 5}
{6, 7}
{9, 11, 12}
{10}
Where intersecting sets are combined (notice set a and set b have been combined; d and f have also been combined).
Is there a good way of doing this with SQL, not a stored procedure. I know that I'm looking for a kind of Union-Find procedure.
prepare:
so=> create table so75(set_id text, record_id int);
CREATE TABLE
so=> copy so75 from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> ^CERROR: COPY from stdin failed: canceled by user
CONTEXT: COPY so75, line 1
so=> copy so75 from stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> a | 1
a | 2
a | 3
b | 2
b | 4
b | 5
c | 6
c | 7
d | 9
d | 11
e | 10
f | 11
f | 12
>> >> >> >> >> >> >>
>> \.
COPY 14
qry:
so=> with keys as (
with a as (
select *,count(1) over (partition by record_id) c, array_agg(set_id) over(partition by record_id) cc
from so75
)
select set_id, cc
from a where c > 1
)
select distinct array_agg(distinct record_id)
from so75
left outer join keys on keys.set_id = so75.set_id
group by case when array_length(cc,1) > 1 then cc::text else so75.set_id end;
array_agg
-------------
{6,7}
{10}
{1,2,3,4,5}
{9,11,12}
(4 rows)