CREATE TABLE candidates
(
row_num BIGINT,
in_id BIGINT,
out_id BIGINT
);
Fill it with some data:
INSERT INTO candidates (row_num, in_id, out_id)
VALUES (1, 220, 210),
(2, 208, 210),
(3, 216, 210),
(4, 220, 214),
(5, 208, 214),
(6, 216, 214),
(7, 212, 218);
What i now want is to select all rows in a way that each in_id
and out_id
is unique across all selected rows.
So we'd select the first row, because we didn't see either the in_id
220 nor the out_id
210 before. The pair (220,210)
is selected and therefore no following row is allowed to have the in_id
220 or the out_id
210. That's why we need to skip the next 3 rows (they either have in_id
220 or out_id
210). Now we end up at row_num
5. Neither 208 nor 214 where selected before. Thats why it also should be part of the final result set. Same goes for row_num
7.
The final result set should look like:
row_num | in_id | out_id
1 220 210
5 208 214
7 212 218
Seems so simple but i can't solve it. Cleanest try is the following query which has the problem that despite that it knows which ids appeared before it can't detect if a row was selected or not. Breaks for the case of row_num
5.
SELECT *
FROM candidates tc1
WHERE NOT EXISTS (
SELECT
FROM candidates tc2
WHERE (tc2.in_id = tc1.in_id OR tc2.out_id = tc1.out_id)
AND tc2.row_num < tc1.row_num
);
Any help greatly appreciated, thanks!
This can be done using WITH recursive
:
with recursive cte as (
select row_num, in_id, out_id, 1 as exist
from candidates
where row_num = 1
union all
select t.row_num, t.in_id, t.out_id, exist+1
from cte c
inner join candidates t on t.row_num > c.row_num
and (t.in_id <> c.in_id and t.out_id <> c.out_id)
)
select t.row_num, t.in_id, t.out_id
from candidates t
inner join (
select min(row_num) as row_num, exist
from cte
group by exist
) as s on s.row_num = t.row_num
The recursive CTE used to get records recursively by joining the previous records with the current table where current row bigger than already found records and current in_id and out_id not exist in previous records.