I have parent-child relation in my table (simplified version listed below):
| parent | child |
|--------+-------|
| p1 | c1 |
| p1 | c2 |
|--------+-------|
| p2 | c1 |
| p2 | c2 |
|--------+-------|
| p3 | c1 |
|--------+-------|
| p4 | c2 |
|--------+-------|
| p5 | c3 |
I need to query all parents with the exact same set of children like p1
and p2
.
Seems to be related to this question, but there are hard-coded children.
This query lists all parents sharing the same children, where there are more than one parent:
SELECT array_agg(parent) AS parents, children, count(*) AS dupes
FROM (
SELECT parent, array_agg(child ORDER BY child) AS children
FROM tbl
GROUP BY 1
) sub
GROUP BY 2
HAVING count(*) > 1;
Simple, but not the fastest possible solution. For big tables, I would try something else.
The problem belongs to the "relational division" family. Here is a collection of query styles we assembled some time ago:
How to filter SQL results in a has-many-through relation
One way (of many) to get single parent per row:
WITH cte AS (
SELECT parent, array_agg(child ORDER BY child) AS children
FROM tbl
GROUP BY 1
)
SELECT *
FROM cte c
WHERE EXISTS (
SELECT 1
FROM cte c1
WHERE c1.children = c.children
AND c1.parent <> c.parent
);