I have a table with data like following, want to return those group_id with unique data. Both group_id 3 and 4 have two component 123 and 456, so they are "duplicated", we just need to return the smaller group_id, that's 3. Also group_id 5 doesn't have a duplication, it can be returned. So we want group_id 3 and 5 to be returned.
How can I write a SQL query against postgres database to achieve that? Thank you!
id | group_id | component_id |
---|---|---|
1 | 3 | 123 |
2 | 3 | 456 |
3 | 4 | 123 |
4 | 4 | 456 |
5 | 5 | 123 |
Use 2 levels of aggregation:
SELECT MIN(group_id) group_id
FROM (
SELECT group_id, STRING_AGG(component_id::text, ',' ORDER BY component_id) components
FROM tablename
GROUP BY group_id
) t
GROUP BY components;
See the demo.