My data looks like:
Member_ID | Housing_ID | National_ID | Family_relation |
---|---|---|---|
1 | 1 | 3214565 | Head |
2 | 1 | 1234567 | Wife |
3 | 1 | 1223344 | Sun |
4 | 1 | 3224451 | Sun |
5 | 1 | 5432175 | Daughter |
1 | 2 | 1223344 | Head |
2 | 2 | 4321678 | Wife |
3 | 2 | 4356723 | Sun |
1 | 3 | 3214565 | Head |
2 | 3 | 1234557 | Brother |
1 | 4 | 7653432 | Head |
2 | 4 | 3224451 | Grand daughter |
as you can see, there is a dulication of National ID in different houses:
1- the member 3 of house 1 with National ID 1223344 (sun) is (head) of house 2,
2- the member 4 of house 1 with National ID 3224451 (sun) is (Grand daughter) of house 4
I have to specify all duplicated scenarios in order to remove it, I identify the duplication by typing : select National ID , count(*) from my table group by National ID having count(*)>1
and got this output:
National ID | count |
---|---|
1223344 | 2 |
3224451 | 2 |
I'm tiring to query the duplication so the out put looks like:
National ID | Housing ID 1 | Relation | Housing ID 2 | Relation |
---|---|---|---|---|
1223344 | 1 | sun | 2 | Head |
3224451 | 1 | sun | 4 | Grand daughter |
Thanks
For National_ID with count(*) = 2, the following query should provide the expected result :
select National ID
, array_agg(Housing_ID)[1] AS "Housing ID 1"
, array_agg(Family_relation)[1] AS "Relation 1"
, array_agg(Housing_ID)[2] AS "Housing ID 2"
, array_agg(Family_relation)[2] AS "Relation 2"
from my table
group by National_ID
having count(*) = 2