Search code examples
sqlpostgresqlcode-duplication

Postgresql statement to represent rows duplicate in column way


My data looks like:

Member_ID Housing_ID National_ID Family_relation
1 1 3214565 Head
2 1 1234567 Wife
3 1 1223344 Sun
4 1 3224451 Sun
5 1 5432175 Daughter
1 2 1223344 Head
2 2 4321678 Wife
3 2 4356723 Sun
1 3 3214565 Head
2 3 1234557 Brother
1 4 7653432 Head
2 4 3224451 Grand daughter

as you can see, there is a dulication of National ID in different houses:

1- the member 3 of house 1 with National ID 1223344 (sun) is (head) of house 2, 
2- the member 4 of house 1 with National ID 3224451 (sun) is (Grand daughter) of house 4

I have to specify all duplicated scenarios in order to remove it, I identify the duplication by typing : select National ID , count(*) from my table group by National ID having count(*)>1 and got this output:

National ID count
1223344 2
3224451 2

I'm tiring to query the duplication so the out put looks like:

National ID Housing ID 1 Relation Housing ID 2 Relation
1223344 1 sun 2 Head
3224451 1 sun 4 Grand daughter

Thanks


Solution

  • For National_ID with count(*) = 2, the following query should provide the expected result :

    select National ID
         , array_agg(Housing_ID)[1] AS "Housing ID 1"
         , array_agg(Family_relation)[1] AS "Relation 1"
         , array_agg(Housing_ID)[2] AS "Housing ID 2"
         , array_agg(Family_relation)[2] AS "Relation 2"
      from my table 
     group by National_ID 
    having count(*) = 2