PostgreSQL: self join with array


My question is about writing a PostgreSQL query for the use case below.

Approach #1

I have a table like the one below, where the same UUID is generated across different types (a, b, c, d) to map rows of different types to each other.

+----+------+-------------+
| id | type | master_guid |
+----+------+-------------+
|  1 | a    | uuid-1      |
|  2 | a    | uuid-2      |
|  3 | a    | uuid-3      |
|  4 | a    | uuid-4      |
|  5 | a    | uuid-5      |
|  6 | b    | uuid-1      |
|  7 | b    | uuid-2      |
|  8 | b    | uuid-3      |
|  9 | b    | uuid-6      |
| 10 | c    | uuid-1      |
| 11 | c    | uuid-2      |
| 12 | c    | uuid-3      |
| 13 | c    | uuid-6      |
| 14 | c    | uuid-7      |
| 15 | d    | uuid-6      |
| 16 | d    | uuid-2      |
+----+------+-------------+

Approach #2

I have created two tables, one mapping id to type and one mapping id to master_guid, as below:

table1:

+----+------+
| id | type |
+----+------+
|  1 | a    |
|  2 | a    |
|  3 | a    |
|  4 | a    |
|  5 | a    |
|  6 | b    |
|  7 | b    |
|  8 | b    |
|  9 | b    |
| 10 | c    |
| 11 | c    |
| 12 | c    |
| 13 | c    |
| 14 | c    |
| 15 | d    |
| 16 | d    |
+----+------+

table2:

+----+-------------+
| id | master_guid |
+----+-------------+
|  1 | uuid-1      |
|  2 | uuid-2      |
|  3 | uuid-3      |
|  4 | uuid-4      |
|  5 | uuid-5      |
|  6 | uuid-1      |
|  7 | uuid-2      |
|  8 | uuid-3      |
|  9 | uuid-6      |
| 10 | uuid-1      |
| 11 | uuid-2      |
| 12 | uuid-3      |
| 13 | uuid-6      |
| 14 | uuid-7      |
| 15 | uuid-6      |
| 16 | uuid-2      |
+----+-------------+

I want to get output like below with both approaches:

+----+------+--------+------------+
| id | type |  uuid  | mapped_ids |
+----+------+--------+------------+
|  1 | a    | uuid-1 | [6,10]     |
|  2 | a    | uuid-2 | [7,11]     |
|  3 | a    | uuid-3 | [8,12]     |
|  4 | a    | uuid-4 | null       |
|  5 | a    | uuid-5 | null       |
+----+------+--------+------------+

I have tried self-joins with array_agg on the ids, grouping by uuid, but have not been able to get the desired output.

Use the queries below to populate the data:

Approach #1

insert into table1 values 
(1,'a','uuid-1'),
(2,'a','uuid-2'),
(3,'a','uuid-3'),
(4,'a','uuid-4'),
(5,'a','uuid-5'),
(6,'b','uuid-1'),
(7,'b','uuid-2'),
(8,'b','uuid-3'),
(9,'b','uuid-6'),
(10,'c','uuid-1'),
(11,'c','uuid-2'),
(12,'c','uuid-3'),
(13,'c','uuid-6'),
(14,'c','uuid-7'),
(15,'d','uuid-6'),
(16,'d','uuid-2');

Approach #2

insert into table1 values 
(1,'a'),
(2,'a'),
(3,'a'),
(4,'a'),
(5,'a'),
(6,'b'),
(7,'b'),
(8,'b'),
(9,'b'),
(10,'c'),
(11,'c'),
(12,'c'),
(13,'c'),
(14,'c'),
(15,'d'),
(16,'d');

insert into table2 values 
(1,'uuid-1'),
(2,'uuid-2'),
(3,'uuid-3'),
(4,'uuid-4'),
(5,'uuid-5'),
(6,'uuid-1'),
(7,'uuid-2'),
(8,'uuid-3'),
(9,'uuid-6'),
(10,'uuid-1'),
(11,'uuid-2'),
(12,'uuid-3'),
(13,'uuid-6'),
(14,'uuid-7'),
(15,'uuid-6'),
(16,'uuid-2');
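
The insert statements above assume the tables already exist. A minimal sketch of definitions that would accept them follows; the column types (integer ids, text for type and master_guid) and the constraints are assumptions, since the question does not show the schema. master_guid is kept as text because sample values such as 'uuid-1' are not valid input for PostgreSQL's uuid type.

```sql
-- Approach #1: one table holding id, type and master_guid.
-- Column types are assumed; the original post does not state them.
CREATE TABLE table1 (
    id          integer PRIMARY KEY,
    type        text    NOT NULL,
    master_guid text    NOT NULL
);

-- Approach #2 reuses the name table1 with a different shape, so drop the
-- table above first (or use a separate schema) before creating these two.
-- CREATE TABLE table1 (
--     id   integer PRIMARY KEY,
--     type text    NOT NULL
-- );
-- CREATE TABLE table2 (
--     id          integer PRIMARY KEY REFERENCES table1 (id),
--     master_guid text    NOT NULL
-- );
```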

Solution

  • Try this:

    select
      t1.id, t1.type, t1.master_guid, array_agg (distinct t2.id)
    from
      table1 t1
      left join table1 t2 on
        t1.master_guid = t2.master_guid and
        t1.id != t2.id
    group by
      t1.id, t1.type, t1.master_guid
    

    This doesn't produce exactly the results you listed, but it is close enough that there may be a mistaken expectation on your side or a small error on mine. Either way, it is a potential starting point.
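
    One way to get closer to the listed output is sketched below. It assumes the single-table layout of approach #1 and PostgreSQL 9.4 or later (for the aggregate FILTER clause); the aliases uuid and mapped_ids are taken from the desired output, not from the table. It restricts the left side to type 'a' and filters out the NULL that an unmatched left join feeds into array_agg, so ids 4 and 5 yield NULL rather than {NULL}.

    ```sql
    -- Assumes table1(id, type, master_guid) as in approach #1.
    SELECT
        t1.id,
        t1.type,
        t1.master_guid AS uuid,
        -- FILTER skips the NULL id produced when the left join finds no match,
        -- so array_agg sees no rows and returns NULL for that group.
        array_agg(t2.id ORDER BY t2.id)
            FILTER (WHERE t2.id IS NOT NULL) AS mapped_ids
    FROM table1 t1
    LEFT JOIN table1 t2
           ON t2.master_guid = t1.master_guid
          AND t2.id <> t1.id
    WHERE t1.type = 'a'              -- only report rows of type a
    GROUP BY t1.id, t1.type, t1.master_guid
    ORDER BY t1.id;
    ```

    With the sample data, id 2 should come back with {7,11,16} rather than [7,11], because id 16 (type d) also carries uuid-2. If that row is meant to be excluded, an extra condition on t2.type in the join would be needed; which types count is not stated in the question.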

    -- EDIT --

    For approach #2, I think you just need to add an inner join to table2 to get the GUID:

    select
      t1.id, t1.type, t2.master_guid,
      array_agg (t2a.id)
    from
      table1 t1
      join table2 t2 on t1.id = t2.id
      left join table2 t2a on
        t2.master_guid = t2a.master_guid and
        t2a.id != t1.id
    where
      t1.type = 'a'
    group by
      t1.id, t1.type, t2.master_guid
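
    A variant of this query that sorts the collected ids and returns NULL instead of {NULL} for rows with no partner is sketched below. It assumes the two-table layout of approach #2 and PostgreSQL 9.4 or later for the FILTER clause; the aliases uuid and mapped_ids are illustrative names borrowed from the desired output.

    ```sql
    -- Assumes table1(id, type) and table2(id, master_guid) as in approach #2.
    SELECT
        t1.id,
        t1.type,
        t2.master_guid AS uuid,
        -- Unmatched left-join rows carry a NULL id; FILTER removes them so the
        -- aggregate returns NULL rather than an array holding a single NULL.
        array_agg(t2a.id ORDER BY t2a.id)
            FILTER (WHERE t2a.id IS NOT NULL) AS mapped_ids
    FROM table1 t1
    JOIN table2 t2
         ON t2.id = t1.id                      -- look up this row's GUID
    LEFT JOIN table2 t2a
           ON t2a.master_guid = t2.master_guid -- other rows sharing that GUID
          AND t2a.id <> t2.id
    WHERE t1.type = 'a'
    GROUP BY t1.id, t1.type, t2.master_guid
    ORDER BY t1.id;
    ```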