Search code examples
sqlpostgresql

Using array_agg to make array out of N to 1 columns


I have a database where rows in table A could have N >= 0 associated rows in table B. I am trying to construct a query that returns all of the columns of A with an array of all of the associated rows in table B. It looks like this should be possible according to the documentation. It looks like I want case 4 aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ] but when I try to use array_agg(*) it results in an error:

No function matches the given name and argument types. You might need to add explicit type casts.

What am I missing here? I have seen the string concatenation approach but if I could just select everything without having to specify columns that would be easier I think.

SQL:

CREATE TABLE containers (
  id bigint PRIMARY KEY
);

CREATE TABLE elements (
  id bigint PRIMARY KEY,
  container_id bigint REFERENCES containers(id)
);

insert into containers VALUES (1), (2), (3);
insert into elements VALUES (1, 1), (2, 1), (3, 2);

SELECT *
FROM containers
JOIN (
  SELECT elements.id, array_agg(*)
  FROM elements
  GROUP BY elements.id
) AS elements ON elements.container_id=containers.id

db_fiddle: https://dbfiddle.uk/C8XgWVLW


Solution

  • You don't share your desired output for this sample data, so I'm sharing two options.

    One, using array_agg which will give a less descriptive output as the column names are not preserved, merely the column ordinals:

    SELECT *
    FROM containers
    JOIN (
      SELECT container_id, array_agg(elements)
      FROM elements
      GROUP BY container_id
    ) AS elements ON elements.container_id=containers.id
    
    id container_id array_agg
    1 1 {"(1,1)","(2,1)"}
    2 2 {"(3,2)"}

    https://dbfiddle.uk/8XJ4FlaE


    And another using json_agg() which serializes the data with the column names:

    SELECT *
    FROM containers
    JOIN (
      SELECT elements.container_id, json_agg(elements)
      FROM elements
      GROUP BY elements.container_id
    ) AS elements ON elements.container_id=containers.id
    
    id container_id json_agg
    1 1 [{"id":1,"container_id":1},  {"id":2,"container_id":1}]
    2 2 [{"id":3,"container_id":2}]

    https://dbfiddle.uk/h0HuyzB5

    You'll notice two key changes from your attempt:

    1. We are selecting and grouping on container_id in your subquery since that is your join criteria and desired level of granularity.
    2. We supply the table name (or alias) to the aggregation function so that all columns are included in the array aggregation.

    Lastly, both of these can be rewritten without the subquery:

    SELECT c.id, json_agg(e)
    FROM containers c
    INNER JOIN elements e
        ON c.id = e.container_id
    GROUP BY c.id