Search code examples
sqlpostgresqlaggregate-functionswindow-functions

Aggregating all values not in the same group


Is there a way in PostgreSQL to take this table:

ID country name values
1 USA John Smith {1,2,3}
2 USA Jane Smith {0,1,3}
3 USA Jane Doe {1,1,1}
4 USA John Doe {0,2,4}

and generate this table from it with the column agg_values:

ID country name values agg_values
1 USA John Smith {1,2,3} {0,1,3,1,1,1,0,2,4}
2 USA Jane Smith {0,1,3} {1,2,3,1,1,1,0,2,4}
3 USA Jane Doe {1,1,1} {1,2,3,0,1,3,0,2,4}
4 USA John Doe {0,2,4} {1,2,3,0,1,3,1,1,1}

Where each row aggregates all values except from the current row and its peers.
So if name = John Smith then agg_values = aggregate of all values where name not = John Smith. Is that possible?


Solution

  • In Postgres 11 or later, use a window function with a custom frame and a frame_exclusion:

    SELECT *, array_combine(values) OVER (ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND UNBOUNDED FOLLOWING
                                               EXCLUDE CURRENT ROW) AS agg_values
    FROM   tbl;
    

    If name is not UNIQUE, and since you asked:

    all values where name not = John Smith

    SELECT *, array_combine(values) OVER (ORDER BY name
                                          ROWS BETWEEN UNBOUNDED PRECEDING
                                               AND UNBOUNDED FOLLOWING
                                               EXCLUDE GROUP) AS agg_values
    FROM   tbl;
    

    db<>fiddle here

    The first one (also) works with arbitrary order of rows, only excluding the current one. The second requires ORDER BY to establish which rows are in the same group.

    The manual:

    The frame_exclusion option allows rows around the current row to be excluded from the frame, even if they would be included according to the frame start and frame end options. EXCLUDE CURRENT ROW excludes the current row from the frame. EXCLUDE GROUP excludes the current row and its ordering peers from the frame. EXCLUDE TIES excludes any peers of the current row from the frame, but not the current row itself. [...]

    Bold emphasis mine.

    This uses the custom aggregate function array_combine(anyarray) provided by a_horse.
    Or here: