Search code examples
sqlpostgresqlaggregate-functionsaggregate-filter

Aggregate columns with additional (distinct) filters


This code works as expected, but I it's long and creepy.

select p.name, p.played, w.won, l.lost from

(select users.name, count(games.name) as played
from users
inner join games on games.player_1_id = users.id
where games.winner_id > 0
group by users.name
union
select users.name, count(games.name) as played
from users
inner join games on games.player_2_id = users.id
where games.winner_id > 0
group by users.name) as p

inner join

(select users.name, count(games.name) as won
from users
inner join games on games.player_1_id = users.id
where games.winner_id = users.id
group by users.name
union
select users.name, count(games.name) as won
from users
inner join games on games.player_2_id = users.id
where games.winner_id = users.id
group by users.name) as w on p.name = w.name

inner join

(select users.name, count(games.name) as lost
from users
inner join games on games.player_1_id = users.id
where games.winner_id != users.id
group by users.name
union
select users.name, count(games.name) as lost
from users
inner join games on games.player_2_id = users.id
where games.winner_id != users.id
group by users.name) as l on l.name = p.name

As you can see, it consists of 3 repetitive parts for retrieving:

  • player name and the amount of games they played
  • player name and the amount of games they won
  • player name and the amount of games they lost

And each of those also consists of 2 parts:

  • player name and the amount of games in which they participated as player_1
  • player name and the amount of games in which they participated as player_2

How could this be simplified?

The result looks like so:

           name            | played | won | lost 
---------------------------+--------+-----+------
 player_a                  |      5 |   2 |    3
 player_b                  |      3 |   2 |    1
 player_c                  |      2 |   1 |    1

Solution

  • Postgres 9.4 or newer

    Use the standard-SQL aggregate FILTER clause:

    SELECT u.name
         , count(*) FILTER (WHERE g.winner_id  > 0)    AS played
         , count(*) FILTER (WHERE g.winner_id  = u.id) AS won
         , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
    FROM   games g
    JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
    GROUP  BY u.name;
    

    Only rows that pass the boolean expression in the FILTER clause contribute to the aggregate.

    Any Postgres version

    SELECT u.name
         , count(g.winner_id  > 0 OR NULL)    AS played
         , count(g.winner_id  = u.id OR NULL) AS won
         , count(g.winner_id <> u.id OR NULL) AS lost
    FROM   games g
    JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
    GROUP  BY u.name;
    

    Older versions need a workaround. This is shorter and faster than nested sub-selects or CASE expressions. See: