sql postgresql aggregate-functions aggregate-filter

Aggregate columns with additional (distinct) filters

This code works as expected, but I it's long and creepy.

select p.name, p.played, w.won, l.lost from

(select users.name, count(games.name) as played
from users
inner join games on games.player_1_id = users.id
where games.winner_id > 0
group by users.name
union
select users.name, count(games.name) as played
from users
inner join games on games.player_2_id = users.id
where games.winner_id > 0
group by users.name) as p

inner join

(select users.name, count(games.name) as won
from users
inner join games on games.player_1_id = users.id
where games.winner_id = users.id
group by users.name
union
select users.name, count(games.name) as won
from users
inner join games on games.player_2_id = users.id
where games.winner_id = users.id
group by users.name) as w on p.name = w.name

inner join

(select users.name, count(games.name) as lost
from users
inner join games on games.player_1_id = users.id
where games.winner_id != users.id
group by users.name
union
select users.name, count(games.name) as lost
from users
inner join games on games.player_2_id = users.id
where games.winner_id != users.id
group by users.name) as l on l.name = p.name

As you can see, it consists of 3 repetitive parts for retrieving:

player name and the amount of games they played
player name and the amount of games they won
player name and the amount of games they lost

And each of those also consists of 2 parts:

player name and the amount of games in which they participated as player_1
player name and the amount of games in which they participated as player_2

How could this be simplified?

The result looks like so:

           name            | played | won | lost 
---------------------------+--------+-----+------
 player_a                  |      5 |   2 |    3
 player_b                  |      3 |   2 |    1
 player_c                  |      2 |   1 |    1

Solution

Postgres 9.4 or newer

Use the standard-SQL aggregate FILTER clause:

SELECT u.name
     , count(*) FILTER (WHERE g.winner_id  > 0)    AS played
     , count(*) FILTER (WHERE g.winner_id  = u.id) AS won
     , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;

Only rows that pass the boolean expression in the FILTER clause contribute to the aggregate.

Any Postgres version

SELECT u.name
     , count(g.winner_id  > 0 OR NULL)    AS played
     , count(g.winner_id  = u.id OR NULL) AS won
     , count(g.winner_id <> u.id OR NULL) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;

Older versions need a workaround. This is shorter and faster than nested sub-selects or CASE expressions. See:

For absolute performance, is SUM faster or COUNT?