sql postgresql left-join aggregate-functions postgresql-9.1

How to group by a, b and return set of N rows of b

Using Postgres 9.3.2, I want to get a count of req_status grouped by req_time and customer_id and return a set of n rows for each customer_id, even when req_status count is zero.

req_time     req_id   customer_id     req_status
----------------------------------------------- 
2014-03-19    100        1            'FAILED'
2014-03-19    102        1            'FAILED'
2014-03-19    105        1            'OK'
2014-03-19    106        2            'FAILED'
2014-03-20    107        1            'OK'
2014-03-20    108        2            'FAILED'
2014-03-20    109        2            'OK'
2014-03-20    110        1            'OK'

Output

req_time  customer_id   req_status  count
-------------------------------------------
2014-03-19    1            'FAILED'   2
2014-03-19    1            'OK'       1
2014-03-19    2            'FAILED'   1
2014-03-19    2            'OK'       0
2014-03-20    1            'FAILED'   0
2014-03-20    1            'OK'       2
2014-03-20    2            'FAILED'   1
2014-03-20    2            'OK'       1

How can I achieve this?

Solution

To also see missing rows in the result, LEFT JOIN to a complete grid of possible rows. The grid is built from all possible combinations of (req_time, customer_id, req_status):

SELECT d.req_time, c.customer_id, s.req_status, count(t.req_time) AS ct
FROM  (
   SELECT generate_series (min(req_time)
                         , max(req_time)
                         , '1 day')::date
   FROM   tbl
   ) d(req_time)
CROSS  JOIN (SELECT DISTINCT customer_id FROM tbl)  c(customer_id)
CROSS  JOIN (VALUES ('FAILED'::text), ('OK'))       s(req_status)
LEFT   JOIN  tbl t USING (req_time, customer_id, req_status)
GROUP  BY 1, 2, 3
ORDER  BY 1, 2, 3;

Count on a column from the actual table, which will be 0 if no match is found (null values don't count).

Assuming req_time to be a date (not timestamp).

Similar:

array_agg group by and null