Search code examples
sqlpostgresqlleft-joinaggregate-functionspostgresql-9.1

How to group by a, b and return set of N rows of b


Using Postgres 9.3.2, I want to get a count of req_status grouped by req_time and customer_id and return a set of n rows for each customer_id, even when req_status count is zero.

req_time     req_id   customer_id     req_status
----------------------------------------------- 
2014-03-19    100        1            'FAILED'
2014-03-19    102        1            'FAILED'
2014-03-19    105        1            'OK'
2014-03-19    106        2            'FAILED'
2014-03-20    107        1            'OK'
2014-03-20    108        2            'FAILED'
2014-03-20    109        2            'OK'
2014-03-20    110        1            'OK'

Output

req_time  customer_id   req_status  count
-------------------------------------------
2014-03-19    1            'FAILED'   2
2014-03-19    1            'OK'       1
2014-03-19    2            'FAILED'   1
2014-03-19    2            'OK'       0
2014-03-20    1            'FAILED'   0
2014-03-20    1            'OK'       2
2014-03-20    2            'FAILED'   1
2014-03-20    2            'OK'       1

How can I achieve this?


Solution

  • To also see missing rows in the result, LEFT JOIN to a complete grid of possible rows. The grid is built from all possible combinations of (req_time, customer_id, req_status):

    SELECT d.req_time, c.customer_id, s.req_status, count(t.req_time) AS ct
    FROM  (
       SELECT generate_series (min(req_time)
                             , max(req_time)
                             , '1 day')::date
       FROM   tbl
       ) d(req_time)
    CROSS  JOIN (SELECT DISTINCT customer_id FROM tbl)  c(customer_id)
    CROSS  JOIN (VALUES ('FAILED'::text), ('OK'))       s(req_status)
    LEFT   JOIN  tbl t USING (req_time, customer_id, req_status)
    GROUP  BY 1, 2, 3
    ORDER  BY 1, 2, 3;
    

    Count on a column from the actual table, which will be 0 if no match is found (null values don't count).

    Assuming req_time to be a date (not timestamp).

    Similar: