Search code examples
sqlpostgresqlcrosstabdata-analysis

Postgres crosstab and aggregate together


I have a table purchases like this:

purchase_id|purchase_date|product_id
     1     | 2016-02-04  | 17
     2     | 2016-02-04  | 19
     3     | 2016-02-04  | 17
     ... About 1 million entries

and a table products like this:

product_id
   17
   18
   19
   ... About 30 entries

I'd like to do a crosstab query to give me the total number of purchases on date, pivoted over each product_id, like this:

 purchase_date|product_id_17|product_id_18|product_id_19|product_id...
   2016-02-04 |   2         |  0          |  1          | ...
   2016-02-05 | ...

Note that the values in this table are not taken from any record, but the sum of the number of records that match. I believe this is possible using crosstab along with aggregates, but haven't been able to figure out how to do it.

  1. Can this be done using Postgres?
  2. Is crosstab the right way to do it?
  3. How?

What's unique here is that I am trying to:

  • Do a crosstab
  • Do an aggregate
  • At the same time

Note that the other questions on Stack Overflow discussing Crosstab do not explain how to do a Crosstab and aggregate together.


Solution

  • With following as sample

    Table-

     create table prods(purchase_id int,purchase_date date,product_id int);
    

    Data-

     insert into prods values (1,'2016-02-04',17);
     insert into prods values (2,'2016-02-04',19);
     insert into prods values (3,'2016-02-04',17);
     insert into prods values (4,'2016-02-05',17);
     insert into prods values (5,'2016-02-05',17);
    

    To get the desired output you can use crosstab()-

    select *
    from crosstab('
     select purchase_date
           ,product_id
           ,count(product_id)::int
    from    prods
    group by purchase_date
            ,product_id
    order by purchase_date
            ,product_id
    ') as t("purchase_date" date
            ,"product_id_17" int
            ,"product_id_19" int
            )
    

    Output-

    purchase_date product_id_17 product_id_19 
    ------------- ------------- ------------- 
    2016-02-04    2             1             
    2016-02-05    2             NULL