Search code examples
postgresqlpivot-tablecrosstabpostgres-crosstab

2 Level pivot using Postgresql


I have a table whose schema along with data (table_name : raw_data) appears to be this :

name | category | clear_date |

A    | GOOD     | 2020-05-30 |
A    | GOOD     | 2020-05-30 |
A    | GOOD     | 2020-05-30 |
A    | GOOD     | 2020-05-30 |
A    | BAD      | 2020-05-30 |
A    | BAD      | 2020-05-30 |

Now if I perform a "groupby" operation using the following statement :

 SELECT name, category, date(clear_date), count(clear_date)
 FROM raw_data
 GROUP BY name, category, date(clear_date)
 ORDER BY name

I get the following answer :

  name | caetgory | date      | count |
  A    | GOOD     |2020-05-30 |  4    |
  A    | BAD      |2020-05-30 |  1    |
  A    | BAD      |2020-05-31 |  1    |

IN order to produce the pivot in following format :

 name | category | 2020-05-30 | 2020-05-31 |
 A    | GOOD     | 4          |   NULL     | 
 A    | BAD      | 1          |   1        |

I am using the following query :

 select * from crosstab (
    'select name, category, date(clear_date), count(clear_date) from raw_data group by name, category, date(clear_date) order by 1,2,3',
    'select distinct date(clear_date) from raw_data order by 1'
    )
    as newtable (
    node_name varchar, alarm_name varchar, "2020-05-30" integer, "2020-05-31" integer
)
ORDER BY name

But I am getting results as follows :

    name | category | 2020-05-30 | 2020-05-31 |
    A    | BAD      |    4       | 1          |

Can anyone please try to suggest how can i achieve the result mentioned above. It appears crosstab removes the duplicate entry of A automatically.


Solution

  • Not sure if this is possible using crosstab because you have a missing records in some dates. Here is an example how to get expected result but not sure is what you need. Anyway hope this helps.

    SELECT r1.*, r2.counter AS "2020-05-30", r3.counter AS "2020-05-31"
    FROM (
        SELECT DISTINCT name, category
        FROM raw_data
    ) AS r1
    LEFT JOIN (
        SELECT name, category, count(*) AS counter
          FROM raw_data
         WHERE clear_date = '2020-05-30'
         GROUP BY name, category
    ) AS r2 ON (r2.category = r1.category AND r2.name = r1.name)
    LEFT JOIN (
        SELECT name, category, count(*) AS counter
          FROM raw_data
         WHERE clear_date = '2020-05-31'
         GROUP BY name, category
    ) AS r3 ON (r3.category = r1.category AND r3.name = r1.name)
    ORDER BY r1.category DESC;