Search code examples
sqlpostgresqltable-functions

How to include null values in `tablefunc` query in postgresql?


I'm trying to use the crosstab function in postgresql to create a pivot table. However, I'm having difficulty understanding how to structure my SQL within the query. My data consists of four columns and looks like this:

enter image description here

I create this table using the following code:

CREATE TABLE ct(id SERIAL, zone_id int, group_id int, area double precision);
INSERT INTO ct(zone_id, group_id, area) VALUES(1,2,6798.50754160784);
INSERT INTO ct(zone_id, group_id, area) VALUES(1,3,10197.7613124118);
INSERT INTO ct(zone_id, group_id, area) VALUES(2,1,85708.8676744647);
INSERT INTO ct(zone_id, group_id, area) VALUES(2,2,56006.5971338327);
INSERT INTO ct(zone_id, group_id, area) VALUES(2,3,5584.33145616642);
INSERT INTO ct(zone_id, group_id, area) VALUES(2,5,8611.99732832252);
INSERT INTO ct(zone_id, group_id, area) VALUES(2,6,36103.5509183704);
INSERT INTO ct(zone_id, group_id, area) VALUES(2,8,9801.14541428806);
INSERT INTO ct(zone_id, group_id, area) VALUES(5,1,45796.0020793546);

And following the postgresql documentation closely, I use the following code in my crosstab query:

SELECT *
FROM crosstab(
  'select zone_id, group_id, area
   from ct
   ')
AS ct(row_name integer, 
      g_1 double precision, 
      g_2 double precision, 
      g_3 double precision, 
      g_4 double precision, 
      g_5 double precision, 
      g_6 double precision, 
      g_7 double precision, 
      g_8 double precision);

This results in the following table which is not what I want them to be:

enter image description here

For example, in row two, I want the following values:

85708.8676744647, 56006.5971338327, 5584.33145616642, NULL, 8611.99732832252, 36103.5509183704, NULL, 9801.14541428806

Instead the values are:

85708.8676744647, 56006.5971338327, 5584.33145616642, 8611.99732832252, 36103.5509183704, 9801.14541428806

However, it seems that the null values are ignored, so that my column names g1 to g8, do not correspond to the original groups.


Solution

  • Use the crosstab() variant with two parameters:

    SELECT * FROM crosstab(
       'SELECT zone_id, group_id, area
        FROM   ct
        ORDER  BY 1,2'
    
       ,'SELECT g FROM generate_series(1,8) g'  -- ! Provide values explicitly
       )
    AS ct(
         row_name integer
       , g_1 float8, g_2 float8
       , g_3 float8, g_4 float8
       , g_5 float8, g_6 float8
       , g_7 float8, g_8 float8);
    

    Thereby declaring explicitly which value goes in which output column. So the function knows where to fill in NULL values. In this case generate_series() comes in handy to provide 8 rows with the numbers 1-8. A VALUES expression would be an alternative:

    'VALUES (1), (2), (3), (4), (5), (6), (7), (8)'
    

    Also, don't forget the ORDER BY clause in the first parameter query.

    I provided a detailed explanation in this related answer.