How do I create crosstab queries in PostgreSQL? For example I have the following table:
Section Status Count
A Active 1
A Inactive 2
B Active 4
B Inactive 5
I would like the query to return the following crosstab:
Section Active Inactive
A 1 2
B 4 5
Install the additional module tablefunc
once per database, which provides the function crosstab()
. Since Postgres 9.1 you can use CREATE EXTENSION
for that:
CREATE EXTENSION IF NOT EXISTS tablefunc;
CREATE TABLE tbl (
section text
, status text
, ct integer -- "count" is a reserved word in standard SQL
);
INSERT INTO tbl VALUES
('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
, ('C', 'Inactive', 7); -- ('C', 'Active') is missing
crosstab(text)
with 1 input parameter:
SELECT *
FROM crosstab(
'SELECT section, status, ct
FROM tbl
ORDER BY 1,2' -- needs to be "ORDER BY 1,2" here
) AS ct ("Section" text, "Active" int, "Inactive" int);
Returns:
Section | Active | Inactive ---------+--------+---------- A | 1 | 2 B | 4 | 5 C | 7 | -- !!
C
: the value 7
is filled in for the first column. Sometimes, this behavior is desirable, but not for this use case.crosstab(text, text)
with 2 input parameters:
SELECT *
FROM crosstab(
'SELECT section, status, ct
FROM tbl
ORDER BY 1,2' -- could also just be "ORDER BY 1" here
, $$VALUES ('Active'::text), ('Inactive')$$
) AS ct ("Section" text, "Active" int, "Inactive" int);
Returns:
Section | Active | Inactive ---------+--------+---------- A | 1 | 2 B | 4 | 5 C | | 7 -- !!
Note the correct result for C
.
The second parameter can be any query that returns one row per attribute matching the order of the column definition at the end. Often you will want to query distinct attributes from the underlying table like this:
'SELECT DISTINCT attribute FROM tbl ORDER BY 1'
That's in the manual.
Since you have to spell out all columns in a column definition list anyway (except for pre-defined crosstabN()
variants), it is typically more efficient to provide a short list in a VALUES
expression like demonstrated:
$$VALUES ('Active'::text), ('Inactive')$$
Or (not in the manual):
$$SELECT unnest('{Active,Inactive}'::text[])$$ -- short syntax for long lists
I used dollar quoting to make quoting easier.
You can even output columns with different data types with crosstab(text, text)
- as long as the text representation of the value column is valid input for the target type. This way you might have attributes of different kind and output text
, date
, numeric
etc. for respective attributes. There is a code example at the end of the chapter crosstab(text, text)
in the manual.
db<>fiddle here
Excess input rows are handled differently - duplicate rows for the same ("row_name", "category") combination - (section, status)
in the above example.
The 1-parameter form fills in available value columns from left to right. Excess values are discarded.
Earlier input rows win.
The 2-parameter form assigns each input value to its dedicated column, overwriting any previous assignment.
Later input rows win.
Typically, you don't have duplicates to begin with. But if you do, carefully adjust the sort order to your requirements - and document what's happening.
Or get fast arbitrary results if you don't care. Just be aware of the effect.
Pivot on Multiple Columns using Tablefunc - also demonstrating mentioned "extra columns"
\crosstabview
in psqlPostgres 9.6 added this meta-command to its default interactive terminal psql. You can run the query you would use as first crosstab()
parameter and feed it to \crosstabview
(immediately or in the next step). Like:
db=> SELECT section, status, ct FROM tbl \crosstabview
Similar result as above, but it's a representation feature on the client side exclusively. Input rows are treated slightly differently, hence ORDER BY
is not required. Details for \crosstabview
in the manual. There are more code examples at the bottom of that page.
Related answer on dba.SE by Daniel Vérité (the author of the psql feature):