I have a query that should pivot a table with large number of values in a certain column that should be transformed into the column names of the pivoted table, i.e.,
SELECT *
FROM crosstab(
$$SELECT user_id, item_id, rating
FROM tbl
ORDER BY 1,2$$
) AS ct ("user_id" varchar(50), "select distinct item_id from tbl ORDER BY 1" varchar(50)[]);
This attempt results in an error: return and sql tuple descriptions are incompatible. I am sure that the type of user_id and item_id columns is varchar(50). However, I am not sure what should be the type of the second argument that is a query.
This query has been generated following the example given in the first answer at PostgreSQL Crosstab Query. It mentions that the second argument can be a query.
Any help to fix this problem would be great. Thank you.
Edit:
For the purpose of providing MNE, the table tbl can be created by:
CREATE TABLE tbl (
user_id varchar(50)
, item_id varchar(50)
, rating integer
);
INSERT INTO tbl VALUES
('A', 'item_0', 1), ('A', 'item_3', 2)
, ('B', 'item_1', 4), ('B', 'item_2', 5)
, ('C', 'item_0', 3);
The desired outcome is:
user_id item_0 item_1 item_2 item_3
A 1 2
B 4 5
C 3
Note that, in the actual scenario, the number of distinct values in the item_id column of table tbl is about 2000.
crosstab
is an overloaded function and it seems that the version you're after is the one that takes two string arguments: crosstab ( source_sql text, category_sql text ) → setof record
.
So your query should look something like (I can't test it since your questions doesn't provide a sample input and output):
select * from crosstab(
-- 1st query: source_sql
'SELECT user_id, item_id, rating FROM tbl ORDER BY 1,2',
-- 2nd query: category_sql
'select distinct item_id from tbl ORDER BY 1'
) AS (user_id varchar(50), ...);
The two query strings are actually arguments to the crosstab
call, not part of the specifier situated after the AS
keyword.
Edit: Here's a version of the query that works on your sample data, but it also shows that the number of categories needs to be fixed:
select * from crosstab(
'select * from tbl',
'select distinct item_id from tbl order by item_id asc limit 4'
) as (
user_id varchar(50),
-- The number of items here must match the LIMIT clause above and you must
-- make sure that the query will indeed return 4 item_id's.
item_1 varchar(50),
item_2 varchar(50),
item_3 varchar(50),
item_4 varchar(50)
);