Search code examples
sqlpostgresqlpivotcrosstab

Pivot table with multiple value columns


I have a Postgres table with products data from different manufacturers, here the simplified table structure:

CREATE TABLE test_table (
  sku               text,
  manufacturer_name text,
  price             double precision,
  stock             int
);

INSERT INTO test_table
VALUES ('sku1', 'Manufacturer1', 110.00, 22),
       ('sku1', 'Manufacturer2', 120.00, 15),
       ('sku1', 'Manufacturer3', 130.00, 1),
       ('sku1', 'Manufacturer3', 30.00, 11),
       ('sku2', 'Manufacturer1', 10.00, 2),
       ('sku2', 'Manufacturer2', 9.00,  3),
       ('sku3', 'Manufacturer2', 21.00, 3),
       ('sku3', 'Manufacturer2', 1.00, 7),
       ('sku3', 'Manufacturer3', 19.00, 5);

I need to output each Manufacturer for each sku but if there are several identical manufacturers for the same sku I need to select the Manufacturer with the lowest price (note that I also need to include 'stock' column), here desired results:

| sku  | man1_price | man1_stock | man2_price | man2_stock | man3_price | man3_stock |
|------|------------|------------|------------|------------|------------|------------|
| sku1 | 110.0      | 22         | 120.0      | 15         | 30.0       | 11         |
| sku2 | 10.0       | 2          | 9.0        | 3          |            |            |
| sku3 |            |            | 1.0        | 7          | 19.0       | 5          |

I tried to use Postgres crosstab():

SELECT *
FROM crosstab('SELECT sku, manufacturer_name, price
              FROM test_table
              ORDER BY 1,2',
              $$ SELECT DISTINCT manufacturer_name FROM test_table ORDER BY 1 $$
       )
       AS ct (sku text, "man1_price" double precision,
              "man2_price" double precision,
              "man3_price" double precision
    );

But this produces a table with only one price column. And I didn't find a way to include the stock column.

I also tried to use conditional aggregation:

SELECT sku,
   MIN(CASE WHEN manufacturer_name = 'Manufacturer1' THEN price END) as man1_price,
   MIN(CASE WHEN manufacturer_name = 'Manufacturer1' THEN stock END) as man1_stock,
   MIN(CASE WHEN manufacturer_name = 'Manufacturer2' THEN price END) as man2_price,
   MIN(CASE WHEN manufacturer_name = 'Manufacturer2' THEN stock END) as man2_stock,
   MIN(CASE WHEN manufacturer_name = 'Manufacturer3' THEN price END) as man3_price,
   MIN(CASE WHEN manufacturer_name = 'Manufacturer3' THEN stock END) as man3_stock
FROM test_table
GROUP BY sku
ORDER BY sku

And this query also doesn't work in my case - it simply selects min stock level - but if there are few the same Manufacturers for the same sku but with different prices/stocks - this query select min price from one manufacturer and min stock from another.

How can I output each manufacturer's price and corresponding stock from that table?

P.S. Thank you all for such helpful answers. My Postgres table is rather small - there no more than 15k of products, (I don't know if such numbers can be useful for proper comparing) but since Erwin Brandstetter asked to compare different queries performance I ran 3 queries with EXPLAIN ANALYZE, here is their execution time:

Erwin Brandstetter query:        400 - 450 ms 
Kjetil S query:                  250 - 300 ms
Gordon Linoff query:             200 - 250 ms
a_horse_with_no_name query:      250 - 300 ms

Again - I'm not sure if those numbers can be useful as a reference. For my case, I chose the combined version of Kjetil S and Gordon Linoff queries but Erwin Brandstetter and a_horse_with_no_name variants are also very useful and interesting. It's worth noting that if my table in the future would end up having more then few Manufacturers - adjusting query and typing their names each time would be tiresome - and hence the query from a_horse_with_no_name answer would be the most convenient to use.


Solution

  • Your last select almost works. But you should add a where condition where rows with non-minimum prices per sku per manufacturer are removed. This produces your expected result:

    select
      sku,
      min( case when manufacturer_name='Manufacturer1' then price end ) man1_price,
      min( case when manufacturer_name='Manufacturer1' then stock end ) man1_stock,
      min( case when manufacturer_name='Manufacturer2' then price end ) man2_price,
      min( case when manufacturer_name='Manufacturer2' then stock end ) man2_stock,
      min( case when manufacturer_name='Manufacturer3' then price end ) man3_price,
      min( case when manufacturer_name='Manufacturer3' then stock end ) man3_stock
    from test_table t
    where not exists (
        select 1 from test_table
        where sku=t.sku
        and manufacturer_name=t.manufacturer_name
        and price<t.price
    )
    group by sku
    order by 1;