Search code examples
postgresqljoinpivotsubqueryinner-join

How can I combine the two select queries on the same table horizontally in Postgresql?


everyone. I am a beginner of Postgresql. Recently I met with one question. I have one table named 'sales'.

create table sales
    (
        cust    varchar(20),
        prod    varchar(20),
        day integer,
        month   integer,
        year    integer,
        state   char(2),
        quant   integer
    );
insert into sales values ('Bloom', 'Pepsi', 2, 12, 2001, 'NY', 4232);
insert into sales values ('Knuth', 'Bread', 23, 5, 2005, 'PA', 4167);
insert into sales values ('Emily', 'Pepsi', 22, 1, 2006, 'CT', 4404);
insert into sales values ('Emily', 'Fruits', 11, 1, 2000, 'NJ', 4369);
insert into sales values ('Helen', 'Milk', 7, 11, 2006, 'CT', 210);
insert into sales values ('Emily', 'Soap', 2, 4, 2002, 'CT', 2549);
insert into sales values ('Bloom', 'Eggs', 30, 11, 2000, 'NJ', 559);

.... There are 498 rows in total. Here is the overview of this table:

enter image description here

Now I want to compute the maximum and minimum sales quantities for each product, along with their corresponding customer (who purchased the product), dates (i.e., dates of those maximum and minimum sales quantities) and the state in which the sale transaction took place. And the average sales quantity for the corresponding products.

The combined one should be like this:

enter image description here

It should have 10 rows because there are 10 distinct products in total.

I have tried:

select prod,
       max(quant),
       cust as MAX_CUST
from sales
group by prod;

but it returned an error and said the cust should be in the group by. But I only want to classify by the type of product.

What's more, how can I horizontally combine the max_q and its customer, date, state with min_q and its customer, date, state and also the AVG_Q by their product name? I feel really confused!


Solution

  • You can use analytic function ROW_NUMBER to rank records by increasing/decreasing sales for each product in a subquery, and then do conditional aggregation:

    SELECT
        prod product,
        MAX(CASE WHEN rn2 = 1 THEN quant END) max_quant,
        MAX(CASE WHEN rn2 = 1 THEN cust END) max_cust,
        MAX(CASE WHEN rn2 = 1 THEN TO_DATE(year || '-' || month || '-' || day, 'YYYY-MM-DD') END) max_date,
        MAX(CASE WHEN rn2 = 1 THEN state END) max_state,
        MAX(CASE WHEN rn1 = 1 THEN quant END) min_quant,
        MAX(CASE WHEN rn1 = 1 THEN cust END) min_cust,
        MAX(CASE WHEN rn1 = 1 THEN TO_DATE(year || '-' || month || '-' || day, 'YYYY-MM-DD') END) min_date,
        MAX(CASE WHEN rn1 = 1 THEN state END) min_state,
        avg_quant
    FROM (
        SELECT
            s.*,
            ROW_NUMBER() OVER(PARTITION BY prod ORDER BY quant) rn1,
            ROW_NUMBER() OVER(PARTITION BY prod ORDER BY quant DESC) rn2,
            AVG(quant) OVER(PARTITION BY prod) avg_quant
        FROM sales s
    ) x
    WHERE rn1 = 1 OR rn2 = 1
    GROUP BY prod, avg_quant