Search code examples
sqloracle11gpivot-tablerollup

How to use ROLLUP, RANK() with pivot table in Oracle11g


Table schema

CREATE TABLE customer ( 
 id          NUMERIC, 
 lname       VARCHAR (30), 
 fname       VARCHAR (30) NOT NULL, 
 street      VARCHAR (30) NOT NULL, 
 city        VARCHAR (30) NOT NULL, 
 zipcode     NUMERIC (5) NOT NULL, 
 state       VARCHAR (2) NOT NULL, 
 phone       VARCHAR (12) NOT NULL, 
 creditscore NUMERIC, 
 credit_org  VARCHAR (30), 
 cs_date     DATE, 
 CONSTRAINT customer_pk PRIMARY KEY (id) 
);

Requirement:

Part 1: Create a pivot table to list the number of customers by location ('PA', 'CA', 'NY', 'MD') and also by creditscore range. For creditscore range, create 3 segments, 'LOWER RANGE(500-600)' defined as those with credit score between 500-600, 'AVERAGE RANGE(600-700)' defined as those with credit score between 600-700, 'PREMIUM RANGE(700+)' defined as those with credit score of 700+. The pivot table result should include 4 columns for states and 3 rows for creditscore range.

Part 2: Add a total row to give total no of customers for respective states Rank the states based on the total no of customers.

Extra requirement Rank the states based on the total no of customers first in premium range, then average range and then lower range.

My Take So far

SELECT * FROM (
  SELECT case
           when CREDITSCORE between 500 and 600 then 'LOWER RANGE(500-600)'
           when CREDITSCORE <= 700 then 'AVERAGE RANGE(600-700)'
           else 'PREMIUM RANGE(700+)'
    end as CREDITSCORE_RANGE,
    state
  FROM customer
) 
PIVOT (
  count(state) FOR state IN ('PA', 'CA', 'NY', 'MD')
);

It is correct to the Part 1

My question is How to tackle Part 2 and Is it even possible?

Edit 1:

MY take on above using DECODE

SELECT DECODE(case when CREDITSCORE <= 600 then 'LOWER RANGE(500-600)' when CREDITSCORE <= 700 then 'AVERAGE RANGE(600-700)' else 'PREMIUM RANGE(700+)' end, NULL, 'TOTAL',case when CREDITSCORE between 500 and 600 then 'LOWER RANGE(500-600)' when CREDITSCORE <= 700 then 'AVERAGE RANGE(600-700)' else 'PREMIUM RANGE(700+)' end) "CREDITSCORE_RANGE", SUM(DECODE(state, 'PA', 1, 0)) "Pennsylvania", SUM(DECODE(state, 'CA', 1, 0)) "California", SUM(DECODE(state, 'NY', 1, 0)) "New York", SUM(DECODE(state, 'MD', 1, 0)) "Maryland", count(CREDITSCORE) "Total No of Customers", RANK() OVER (ORDER BY Count(CREDITSCORE) DESC) "RANK BY NO OF CUSTOMERS" FROM customer GROUP BY ROLLUP(case when CREDITSCORE between 500 and 600 then 'LOWER RANGE(500-600)' when CREDITSCORE <= 700 then 'AVERAGE RANGE(600-700)' else 'PREMIUM RANGE(700+)' end);


Solution

  • I don't think your ranges are quite right for the query you've written, though the wording of the assignment is ambiguous as 'between' is inclusive - so as the question is worded, a credit score of exactly 600 would appear in both the 'lower' and 'average' brackets. Your version will put 600 in the 'lower' bracket, but it's debatable which it should be in; I'd think it should be 'average' from the other definitions, but it's unclear. There is no bracket in the question for scores less than 500, but if you have any of those then your current code will include them in the 'average' bracket, since they are less than 700 but not between 500 and 600.

    So I'd have interpreted that as:

    SELECT * FROM (
      SELECT case
               when CREDITSCORE >= 500 and CREDITSCORE < 600 then 'LOWER RANGE(500-600)'
               when CREDITSCORE >= 600 and CREDITSCORE < 700 then 'AVERAGE RANGE(600-700)'
               when CREDITSCORE >= 700 then 'PREMIUM RANGE(700+)'
        end as CREDITSCORE_RANGE,
        state
      FROM customer
    ) 
    PIVOT (
      count(state) FOR state IN ('PA' as pa, 'CA' as ca, 'NY' as ny, 'MD' as md)
    );
    

    Your question title refers to ROLLUP, and to get the total row you can use that function:

    SELECT creditscore_range, sum(pa) AS pa, sum(ca) AS ca, sum(ny) AS ny, sum(md) AS md
    FROM (
      SELECT * FROM (
        SELECT CASE
                 WHEN creditscore >= 500 AND creditscore < 600 THEN 'LOWER RANGE(500-600)'
                 WHEN creditscore >= 600 AND creditscore < 700 THEN 'AVERAGE RANGE(600-700)'
                 WHEN creditscore >= 700 THEN 'PREMIUM RANGE(700+)'
          END AS creditscore_range,
          state
        FROM customer
      ) 
      PIVOT (
        COUNT(state) FOR state IN ('PA' AS pa, 'CA' AS ca, 'NY' AS ny, 'MD' AS md)
      )
    )
    GROUP BY ROLLUP (creditscore_range);
    

    If you do have any scores below 500 then Both will include a line for those with the creditscore_range as null; which is confusing with the ROLLUP version. You may want to filter any scores less than 500 out from the innermost query, but again it isn't clear if that is necessary or desirable.

    I'm not sure that the assignment is looking for when it talks about ranking though. That implies changing the column order based on the values they contain. Ranking by state would make more sense if the data was pivoted the other way.