I have the following table:
DROP TABLE IF EXISTS #df
CREATE TABLE #df
(
CommID VARCHAR(10),
ProvID VARCHAR(5),
VisitCount int,
[% Score] INT,
TimePeriod VARCHAR(10),
Median_VisitCount INT,
Average_VisitCount INT
);
INSERT INTO #df (CommID, ProvID, VisitCount, [% Score], TimePeriod, Median_VisitCount, Average_VisitCount)
VALUES
('AB345', '001', '65', .45, 'ThisYear', 48.5, 42),
('AB345', '002', '67', .64, 'ThisYear', 48.5, 42),
('AB345', '003', '32', .78, 'ThisYear', 48.5, 42),
('AB345', '004', '4', .32, 'ThisYear', 48.5, 42),
('AB345', '001', '23', .45, 'LastYear', 42.5, 41),
('AB345', '002', '56', .64, 'LastYear', 48.5, 41),
('AB345', '003', '31', .78, 'LastYear', 48.5, 41),
('AB345', '004', '54', .32, 'LastYear', 48.5, 41)
SELECT * FROM #df
And would like my final output to be like this:
DROP TABLE IF EXISTS #final
CREATE TABLE #final
(
CommID VARCHAR(10),
ProvID VARCHAR(5),
VisitCount int,
[% Score] INT,
TimePeriod VARCHAR(10),
Median_VisitCount INT,
Average_VisitCount INT,
Highest INT,
Lowest INT
);
INSERT INTO #final (CommID, ProvID, VisitCount, [% Score], TimePeriod, Median_VisitCount, Average_VisitCount, Highest, Lowest)
VALUES
('AB345', '001', '65', .45, 'ThisYear', 48.5, 42, 3, 1),
('AB345', '002', '67', .64, 'ThisYear', 48.5, 42, 2, 2),
('AB345', '003', '32', .78, 'ThisYear', 48.5, 42, 1, 3),
('AB345', '004', '4', .32, 'ThisYear', 48.5, 42, NULL, NULL),
('AB345', '001', '23', .45, 'LastYear', 42.5, 41, NULL, NULL),
('AB345', '002', '56', .64, 'LastYear', 48.5, 41, 1, 2),
('AB345', '003', '31', .78, 'LastYear', 48.5, 41, NULL, NULL),
('AB345', '004', '54', .32, 'LastYear', 48.5, 41, 2, 1)
SELECT * FROM #final
For a given CommID and TimePeriod, I would like to rank the [% Score] but only for [VisitCounts] >= Average_VisitCount.
This is the code that I have written but the ranking is still considering the values that are below the Average_VisitCount. I would like any rows with visitcounts less than the AverageVisit count to not be considered in the ranking:
SELECT a.CommID
, a.ProvID
, a.VisitCount
, a.[% Score]
, CASE WHEN VisitCount >= a.Average_VisitCount
THEN RANK() OVER (PARTITION BY a.CommID, TimePeriod ORDER BY [a].[% Score] DESC)
ELSE NULL END AS Highest
, CASE WHEN VisitCount >= a.Average_VisitCount
THEN RANK() OVER (PARTITION BY a.CommID, TimePeriod ORDER BY [a].[% Score])
ELSE NULL END AS Lowest
, a.TimePeriod
, a.Median_VisitCount
, a.Average_VisitCount
FROM #df a
ORDER BY a.CommID, a.TimePeriod, a.VisitCount DESC
Your demo data specified [% Score]
as an INT
, so I changed it to DECIMAL(5,2)
, otherwise every row scored 0
.
You're most of the way there. Basically, rank those that fall out of your range at the bottom so they don't interfere with the ranking you're interested in, and either leave them at the bottom of the ranking, or cause them to display a NULL:
SELECT *, CASE WHEN VisitCount >= Average_VisitCount THEN DENSE_RANK() OVER (PARTITION BY CommID, TimePeriod ORDER BY CASE WHEN VisitCount >= Average_VisitCount THEN [% Score] ELSE 999 END) END
FROM #df
The inner CASE
expression ranks them using the value of 999 (some arbitrary out of range value), the outer CASE
expression causes the column to return NULL for those values.
CommID | ProvID | VisitCount | % Score | TimePeriod | Median_VisitCount | Average_VisitCount | Ranking |
---|---|---|---|---|---|---|---|
AB345 | 004 | 54 | 0.32 | LastYear | 48 | 41 | 1 |
AB345 | 002 | 56 | 0.64 | LastYear | 48 | 41 | 2 |
AB345 | 003 | 31 | 0.78 | LastYear | 48 | 41 | |
AB345 | 001 | 23 | 0.45 | LastYear | 42 | 41 | |
AB345 | 001 | 65 | 0.45 | ThisYear | 48 | 42 | 1 |
AB345 | 002 | 67 | 0.64 | ThisYear | 48 | 42 | 2 |
AB345 | 003 | 32 | 0.78 | ThisYear | 48 | 42 | |
AB345 | 004 | 4 | 0.32 | ThisYear | 48 | 42 |