I have a table as follows and I want to run a script to return a single PLAN_ID value for each CARE_ID and these are the conditions. How would I do this? I'm writing the script in SQL Server 2005 but it needs to be backwards compatible with SQL Server 2000.
CARE_ID
has only one PLAN_ID
then return that PLAN_ID
CARE_ID
has multiple PLAN_ID
s and the value of FIRST_TREATMENT_DATE
is NULL
then return the highest value PLAN_ID
CARE_ID
has multiple PLAN_ID
s and the value of FIRST_TREATMENT_DATE
is not NULL
then return the PLAN_ID
which has the most recent date for N5_2_MDT_DATE
My calculations say that the script would return the PLAN_ID
values of
1833,65,162,2929,67,93,44,1136,1046,1047,1048,1049,1050,1052,1870,2426
Thank you
PLAN_ID CARE_ID N5_2_MDT_DATE FIRST_TREATMENT_DATE
1833 1 20/08/2011 00:00 NULL
199 1 23/06/2010 00:00 NULL
65 4 27/11/2009 00:00 NULL
162 5 30/07/2010 00:00 NULL
54 5 15/11/2009 00:00 NULL
55 5 29/10/2009 00:00 NULL
63 5 03/09/2009 00:00 NULL
2929 9 29/01/2013 00:00 NULL
99 9 08/03/2010 00:00 NULL
95 9 04/02/2010 00:00 NULL
64 9 18/11/2009 00:00 NULL
67 106 14/01/2013 00:00 NULL
96 106 20/07/2009 00:00 NULL
93 107 23/02/2010 00:00 21/09/2012 00:00
44 108 25/12/2009 00:00 NULL
43 108 07/10/2009 00:00 NULL
1136 364 18/02/2011 00:00 19/02/2011 00:00
1122 364 26/01/2011 00:00 19/02/2011 00:00
1046 1661 25/01/2011 00:00 25/01/2011 00:00
1047 1662 25/01/2011 00:00 25/01/2011 00:00
1048 1663 25/01/2011 00:00 01/02/2011 00:00
1049 1665 25/01/2011 00:00 NULL
1050 1666 23/01/2011 00:00 01/02/2011 00:00
1052 1667 01/02/2011 00:00 01/02/2011 00:00
1870 1781 04/10/2010 00:00 10/02/2011 00:00
1869 1781 04/10/2010 00:00 10/02/2011 00:00
1868 1781 04/10/2010 00:00 10/02/2011 00:00
2426 2246 23/03/2012 00:00 01/07/2012 00:00
2275 2246 01/01/2012 00:00 01/07/2012 00:00
2170 2246 14/10/2011 00:00 01/07/2012 00:00
1784 2246 04/08/2011 00:00 01/07/2012 00:00
1940 2246 10/07/2011 00:00 01/07/2012 00:00
1637 2246 20/06/2011 00:00 01/07/2012 00:00
1539 2246 02/06/2011 00:00 01/07/2012 00:00
1538 2246 01/06/2011 00:00 01/07/2012 00:00
1536 2246 31/05/2011 00:00 01/07/2012 00:00
I don't have a 2000 instance handy, but I think I've kept to compatible bits of language. I'm assuming that for the third condition, there won't be two plans with the same N5_2_MDT_DATE
.
I actually ran this as a single script with the data setup1 first and then the query, but am rearranging things so that the actual answer appears first:
select t1.CARE_ID,
CASE
--Cases one and two are identical, effectively
WHEN COUNT(*) = 1 OR MAX(t1.FIRST_TREATMENT_DATE) IS NULL
THEN MAX(t1.PLAN_ID)
ELSE MAX(CASE WHEN t1.N5_2_MDT_DATE = t2.LastDate THEN t1.PLAN_ID END)
END
from @t t1
inner join
(select CARE_ID,MAX(N5_2_MDT_DATE) as LastDate
from @t
group by CARE_ID
) t2
on t1.CARE_ID = t2.CARE_ID
group by t1.CARE_ID
You'll note that I've collapsed cases one and two since there's no harm in computing the highest plan number when there's only one plan to consider.
Also, we disagree on which plan to return for case 106, but I'm certain 96 is correct rather than 67, by the rules you've listed.
The logic is (mostly) all up in the CASE
expression in the select. If there's only a single row in the group or MAX(FIRST_TREATMENT_DATE)
across all rows is NULL
(which can only happen if all rows in the group have NULL
then we just return the MAX(PLAN_ID)
.
The more complex case is the third. To assist with this, I've got my subquery (t2
) which finds the highest N5_2_MDT_DATE
value for each CASE_ID
. We then use this in the ELSE
clause of the CASE
expression, inside another aggregate - where we try to ensure that we only actually consider a particular PLAN_ID
if its N5_2_MDT_DATE
column matches the highest one found by t2
- which should only happen once in each group, if the assumption I outlined above in the first paragraph holds.
For later versions of SQL Server, CTEs and the ROW_NUMBER()
function would make this substantially easier to write, I think.
1 Data setup of the table variable I used. This should appear first in the query window if you want to run the above query:
declare @t table (PLAN_ID int not null, CARE_ID int not null,
N5_2_MDT_DATE datetime not null,FIRST_TREATMENT_DATE datetime null)
insert into @t(PLAN_ID,CARE_ID,N5_2_MDT_DATE,FIRST_TREATMENT_DATE)
SELECT 1833, 1 ,'20110820',NULL union all
SELECT 199 , 1 ,'20100623',NULL union all
SELECT 65 , 4 ,'20091127',NULL union all
SELECT 162 , 5 ,'20100730',NULL union all
SELECT 54 , 5 ,'20091115',NULL union all
SELECT 55 , 5 ,'20091029',NULL union all
SELECT 63 , 5 ,'20090903',NULL union all
SELECT 2929, 9 ,'20130129',NULL union all
SELECT 99 , 9 ,'20100308',NULL union all
SELECT 95 , 9 ,'20100204',NULL union all
SELECT 64 , 9 ,'20091118',NULL union all
SELECT 67 , 106 ,'20130114',NULL union all
SELECT 96 , 106 ,'20090720',NULL union all
SELECT 93 , 107 ,'20100223','20120921' union all
SELECT 44 , 108 ,'20091225',NULL union all
SELECT 43 , 108 ,'20091007',NULL union all
SELECT 1136, 364 ,'20110218','20110219' union all
SELECT 1122, 364 ,'20110126','20110219' union all
SELECT 1046, 1661 ,'20110125','20110125' union all
SELECT 1047, 1662 ,'20110125','20110125' union all
SELECT 1048, 1663 ,'20110125','20110201' union all
SELECT 1049, 1665 ,'20110125',NULL union all
SELECT 1050, 1666 ,'20110123','20110201' union all
SELECT 1052, 1667 ,'20110201','20110201' union all
SELECT 1870, 1781 ,'20101004','20110210' union all
SELECT 1869, 1781 ,'20101004','20110210' union all
SELECT 1868, 1781 ,'20101004','20110210' union all
SELECT 2426, 2246 ,'20120323','20120701' union all
SELECT 2275, 2246 ,'20120101','20120701' union all
SELECT 2170, 2246 ,'20111014','20120701' union all
SELECT 1784, 2246 ,'20110804','20120701' union all
SELECT 1940, 2246 ,'20110710','20120701' union all
SELECT 1637, 2246 ,'20110620','20120701' union all
SELECT 1539, 2246 ,'20110602','20120701' union all
SELECT 1538, 2246 ,'20110601','20120701' union all
SELECT 1536, 2246 ,'20110531','20120701'