I am using SQL Server.
Here is my table:
RepID ICD Diagnosed RandStat
-------------------------------
1001 A 1 11
1002 A 0 11
1003 A 0 11
1005 A 1 11
1003 A 1 22
1005 A 0 22
1008 A 1 22
1009 A 0 22
1001 A 0 33
1004 A 1 33
1005 A 0 33
1007 A 0 33
1003 A 0 44
1004 A 0 44
1005 A 1 44
1007 A 1 44
1001 B 0 11
1004 B 1 11
1006 B 1 11
1008 B 1 11
1004 B 0 22
1006 B 1 22
1008 B 0 22
1009 B 1 22
1001 B 0 33
1002 B 1 33
1003 B 1 33
1006 B 0 33
1003 B 1 44
1004 B 0 44
1005 B 0 44
1007 B 1 44
I want to filter out the repeated RepIDs for the same ICD based on the first RandStat (ordered by RandStat
).
So if RepID appeared on RandStat 11, then it should be removed from RandStat 22 or above (all for the same ICD)
The final results should look like this:
RepID ICD Diagnosed RandStat
-------------------------------
1001 A 1 11
1002 A 0 11
1003 A 0 11
1005 A 1 11
1008 A 1 22
1009 A 0 22
1004 A 1 33
1007 A 0 33
1001 B 0 11
1004 B 1 11
1006 B 1 11
1008 B 1 11
1009 B 1 22
1002 B 1 33
1003 B 1 33
1005 B 0 44
1007 B 1 44
I tried this query
SELECT A.*
FROM MyTable A
INNER JOIN MyTable B ON A.RepID = B.RepID
AND A.ICD = B.ICD
AND A.RandStat < B.RandStat
but this took me nowhere....
Any idea how to do that?
We can use ROW_NUMBER
here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY ICD, RepID ORDER BY RandStat) rn
FROM yourTable
)
SELECT RepID, ICD, Diagnosed, RandStat
FROM cte
WHERE rn = 1
ORDER BY ICD, RandStat, RepID;