Search code examples
sqlconditional-statementslogic

how to get individual-clinic-month that are excluded from SQL query


I have the following dataset:

individual  |  clinic_1   | clinic_2    | month      | address_recorded | address_code
1           |  A          | B           | 01-01-2016 | 01-02-1999       | C01
1           |  A          | A           | 01-01-2016 | 01-02-2003       | C02
1           |  A          | A           | 01-01-2016 | 01-02-2001       | C06
1           |  A          | X           | 01-01-2016 | 01-02-2000       | C03
2           |  C          | B           | 01-04-2016 | 01-02-1999       | D04
2           |  C          | A           | 01-04-2016 | 01-02-2001       | D05
2           |  C          | X           | 01-04-2016 | 01-02-2000       | D06

I would like to get:

individual  |  clinic_1   | month      | address_code
1           |  A          | 01-01-2016 | C02
2           |  C          | 01-04-2016 | D05

Criteria:

  • For unique set of individual-clinic_1-month with clinic_1 = clinic_2, select the most recent date in which address was recorded within clinic_1
  • For unique set of individual-clinic_1-month with NO instances where clinic_1 = clinic_2, select the most recent date in which address was recorded across clinics

I thought about doing:

with cte_1
as
(
select * from table
where clinic_1 = clinic_2
)
,cte_2
as
(
select row_number () over (Partition by clinic_1, individual, month order by clinic_1, individual, month, address_recorded desc) as number, *
from cte_1
)
select individual, clinic_1, month, address_code from cte_2 where number = 1

But I don't know how to get those individual-clinic_1-month for which there are no instances where clinic_1=clinic_2, any ideas?


Solution

  • You can Union two select queries; one to select all records where clinic_1=clinic_2 and another one to select all records where clinic_1<>clinic_2 and clinic_1 not in the results set of the first query.

    Both queries are grouped by [individual],[clinic_1], [clinic_2], [mnth] to find all of the required data rows for each [clinic_1] - [mnth] entry. Noting that for the 2nd query [clinic_2] is selected as ''.

    Check the following:

    with cte as
    (SELECT  [individual] ,[clinic_1],[clinic_2],[mnth],max([address_recorded]) as m
      FROM [MyData] where [clinic_1]=[clinic_2]
      group by [individual],[clinic_1],[clinic_2] ,[mnth] 
    ),
    cte2 as
    (SELECT  [MyData].[individual] ,[MyData].[clinic_1],'' as [clinic_2],[MyData].[mnth],max([MyData].[address_recorded]) as m
      FROM [MyData] 
      Left Join cte on cte.individual=MyData.individual
      and cte.mnth=MyData.mnth
      where [MyData].[clinic_1]<>[MyData].[clinic_2] and cte.individual IS NULL
      group by [MyData].[individual],[MyData].[clinic_1], [MyData].[mnth] 
    ),
    D as
    (SELECT * FROM cte
    UNION
    SELECT * FROM cte2)
    , 
    LastQr as(
    select [MyData].individual, [MyData].clinic_1,[MyData].mnth,[MyData].address_code,
    row_number() OVER(PARTITION BY [MyData].individual, [MyData].clinic_1,[MyData].mnth ORDER BY [MyData].individual, [MyData].clinic_1,[MyData].mnth)
    as rn from D
    INNER JOIN [MyData]
    ON D.individual=MyData.individual and D.clinic_1=MyData.clinic_1 and D.mnth=MyData.mnth and D.m=MyData.address_recorded
    and (D.clinic_2=MyData.clinic_2 or D.clinic_2='')
    )
    
    select * from LastQr where rn=1
    

    See the results from dbfiddle.uk.