Search code examples
sqlsasteradatarow-number

Resetting ROW_NUMBER count based on 3 month gap in date


Looking for a solution on how to start the ROW_NUMBER count over if there is a gap in a date field (clm_line_srvc_strt_dt) of more than 3 months.

Currently using the following for the row count:

,ROW_NUMBER() 
OVER(PARTITION BY c.src_sbscrbr_id
, c.src_mbr_sqnc_nbr
, cl.hlth_srvc_cd 
, df.serv_prov_id
ORDER BY c.src_sbscrbr_id
, c.src_mbr_sqnc_nbr
, cl.hlth_srvc_cd 
, df.serv_prov_id
, cl.clm_line_srvc_strt_dt) as rncnt

For example: Member ID 011 has three records with the clm_line_srvc_strt_dt field being 28Dec2017, 28Apr2018, and 28Jul2018.

Currently the row count for these 3 lines are 1, 2, 3.

The row count should reset to 1 with the 28Apr2018 line since the gap between Dec 28 2017 and Apr 28 2018 is greater than 3 months. So it should be 1, 1, 2, with the third line continuing to count since the gap between 28Apr2018 and 28Jul2018 is not greater than 3 months.


Solution

  • Teradata supports a proprietary extension to Windowed Aggregates, RESET WHEN, which adds a kind of dynamic partition:

    Row_Number() 
    Over(PARTITION BY c.src_sbscrbr_id
                    , c.src_mbr_sqnc_nbr
                    , cl.hlth_srvc_cd 
                    , df.serv_prov_id
         ORDER BY  cl.clm_line_srvc_strt_dt
         -- restart the row number when the previous date is more than 3 months ago
         RESET WHEN Min(clm_line_srvc_strt_dt)
                    Over (PARTITION BY c.src_sbscrbr_id
                                     , c.src_mbr_sqnc_nbr
                                     , cl.hlth_srvc_cd 
                                     , df.serv_prov_id
                          ORDER BY cl.clm_line_srvc_strt_dt
                          ROWS BETWEEN 1 Preceding AND 1 Preceding) 
                   < Add_Months(cl.clm_line_srvc_strt_dt, -3)) AS rncnt
    

    Based on the error message in Gordon's answer your Teradata version doesn't support LAG, yet (must be 16.10+). The MIN is the same as:

                    LAG(clm_line_srvc_strt_dt)
                    Over (PARTITION BY c.src_sbscrbr_id
                                     , c.src_mbr_sqnc_nbr
                                     , cl.hlth_srvc_cd 
                                     , df.serv_prov_id
                          ORDER BY cl.clm_line_srvc_strt_dt)  
    

    Btw, there's no need to ORDER by the same columns you already use in PARTITION (within a partition it's the same value anyway)