Search code examples
sqlrow-numberdense-rank

Potential Use of DENSE_RANK() in SQL


In SQL I'm looking for a potential function which I can use to apply a row number each time a certain criteria is met.

Each time the Unique ID is the same (i.e. same person) and the Dep_Group is WES then give that row the same number if they follow on from one another. What I'm trying to do is eventually squish those rows into one, hopefully by adding in a GROUP BY New_Col clause to the end. So in essence Line 1 & 2 become one line, also Line 5 & 6.

|Line| Uniq_ID | Start_DT      | End_DT        | Sequence | Department | Dep_Group | New_Col|
| 1  |   1     | 2023-01-02    | 2023-01-03    |    1     |     WES    |    WES    |     1  |
| 2  |   1     | 2023-01-02    | 2023-01-03    |    2     |     WES    |    WES    |     1  |
| 3  |   1     | 2023-01-02    | 2023-01-03    |    3     |     DEM    |    NULL   |     2  |
| 4  |   2     | 2023-01-02    | 2023-01-03    |    1     |     WES    |    WES    |     3  |
| 5  |   3     | 2023-01-02    | 2023-01-03    |    1     |     WES    |    WES    |     4  |
| 6  |   3     | 2023-01-02    | 2023-01-03    |    2     |     WES    |    WES    |     4  |
| 7  |   4     | 2023-01-02    | 2023-01-03    |    1     |     DEM    |    NULL   |     5  |

Example of the code I have so far which is not quite working as I wanted with DENSE_RANK():

SELECT Line
       ,Uniq_ID
       ,Start_DT
       ,End_DT
       ,Sequence
       ,Department
       ,Dep_Group
       ,DENSE_RANK() OVER (PARTITION BY Uniq_ID, Dep_Group ORDER BY Sequence) AS New_Col

FROM TblA

Solution

  • It is fairly straight forward: use running sum which increments if not (previous group = wes and current group = wes):

    with cte1 as (
      select *
           , case when lag(dep_group) over (partition by uniq_id order by sequence) = 'WES' and dep_group = 'WES' then 0 else 1 end as incr
      from t
    ), cte2 as (
      select *
           , sum(incr) over (partition by uniq_id order by sequence) as grp_num
       from cte1
    )
    select *
    from cte2
    

    This gives you the following result:

    line uniq_id start_dt end_dt sequence department dep_group incr grp_num
    1 1 2023-01-02 2023-01-03 1 WES WES 1 1
    2 1 2023-01-02 2023-01-03 2 WES WES 0 1
    3 1 2023-01-02 2023-01-03 3 DEM null 1 2
    4 2 2023-01-02 2023-01-03 1 WES WES 1 1
    5 3 2023-01-02 2023-01-03 1 WES WES 1 1
    6 3 2023-01-02 2023-01-03 2 WES WES 0 1
    7 4 2023-01-02 2023-01-03 1 DEM null 1 1

    The grp_num column can be used to group rows along with uniq_id.