Search code examples
sqlsql-serversql-update

How to use one of the variables to distinguish duplicated rows


For example: we have 3*2 duplicated rows as follows:

name   identity   gender
Mary   student    female 
Mary   student    female
Mary   student    female
Jack   teacher    male
Jack   teacher    male
Jack   teacher    male

I wanna make those 3 rows as follows:

name    identity   gender
Mary1   student    female 
Mary2   student    female
Mary3   student    female
Jack1   teacher    male
Jack2   teacher    male
Jack3   teacher    male

How could I do it? Thanks

I try to use create function tvValues, but it didn't work.


Solution

  • One approach uses row_number():

    select t.*,
        case when count(*) over(partition by name, identity, gender) > 1
            then concat(name, row_number() over(partition by name, identity, gender order by name))
            else name
        end as new_name
    from mytable t
    

    This appends a sequential number to names that belong to a duplicate row.

    We can turn the statement to an update with an updatable CTE :

    with cte as (
        select name,
            count(*) over(partition by name, identity, gender) cnt,
            row_number() over(partition by name, identity, gender order by name) rn
        from mytable
    )
    update cte
    set name = concat(name, rn)
    where cnt > 1