Search code examples
sqlsql-serversql-updategreatest-n-per-groupsql-delete

How do I merge and delete duplicated rows in SQL using UPDATE?


For example, I have a table of:

id | code | name | type | deviceType
---+------+------+------+-----------
1  | 23   | xyz  | 0    | web
2  | 23   | xyz  | 0    | mobile
3  | 24   | xyzc | 0    | web
4  | 25   | xyzc | 0    | web

I want the result to be:

id | code | name | type | deviceType
---+------+------+------+-----------
1  | 23   | xyz  | 0    | web&mobile
2  | 24   | xyzc | 0    | web
3  | 25   | xyzc | 0    | web

How do I do this in SQL Server using UPDATE and DELETE statements?

Any help is greatly appreciated!


Solution

  • To do what you want, you would need two separate statements.

    This updates the "first" row of each group with all the device types in the group:

    update t
    set t.devicetype = t1.devicetype
    from mytable t
    inner join (
        select min(id) as id, string_agg(devicetype, '&') within group(order by id) as devicetype
        from mytable
        group by code, name, type
        having count(*) > 1
    ) t1 on t1.id = t.id
    

    This deletes everything but the first row per group:

    with t as (
        select row_number() over(partition by code, name, type order by id) rn
        from mytable
    ) 
    delete from t where rn > 1
    

    Demo on DB Fiddle