Search code examples
sqlsql-servert-sqlduplicatessql-delete

How to delete duplicate rows that are exactly the same in SQL Server


I loaded some data into a SQL Server table from a .CSV file for test purposes, I don't have any primary key, unique key or auto-generated ID in that table.

Helow is an example of the situation:

select * 
from people
where name in (select name 
               from people
               group by name
               having count(name) > 1)

When I run this query, I get these results:

enter image description here

The goal is to keep one row and remove other duplicate rows.

Is there any way other than save the content somewhere else, delete all duplicate rows and insert a new one?

Thanks for helping!


Solution

  • You could use an updatable CTE for this.

    If you want to delete rows that are exact duplicates on the three columns (as shown in your sample data and explained in the question):

    with cte as (
        select row_number() over(partition by name, age, gender order by (select null)) rn
        from people
    )
    delete from cte where rn > 1
    

    If you want to delete duplicates on name only (as shown in your existing query):

    with cte as (
        select row_number() over(partition by name order by (select null)) rn
        from people
    )
    delete from cte where rn > 1