Tags: sql, sql-server, sql-server-2008, sql-server-2008-express

SQL: Deleting duplicate records in SQL Server


I have a SQL Server database that I pre-loaded with a ton of rows of data.

Unfortunately, there is no primary key on the table, and it now contains duplicate rows. I'm not concerned about the missing primary key, but I am concerned about the duplicates.

Any thoughts? (Forgive me for being a SQL Server newb.)


Solution

  • Well, this is one reason why you should have a primary key on the table. What version of SQL Server? For SQL Server 2005 and above:

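    ;WITH r AS
    (
        SELECT col1, col2, col3, -- whatever columns make a "unique" row
               -- number the rows within each group of identical values
               rn = ROW_NUMBER() OVER (PARTITION BY col1, col2, col3 ORDER BY col1)
        FROM dbo.SomeTable
    )
    DELETE r WHERE rn > 1; -- keep row 1 of each group, delete the extra copies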
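
    If you want to see what that DELETE is going to remove before you run it, a grouped count is a cheap check. This is only a sketch and uses the same placeholder names as above (dbo.SomeTable, col1, col2, col3); swap in your own table and columns:

    -- list each duplicated combination and how many copies of it exist
    SELECT col1, col2, col3, COUNT(*) AS copies
    FROM dbo.SomeTable
    GROUP BY col1, col2, col3
    HAVING COUNT(*) > 1;

    For each row this returns, the DELETE should remove copies - 1 rows. Running the same query again afterwards should return nothing.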
    

    Then, so you don't have to do this again tomorrow, and the next day, and the day after that, declare a primary key on the table.
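
    A minimal sketch of that last step, assuming the same placeholder columns are what make a row unique and that all of them are declared NOT NULL (a primary key can't include nullable columns). The constraint name PK_SomeTable is made up for illustration:

    -- run this after the duplicates are gone; it fails while any remain
    ALTER TABLE dbo.SomeTable
        ADD CONSTRAINT PK_SomeTable PRIMARY KEY (col1, col2, col3);

    If some of those columns allow NULLs, a UNIQUE constraint on the same columns is one alternative. Either way, an insert that would create a duplicate is rejected with an error from then on.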