Search code examples
sqlsql-server-2000

How to Delete Duplicate Rows in SQL 2000?


I thought I had this figured out but it turns out I'm just deleting the first record. The following returns the duplicate rows. All have a count of 2. I just want to delete the first one for each duplicate record.

select scorestudentid, scoreadvisor, scorecorrect, count(*) 
from scores
where scoretestid = 3284
group by scorestudentid, scoreadvisor, scorecorrect
having count(scorestudentid) > 1

Which returns:

scorestudentid  scoreadvisor  scorecorrect  no column name
13033719        28059     3.0           2
13033777        28086     3.0           2
13033826        28147     3.0           2
13033960        28023     3.0           2

So I put this together thinking it would work:

set rowcount 1
delete
from scores
where scoretestid = 3284 
and scorestudentid in (
    select scorestudentid
    from scores
    where scoretestid = 3284
    group by scorestudentid
    having count(scorestudentid) > 1)

It really seems like it should be a simple concept but I'm not getting it.

Based on Thomas script I updated the query to fit but it still doesn't work.

Delete Scores
Where Exists    (
                Select 1
                From Scores As S2
                Where S2.ScoreStudentId = Scores.ScoreStudentId
                        And S2.ScoreAdvisor = Scores.ScoreAdvisor
                        And S2.ScoreCorrect = Scores.ScoreCorrect
                Group By S2.ScoreStudentId, S2.ScoreAdvisor, S2.ScoreCorrect
                Having Count(*) > 1
                    And Min(S2.NewScoreID) = Scores.NewScoreID
                )
    And Scores.ScoreTestId = 3284

Solution

  • The trick is using the Primary Key column (you do have one, correct?) and simply finding the first PK value that matches the criteria you want. If for some crazy reason you do not have a primary key column, then add an Identity column and make it the primary key and then do the delete.

    EDIT Revised to make it more generic. If you remove the final filter on ScoreTest, it will remove all duplicates based on ScoreStudentId, ScoreAdvisor and ScoreCorrect.

    Delete Scores
    Where Exists    (
                    Select 1
                    From Scores As S2
                    Where S2.ScoreStudentId = Scores.ScoresStudentId
                            And S2.ScoreAdvisor = Scores.ScoreAdvisor
                            And S2.ScoreCorrect = Scores.ScoreCorrect
                    Group By S2.ScoreStudentId, S2.ScoreAdvisor, S2.ScoreCorrect
                    Having Count(*) > 1
                        And Min(S2.PrimaryKeyColumn) = Scores.PrimaryKeyColumn
                    )
        And Scores.ScoreTest = 3284