Long story short, I took over a project and a table in the database is in serious need of de-duping. The table looks like this:
supply_req_id | int | [primary key]
supply_req_dt | datetime |
request_id | int | [foreign key]
supply_id | int | [foreign key]
is_disabled | bit |
The duplication is exists with records having the same request_id and supply_id. I'd like to find a best practice way to de-dupe this table.
[EDIT]
@Kirk_Broadhurst, thanks for the question. Since supply_req_id is not referenced anywhere else, I would answer by saying keep the first, delete any subsequent occurances.
Happy Holidays
This creates a rank for each row in the (supply_req_dt, request_id) grouping, starting with 1 = lowest supply_req_id. Any dupe has a value > 1
;WITH cDupes AS
(
SELECT
supply_req_id,
ROW_NUMBER() OVER (PARTITION BY supply_req_dt, request_id ORDER BY supply_req_id) AS RowNum
FROM
MyTable
)
DELETE
cDupes
WHERE
RowNum > 1
Then add a unique constraint or INDEX
CREATE UNIQUE INDEX IXU_NoDupes ON MyTable (supply_req_dt, request_id)