Search code examples
sqlsql-servert-sqlsql-delete

Got mired in a selective DELETE statement on a single table in t-SQL


Say, I have a table like this (SQL Server 2008):

CREATE TABLE tbl (ID INT, dtIn DATETIME2, dtOut DATETIME2, Type INT)

INSERT tbl VALUES
(1, '05:00', '6:00', 1),  -- will be removed
(2, '05:00', '7:00', 1),  -- will be removed
(3, '05:01', '8:00', 1),
(4, '05:00', '8:00', 1),
(5, '05:00', '6:00', 2),  -- will be removed
(6, '05:00', '7:00', 2),
(7, '05:00', '7:00', 3),
(8, '04:00', '7:00', 3)

I need to remove all records of the same 'type' (if 2 or more are found) with the same 'dtIn' for their 'type', except the one with the largest 'dtOut'. In other words, the table above should result in this:

(3, '05:01', '8:00', 1),   -- no matching 'dtIn' for 'type' = 1
(4, '05:00', '8:00', 1),   -- largest 'dtOut' for 'type' = 1
(6, '05:00', '7:00', 2),   -- largest 'dtOut' for 'type' = 2
(7, '05:00', '7:00', 3),   -- no matching 'dtIn' for 'type' = 3
(8, '04:00', '7:00', 3)    -- no matching 'dtIn' for 'type' = 4

How do you even select several rows with the equal type from the same table. You can't do select * from tbl where type=type.... Anyway, I'd appreciate some help with this...


Solution

  • Here's one way to select the rows you want to delete:

    SELECT *
    FROM tbl T1
    WHERE EXISTS
    (
        SELECT *
        FROM tbl T2
        WHERE T1.Type = T2.Type
        AND T1.dtIn = T2.dtIn
        AND (
                T1.dtOut < T2.dtOut
                OR (T1.dtOut = T2.dtOut AND T1.id < T2.id)
            )
    )
    

    This query can also be easily changed to actually delete the rows. Just change SELECT * to DELETE T1. But please do test that it does what you want before actually running the delete statement.

    See it working online: sqlfiddle


    Update

    Here's an approach using ROW_NUMBER:

    ;WITH T1 AS (
        SELECT id, ROW_NUMBER() OVER (PARTITION BY Type, dtIn
                                      ORDER BY dtOut DESC, ID DESC) AS rn
        FROM tbl
    )
    SELECT * FROM tbl
    WHERE id IN
    (
        SELECT id
        FROM T1
        WHERE rn > 1
    )
    

    See it working online: sqlfiddle