Search code examples
sqlsql-serversql-server-2000

Delete rows per user and eventType keeping N rows


I want to delete the oldest entries in a table and keep N rows. This is fairly simple to do.

DELETE TOP(1000) from TABLE ORDER BY [date] DESC

But I want to delete rows based on User and EventType. So if we set N=50, I want to keep the newest 50 records per User and EventType.

My Table looks like this:

API_eventLog
- uid (PK, int, not null)
- eventTypeID (FK, int, not null)
- userGUID (FK, uniqueIdentifier, not null)
- date (datetime, not null)
- ...

There is a similar question already on SO that has the following answer, but unfortunately it is for SQL Server 2005.

;WITH Dealers AS (
    SELECT *,
    ROW_NUMBER() OVER(PARTITION BY DealerID ORDER BY SomeTimeStamp DESC) RowID
    FROM MyDealersTable
)
DELETE 
FROM Dealers
WHERE RowID > 50

Is there a solution to this in SQL server 2000 with good performance? All I can think of myself is a cursor-based solution but that is way to slow run be executed frequently.

Example Data:

[uid]   [EventTypeID]   [userGUID]  [date]
1   1   5B1DCB9D-4EC7-4AAE-BEB1-DC1EA90EA06B    2013-11-17
2   2   5B1DCB9D-4EC7-4AAE-BEB1-DC1EA90EA06B    2013-11-17
3   3   5B1DCB9D-4EC7-4AAE-BEB1-DC1EA90EA06B    2013-11-18
4   4   5B1DCB9D-4EC7-4AAE-BEB1-DC1EA90EA06B    2013-11-18
5   1   5B1DCB9D-4EC7-4AAE-BEB1-DC1EA90EA06B    2013-11-19
6   1   5B1DCB9D-4EC7-4AAE-BEB1-DC1EA90EA06B    2013-11-22
7   1   17941D18-CC79-4C29-BBBA-9CBE60993E43    2013-11-06
8   2   17941D18-CC79-4C29-BBBA-9CBE60993E43    2013-11-17
9   3   17941D18-CC79-4C29-BBBA-9CBE60993E43    2013-12-01
10  2   17941D18-CC79-4C29-BBBA-9CBE60993E43    2013-12-07
11  2   17941D18-CC79-4C29-BBBA-9CBE60993E43    2013-12-18
11  1   17941D18-CC79-4C29-BBBA-9CBE60993E43    2013-12-20

In above example, given N=2, I would like to delete row with [uid] 1 and 8. (ie keeping the 2 newest rows per User and EventTypeID.


Solution

  • The following query should simulate the ROW_NUMBER function in SQL Server 2000.

    DELETE API_eventLog
    WHERE id IN
      (SELECT 
        id
       FROM API_eventLog AS a1
       WHERE
         (SELECT COUNT(1)
           FROM API_eventLog AS a2
           WHERE a2.userguid = a1.userguid
           AND a2.eventTypeID = a1.eventTypeID
           AND (a2.date > a1.date)) > 0); 
    SELECT * FROM API_eventLog;
    

    Replace the 0 by 50 or whatever number or rows you want to keep.

    I can't test it in SQL Server 2000, but here is a working example in SQL Server 2008.

    EDIT:

    Seems like I made a small mistake. The date check should be greater than instead of less than if you want to keep the newest rows. I updated my example.