Search code examples
sql-serverinsertrowcount

Move SQL Server data in limited (1000 row) chunks


I'm writing a process that archives rows from a SQL Server table based on a datetime column. I want to move all the rows with a date before X, but the problem is that there are millions of rows for each date, so doing a BEGIN TRANSACTION...INSERT...DELETE...COMMIT for each date takes too long, and locks up the database for other users.

Is there a way that I can do it in smaller chunks? Maybe using ROWCOUNT or something like that?

I'd originally considered something like this:

SET ROWCOUNT 1000

DECLARE @RowsLeft DATETIME
DECLARE @ArchiveDate DATETIME

SET @ROWSLEFT = (SELECT TOP 1 dtcol FROM Events WHERE dtcol <= @ArchiveDate)

WHILE @ROWSLEFT IS NOT NULL
BEGIN

    INSERT INTO EventsBackups
    SELECT top 1000 * FROM Events

    DELETE Events

    SET @ROWSLEFT = (SELECT TOP 1 dtcol FROM Events WHERE dtcol <= @ArchiveDate)

END

But then I realized that I can't guarantee that the rows I'm deleting are the ones I just backed up. Or can I...?

UPDATE: Another options I'd considered was adding a step:

  1. SELECT TOP 1000 rows that meet my date criteria into a temp table
  2. Begin Transaction
  3. Insert from temp table into archive table
  4. Delete from source table, joining to temp table across every column
  5. Commit transaction
  6. Repeat 1-5 until no rows remain that meet the date criteria

Does anybody have an idea for how the expense of this series might compare to some of the other options discussed below?

DETAIL: I'm using SQL 2005, since somebody asked.


Solution

  • Just INSERT the result of the DELETE:

    WHILE 1=1
    BEGIN
    
        WITH EventsTop1000 AS (
        SELECT TOP 1000 * 
            FROM Events
          WHERE <yourconditionofchoice>)
        DELETE EventsTop1000
            OUTPUT DELETED.* 
            INTO EventsBackup;
    
        IF (@@ROWCOUNT = 0)
            BREAK;
    END
    

    This is atomic and consistent.