Search code examples
sqlsql-serversql-server-2008archive

Anyway to speed up the archiving of a large table


At midnight, I archive a SQL Server 2008 table by doing this in a stored procedure:

INSERT INTO archive(col1, col2....)
select col1, col2...
from tablename
where date <= @endDate

delete from tablename where date <= @enddate

Here is the table schema. I've changed the column names obviously. The archive table is exactly the same structure.

[col1] [uniqueidentifier] NOT NULL,
[col1] [bigint] NOT NULL,
[col1] [nvarchar](255) NOT NULL,
[col1] [nvarchar](255) NOT NULL,
[col1] [datetime] NOT NULL,
[col1] [nvarchar](75) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](50) NULL,
[col1] [nvarchar](50) NULL,
[col1] [nvarchar](1000) NULL,
[col1] [nvarchar](2) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](255) NULL,

The table typically has about 100,000 - 150,0000 rows with several indexes and is still having information written to it while I'm trying to perform this archive.

This process takes at the fastest, six minutes, and the slowest, 13 minutes.

Is there a faster way of doing this?


Solution

  • Partitioning is the fastest technique, but adds complexity and requires Enterprise Edition.

    An alternate approach is to combine the DELETE and the INSERT into one statement by using the OUTPUT clause. http://msdn.microsoft.com/en-us/library/ms177564.aspx. A DELETE with an OUTPUT clause is faster than individual INSERT/DELETE statements.

    DELETE FROM tablename 
        OUTPUT DELETED.Col1, DELETED.col2, DELETED.col3 DELETED.col4 -- etc
        INTO archive ( col1, col2, col3, col4 )
     WHERE date <= @enddate;
    

    If you have issues with blocking due to the concurrent inserts, then you can batch the above statement by doing a loop:

    DECLARE @i int
    SET @i = 1 
    WHILE @i > 0
    BEGIN
        DELETE  top (1000) FROM tablename 
            OUTPUT DELETED.Col1, DELETED.col2, DELETED.col3 DELETED.col4 -- Eric
            INTO archive ( col1, col2, col3, col4 )
        WHERE date <= @enddate  
        SET @i = @@rowcount
    END    
    

    Additional note: There are a few restrictions for the output table. It can't have triggers, be involved in foreign keys or have check constraints.