Search code examples
sqlsql-serverdatabaset-sql

Select into by batches in SQL Server


I have a multi-million result query in SQL Server whose result I need to drop into new table. Using INTO dbo.new_table does the job but when record count goes in millions, it takes hours.

Is there a way of batch/loop data load with SELECT INTO dbo.new_table?

--15 million records returned
--40 columns
SELECT
 col,
 col,
 col,
 col_n
INTO dbo.new_table
FROM tbl LEFT JOIN tbl_a, LEFT JOIN tbl_b LEFT JOIN tbl_n
ORDER BY 1

--would it be possible to FETCH/LIMIT batch while using INTO?

Statistics IO (done for 100 records as 15 million result takes almost 2 hours)

Total logical reads = 2,641

(100 rows affected)
Table 'one'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'two'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'three'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'four'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'five'. Scan count 0, logical reads 415, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'six'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'seven'. Scan count 0, logical reads 415, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'eight'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'nine'. Scan count 1, logical reads 11, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Set statistics Time:

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 2 ms.
SQL Server parse and compile time: 
   CPU time = 12 ms, elapsed time = 12 ms.

(100 rows affected)

Solution

  • Please check the execution plan if there is area to improve like creating indexes.

    But if you want to use loop to batch insert you can try below example:

        DECLARE @id_control INT
        DECLARE @batchSize INT
        DECLARE @results INT
        
        SET @results = 1
        SET @batchSize = 1000000
        SET @id_control = 0
        
        WHILE (@results > 0)
        BEGIN
           -- put your custom code here
    insert INTO dbo.new_table
     SELECT
     col,
     col,
     col,
     col_n
    
    FROM tbl LEFT JOIN tbl_a LEFT JOIN tbl_b LEFT JOIN tbl_n
    idcol <= @id_control + @batchSize
    ORDER BY idcol
    
           
           -- very important to obtain the latest rowcount to avoid infinite loops
           SET @results = @@ROWCOUNT
        
           -- next batch
           SET @id_control = @id_control + @batchSize
        END
    

    Instead of idcol you need to use the unique column name on which you are ordering the data.