I have a multi-million result query in SQL Server whose result I need to drop into new table. Using INTO dbo.new_table
does the job but when record count goes in millions, it takes hours.
Is there a way of batch/loop data load with SELECT INTO dbo.new_table
?
--15 million records returned
--40 columns
SELECT
col,
col,
col,
col_n
INTO dbo.new_table
FROM tbl LEFT JOIN tbl_a, LEFT JOIN tbl_b LEFT JOIN tbl_n
ORDER BY 1
--would it be possible to FETCH/LIMIT batch while using INTO?
Statistics IO (done for 100 records as 15 million result takes almost 2 hours)
Total logical reads = 2,641
(100 rows affected)
Table 'one'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'two'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'three'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'four'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'five'. Scan count 0, logical reads 415, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'six'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'seven'. Scan count 0, logical reads 415, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'eight'. Scan count 1, logical reads 300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'nine'. Scan count 1, logical reads 11, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Set statistics Time:
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.
SQL Server parse and compile time:
CPU time = 12 ms, elapsed time = 12 ms.
(100 rows affected)
Please check the execution plan if there is area to improve like creating indexes.
But if you want to use loop to batch insert you can try below example:
DECLARE @id_control INT
DECLARE @batchSize INT
DECLARE @results INT
SET @results = 1
SET @batchSize = 1000000
SET @id_control = 0
WHILE (@results > 0)
BEGIN
-- put your custom code here
insert INTO dbo.new_table
SELECT
col,
col,
col,
col_n
FROM tbl LEFT JOIN tbl_a LEFT JOIN tbl_b LEFT JOIN tbl_n
idcol <= @id_control + @batchSize
ORDER BY idcol
-- very important to obtain the latest rowcount to avoid infinite loops
SET @results = @@ROWCOUNT
-- next batch
SET @id_control = @id_control + @batchSize
END
Instead of idcol
you need to use the unique column name on which you are ordering the data.