What would be the most effective way to open/use a SQL Server connection if we're reading rows to be deleted in batches?
foreach(IEnumerable<Log> logsPage in LogsPages)
{
foreach(Log logEntry in logsPage)
{
// 1. get associated filenames
// 2. delete row
// 3. try delete each file
}
}
Should we let Dapper open connections on each step of the foreach
loop? I suppose SQL Server connection pooling takes place here?
Or should we open an explicit connection per batch?
If you're performing multiple database operations in a tight loop, it would usually be preferable to open the connection for the duration of all the operations. Returning the connection to the pool can be beneficial in contested systems where there can be an indeterminate interval before the next database operation, but if you're doing lots of sequential operations: constantly fetching and returning connections from the pool (and executing sp_reset_connection
, which happens behind the scenes) add overhead for no good reason.
So to be explicit, I'd have the Open[Async]()
here above the first foreach
.
Note: for batching, you might find that there are ways of doing this with fewer round-trips, in particular making use of the IN
re-writing in Dapper based on the ids. Since you mention SQL-Server, This can be combined with setting a SqlMapper.Settings.InListStringSplitCount
to something positive (5, 10, etc are reasonable choices; note that this is a global setting); for example, for a simple scenario:
connection.Execute("delete from Foo where Id in @ids",
new { ids = rows.Select(x => x.Id) });
is much more efficient than:
foreach (var row in rows)
{
connection.Execute("delete from Foo where Id = @id",
new { id = row.Id });
}
Without InListStringSplitCount
, the first version will be re-written as something like:
delete from Foo where Id in (@ids0, @ids1, @ids2, ..., @idsN)
With InListStringSplitCount
, the first version will be re-written as something like:
delete from Foo where Id in (select cast([value] as int) from string_split(@ids,','))
which allows the exact same query to be used many times, which is good for query-plan re-use.