Search code examples
c#.netlinq-to-sql

IQueryable<T> extension method to take data in batches


Has anyone found/coded extension method that query data (using linq to sql) in batches? I've seen IEnumerable extensions but I'm looking for something I might use like this:

IQueryable<Order> orders = from i in db.Orders select i;
foreach(var batch in orders.InBatches(100))
{
   //batch of 100 products
   foreach(var order in batch)
   {
      //do something
   }
}

Solution

  • What you can do is this:

    public static IEnumerable<T[]> InBatches<T>(this IQueryable<T> collection, int size)
    {
        List<T> batch = new List<T>(size);
    
        foreach (T item in collection)
        {
            if (batch.Count == size)
            {
                yield return batch.ToArray();
                batch.Clear();
            }
    
            batch.Add(item);
        }
    
        if (batch.Count > 0) yield return batch.ToArray();
    }
    

    This extension method loops through the IQueryable once (i.e. one single query to the database) and spits out batches as arrays in a streaming fashion. By streaming the results, it allows just a subset of the data to be kept in memory (which is interesting for very large data sets).

    Usage:

    IQueryable<Order> orders = from order in db.Orders select order;
    foreach (Order[] batch in orders.InBatches(size: 100))
    {
       // Batch of 100 products
       foreach(Order order in batch)
       {
          //do something
       }
    }