Search code examples
c#asp.net-coreconcurrencyentity-framework-coreparallel.foreach

How to update a elements list in parallel loop with Entity Framework Core 6.0


I have to update a list with more than 60,000 elements. I want to do it using Parallel Loop but i'm getting this error.

Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct. Source=System.Private.CoreLib

this is my code

using (var context = new ApplicationDbContext())
{
    var itemList = await context.Items.ToListAsync();

    Parallel.ForEach(itemList, (item) =>
    {
        if (Helper.checkList.Contains(item.Link))
        {
            item.StatusId = (int)Emuns.Status.InStock;
        }
        else
        {
            item.StatusId = (int)Emuns.Status.OutStock;
        }

        context.Update(item);
    });

    context.SaveChanges();
}

I'm using Entity Framwork Core 6.0


Solution

  • Remove the context.Update(item) from the ForEach. It isn't required when using tracked instances. This should resolve the error in this particular scenario.

    Overall this isn't a particularly efficient way to apply the changes given you are loading all entities of that type into memory to then update based on a checklist value. Alternative approaches for a bulk operations like this would be to process a list of instock items against the database directly behind the scenes. (I.e. background job or SSIS etc.) where the contents of the list update can be recorded and applied so if there are any future questions about when/where particular products were marked in/out of stock there is an audit trail. Alternatively you could query values explicitly then iterate over the values that need their status changed:

    var itemsToMarkInStock = await context.Items
        .Where(x => x.Status != (int)Enums.Status.Instock
            && Helper.checklist.Contains(x.Link))
        .ToListAsync();
    var itemsToMarkOutOfStock = await context.Items
        .Where(x => x.Status == (int)Enums.Status.Instock
            && !Helper.checklist.Contains(x.Link))
        .ToListAsync();
    
    foreach(var item in itemsToMarkInstock)
         item.Status = (int)Enums.Status.Instock;
    foreach(var item in itemsToMarkOutOfStock)
         item.Status = (int)Enums.Status.OutStock;
    
    await context.SavchangesAsync();
    

    Provided the # of items being updated is expected to always be rather manageable in size. As a system grows this could get prohibitively expensive.

    In other situations where you want parallel operations that access data in parallel you would want to contain the scope of the DbContext within the parallel operation to avoid exceptions about cross-thread access against a DbContext:

    I.e.

    Parallel.ForEach(itemList, (item) =>
    {
       using (var context = new AppDbContext())
       {
            //operations
       }
    });
    

    This does mean that you cannot do something like load a tracked entity set outside of the parallel implementation then update those entities with references loaded by the DbContext scoped within the Parallel operation. To satisfy dependency injection for something like this, use an injectable DbContextScope or DbContextFactory pattern to provide the DbContext instances to use within the Parallel operation.