Search code examples
c#entity-frameworkforeachbackgroundworkerout-of-memory

OutOfMemoryException in a long running foreach loop


I have a procedure which uses 2 sql databases - this is required in order to migrate some data from DB1 to DB2. Number of records to be migrated is around 500k. To connect to these 2 databases I use EF with 2 edmx files. A BackgroundWorker is used to perform this operation.

The code is pretty straightforward, i.e.:

string userState = "(Current action description)";
int idx = 0;
foreach (var o in DB1.Records)
{
    if (((BackgroundWorker)sender.CancellationPending)
    {
        e.Cancel = true;
        break;
    }

    // Check if the current record already exists in the new database, update it if it does 
    // and create a new one if it doesn't.
    NewRecord rec = DB2.NewRecords.Where(x => x.IDFromDB1 == o.ID).SingleOrDefault() ?? DB2.NewRecords.Where(x => x.UniqueID == o.UniqueID).SingleOrDefault();
    if (rec == null)
        rec = new NewRecord();

    // Since the primary key in Records table of DB1 database consists of 2 columns,
    // where ID is the first and Idx is the second, in a way that Idx marks the revision
    // and we want to select only the latest revision, we order by "Idx" in descending order
    // and take the first record.
    Record obj = DB1.Records.Where(x => x.ID == o.ID).OrderByDescending(x => x.Idx).First(); // <-- System.OutOfMemoryException after around 250k records have been processed

    /*
        ... some primitive processing code ...
    */

    if (rec.EntityKey == null)
        DB2.NewRecords.AddObject(rec);

    idx++;
    if (idx % 50 == 0)
    {
        DB2.SaveChanges();
        ((BackgroundWorker)sender).ReportProgress(idx, userState);
    }
}

DB2.SaveChanges();

The procedure threw an exception after some 250k records have been processed.
Why? Is there a better way of doing this?

Stack trace

at System.String.Concat(String str0, String str1)
at System.Data.Metadata.Edm.EdmType.CreateEdmTypeIdentity(String namespaceName, String name)
at System.Data.Metadata.Edm.EdmType.BuildIdentity(StringBuilder builder)
at System.Data.Metadata.Edm.EdmType.get_Identity()
at System.Data.Metadata.Edm.TypeUsage.BuildIdentity(StringBuilder builder)
at System.Data.Metadata.Edm.CollectionType.GetIdentity(TypeUsage typeUsage)
at System.Data.Metadata.Edm.CollectionType..ctor(TypeUsage elementType)
at System.Data.Common.CommandTrees.ExpressionBuilder.Internal.ArgumentValidation.ValidateProject(DbExpressionBinding input, DbExpression projection)
at System.Data.Query.PlanCompiler.CTreeGenerator.CreateProject(RelOpInfo sourceInfo, IEnumerable`1 outputVars)
at System.Data.Query.PlanCompiler.CTreeGenerator.BuildProjection(Node relOpNode, IEnumerable`1 projectionVars)
at System.Data.Query.PlanCompiler.CTreeGenerator.Visit(PhysicalProjectOp op, Node n)
at System.Data.Query.InternalTrees.PhysicalProjectOp.Accept[TResultType](BasicOpVisitorOfT`1 v, Node n)
at System.Data.Query.InternalTrees.BasicOpVisitorOfT`1.VisitNode(Node n)
at System.Data.Query.PlanCompiler.CTreeGenerator..ctor(Command itree, Node toConvert)
at System.Data.Query.PlanCompiler.ProviderCommandInfoUtils.Create(Command command, Node node, List`1 children)
at System.Data.Query.PlanCompiler.CodeGen.Process(List`1& childCommands, ColumnMap& resultColumnMap, Int32& columnCount)
at System.Data.Query.PlanCompiler.PlanCompiler.Compile(List`1& providerCommands, ColumnMap& resultColumnMap, Int32& columnCount, Set`1& entitySets)
at System.Data.EntityClient.EntityCommandDefinition..ctor(DbProviderFactory storeProviderFactory, DbCommandTree commandTree)
at System.Data.EntityClient.EntityProviderServices.CreateCommandDefinition(DbProviderFactory storeProviderFactory, DbCommandTree commandTree)
at System.Data.EntityClient.EntityProviderServices.CreateDbCommandDefinition(DbProviderManifest providerManifest, DbCommandTree commandTree)
at System.Data.Common.DbProviderServices.CreateCommandDefinition(DbCommandTree com mandTree)
at System.Data.Objects.Internal.ObjectQueryExecutionPlan.Prepare(ObjectContext context, DbQueryCommandTree tree, Type elementType, MergeOption mergeOption, Span span, ReadOnlyCollection`1 compiledQueryParameters)
at System.Data.Objects.ELinq.ELinqQueryState.GetExecutionPlan(Nullable`1 forMergeOption)
at System.Data.Objects.ObjectQuery`1.GetResults(Nullable`1 forMergeOption)
at System.Data.Objects.ObjectQuery`1.System.Collections.Generic.IEnumerable<T>.GetEnumerator()
at System.Linq.Enumerable.First[TSource](IEnumerable`1 source)
at System.Data.Objects.ELinq.ObjectQueryProvider.<GetElementFunction>b__0[TResult](IEnumerable`1 sequence)
at System.Data.Objects.ELinq.ObjectQueryProvider.ExecuteSingle[TResult](IEnumerable`1 query, Expression queryRoot)
at System.Data.Objects.ELinq.ObjectQueryProvider.System.Linq.IQueryProvider.Execute[S](Expression expression)
at System.Linq.Queryable.First[TSource](IQueryable`1 source)
at MyProject.frmPrenosIzAnalitike._worker_DoWork(Object sender, DoWorkEventArgs e) in C:\Users\dejan\Documents\Visual Studio 2010\Projects\MySolution\MyProject\frmMyForm.cs:line 245
at System.ComponentModel.BackgroundWorker.OnDoWork(DoWorkEventArgs e)
at System.ComponentModel.BackgroundWorker.WorkerThreadStart(Object argument)

Solution

  • This is because you're adding new entities to the context all the time. You basically don't want to load 250K entities in memory.

    Just detach them when you don't need them anymore

    // Assuming DB1 is your object context
    Record obj = DB1.Records.Where(x => x.ID == o.ID).OrderByDescending(x => x.Idx).First();
    
    ...
    
    DB1.Detach(obj); // <------ Detaches from context, removes from memory
    

    Same thing for DB2 and created entities.