Search code examples
c#performanceentity-frameworksql-server-2008

How do I speed up DbSet.Add()?


I have to import about 30k rows from a CSV file to my SQL database, this sadly takes 20 minutes.

Troubleshooting with a profiler shows me that DbSet.Add is taking the most time, but why?

I have these Entity Framework Code-First classes:

public class Article
{
    // About 20 properties, each property doesn't store excessive amounts of data
}

public class Database : DbContext
{
    public DbSet<Article> Articles { get; set; }
}

For each item in my for loop I do:

db.Articles.Add(article);

Outside the for loop I do:

db.SaveChanges();

It's connected with my local SQLExpress server, but I guess there isn't anything written till SaveChanges is being called so I guess the server won't be the problem....


Solution

  • Each item in a unit-of-work has overhead, as it must check (and update) the identity manager, add to various collections, etc.

    The first thing I would try is batching into, say, groups of 500 (change that number to suit), starting with a fresh (new) object-context each time - as otherwise you can reasonably expect telescoping performance. Breaking it into batches also prevents a megalithic transaction bringing everything to a stop.

    Beyond that; SqlBulkCopy. It is designed for large imports with minimal overhead. It isn't EF though.