Search code examples
c#entity-frameworksqliteentity-framework-6many-to-many

Entity Framework 6.2 copy many to many from one DbContext to another DbContext


When working with a network database such as MySQL, the DbContext should be short lived, but according to https://www.entityframeworktutorial.net/EntityFramework4.3/persistence-in-entity-framework.aspx the DbContext can be long lived when working with a local database, such as SQLite.

My app is using a long lived DbContext to work with SQLite on HDD and I want to copy many-to-many entities to another DbContext for the same type of SQLite database on USB.

I am using the Code-First approach.

public class Student
{
    public Student() 
    {
        this.Courses = new HashSet<Course>();
    }

    public int StudentId { get; set; }
    [Required]
    public string StudentName { get; set; }

    public virtual ICollection<Course> Courses { get; set; }
}

public class Course
{
    public Course()
    {
        this.Students = new HashSet<Student>();
    }

    public int CourseId { get; set; }
    public string CourseName { get; set; }

    public virtual ICollection<Student> Students { get; set; }
}

DbContextHDD contains students StudentA, StudentB and StudentC and courses Course1, Course2 and Course3:

StudentA attends Course1 and Course3
StudentB attends Course2 and Course3
StudentC attends Course1 and Course2

DbContextUSB contains no students and no courses.

var courses = DbContextHDD.Courses.AsNoTracking();
List<Student> students = new List<Student>();
foreach(Course course in courses)
{
    foreach(Student student in course.Students)
    {
        if(!students.Any(s => s.StudentId == student.StudentId))
        {
            students.Add(student);
        }
    }
}
Debug.WriteLine(students.Count); // output: 3

Debug.WriteLine(DbContextUSB.Students.Local.Count); // output: 0
DbContextUSB.Students.AddRange(students);
Debug.WriteLine(DbContextUSB.Students.Local.Count); // output: 4
DbContextUSB.SaveChanges(); // exception: UNIQUE constraint failed

DbContextUSB.Courses.AddRange(courses);
DbContextUSB.SaveChanges();

Why are there 4 students (3 unique and 1 duplicate) after I insert 3 unique students in to a DbSet with 0 students? What is the proper way to do this?

As I said, I am using a long lived DbContext because I am working with SQLite.


Solution

  • First, don't use AsNoTracking:

    var courses = DbContextHDD.Courses. ...
    

    Second, Include the required data:

    var courses = DbContextHDD.Courses
        .Include(c => c.Students)
        .ToList();
    

    Third, add the courses to the other context:

    DbContextUSB.Courses.AddRange(courses);
    DbContextUSB.SaveChanges();
    

    You may not believe it, but in essence that's all!

    One caveat is that you should disable proxy creation in the source context:

    DbContextHDD.Configuration.ProxyCreationEnabled = false;
    

    Otherwise EF creates proxy objects, which have a reference to the context they came from. They can't be attached to another context.

    Another is that there may be students that don't attend courses. You'll miss them when querying courses. So you have to add them separately:

    var lazyStudents = DbContextHDD.Students.Where(s => s.Courses.Count() == 0).ToList();
    ...
    DbContextUSB.Students.AddRange(lazyStudents);
    ...
    DbContextUSB.SaveChanges();
    

    Why does this work?

    • Without tracking, Entity Framework can't detect that StudentA in Course1 is the same student as in Course3. As a consequence, StudentA in Course3 is a new Student instance. You'll end up having 6 students, 3 duplicates (if there's no unique index on StudentName preventing this). With tracking, EF does detect that both courses have the same Student instance.

    • When adding an entity to a context, EF also marks nested entities as Added when they're not yet attached to the context. That's why it's enough to add courses only, and that's why EF doesn't complain when courses contain the same student instances.

    • Since the added courses have their Students collections properly populated, EF also inserts the required junction records in the StudentCourse table. This didn't happen in your code (well maybe, or partly, see later).

    Now why did you get 4 students?

    Look at the courses:

    Course1 StudentA*, StudentC*
    Course2 StudentB*, StudentC
    Course3 StudentA , StudentB
    

    Because of AsNoTracking all student are different instances, but only the marked* students are in students because of how you add them. But here's the tricky part. Even with AsNoTracking(), Entity Framework executes relationship fixup with related entities that are materialized in one query. That means that the foreach(Course course in courses) loop produces courses with populated Students collections of which each student has one course in its Courses collection. It's almost impossible to keep track of what exactly happens, esp. because debugging also triggers lazy loading, but for sure, the line...

    DbContextUSB.Students.AddRange(students);
    

    also marks their nested courses and their students as Added as far as they ended up being different instances. The end result in this case is that one more student instance is added to the cache. Also, a number of junction records was created but not necessarily the correct ones.

    The conclusion is that EF is a great tool for cloning object graphs, but the graph must be populated correctly, the right relationships and no duplicates, and should be added in one go.