When working with a network database such as MySQL, the DbContext should be short lived, but according to https://www.entityframeworktutorial.net/EntityFramework4.3/persistence-in-entity-framework.aspx the DbContext can be long lived when working with a local database, such as SQLite.
My app is using a long lived DbContext to work with SQLite on HDD and I want to copy many-to-many entities to another DbContext for the same type of SQLite database on USB.
I am using the Code-First approach.
public class Student
{
public Student()
{
this.Courses = new HashSet<Course>();
}
public int StudentId { get; set; }
[Required]
public string StudentName { get; set; }
public virtual ICollection<Course> Courses { get; set; }
}
public class Course
{
public Course()
{
this.Students = new HashSet<Student>();
}
public int CourseId { get; set; }
public string CourseName { get; set; }
public virtual ICollection<Student> Students { get; set; }
}
DbContextHDD contains students StudentA, StudentB and StudentC and courses Course1, Course2 and Course3:
StudentA attends Course1 and Course3
StudentB attends Course2 and Course3
StudentC attends Course1 and Course2
DbContextUSB contains no students and no courses.
var courses = DbContextHDD.Courses.AsNoTracking();
List<Student> students = new List<Student>();
foreach(Course course in courses)
{
foreach(Student student in course.Students)
{
if(!students.Any(s => s.StudentId == student.StudentId))
{
students.Add(student);
}
}
}
Debug.WriteLine(students.Count); // output: 3
Debug.WriteLine(DbContextUSB.Students.Local.Count); // output: 0
DbContextUSB.Students.AddRange(students);
Debug.WriteLine(DbContextUSB.Students.Local.Count); // output: 4
DbContextUSB.SaveChanges(); // exception: UNIQUE constraint failed
DbContextUSB.Courses.AddRange(courses);
DbContextUSB.SaveChanges();
Why are there 4 students (3 unique and 1 duplicate) after I insert 3 unique students in to a DbSet with 0 students? What is the proper way to do this?
As I said, I am using a long lived DbContext because I am working with SQLite.
First, don't use AsNoTracking
:
var courses = DbContextHDD.Courses. ...
Second, Include
the required data:
var courses = DbContextHDD.Courses
.Include(c => c.Students)
.ToList();
Third, add the courses to the other context:
DbContextUSB.Courses.AddRange(courses);
DbContextUSB.SaveChanges();
You may not believe it, but in essence that's all!
One caveat is that you should disable proxy creation in the source context:
DbContextHDD.Configuration.ProxyCreationEnabled = false;
Otherwise EF creates proxy objects, which have a reference to the context they came from. They can't be attached to another context.
Another is that there may be students that don't attend courses. You'll miss them when querying courses. So you have to add them separately:
var lazyStudents = DbContextHDD.Students.Where(s => s.Courses.Count() == 0).ToList();
...
DbContextUSB.Students.AddRange(lazyStudents);
...
DbContextUSB.SaveChanges();
Why does this work?
Without tracking, Entity Framework can't detect that StudentA in
Course1 is the same student as in Course3. As a consequence, StudentA
in Course3 is a new Student
instance. You'll end up having 6 students, 3 duplicates (if there's no unique index on StudentName
preventing this). With tracking, EF does detect
that both courses have the same Student
instance.
When adding an entity to a context, EF also marks nested
entities as Added
when they're not yet attached to the context.
That's why it's enough to add courses only, and that's why EF doesn't
complain when courses contain the same student instances.
Since the added courses have their Students
collections properly populated, EF also inserts the required junction records in the StudentCourse
table. This didn't happen in your code (well maybe, or partly, see later).
Now why did you get 4 students?
Look at the courses:
Course1 StudentA*, StudentC*
Course2 StudentB*, StudentC
Course3 StudentA , StudentB
Because of AsNoTracking
all student are different instances, but only the marked* students are in students
because of how you add them. But here's the tricky part. Even with AsNoTracking()
, Entity Framework executes relationship fixup with related entities that are materialized in one query. That means that the foreach(Course course in courses)
loop produces courses with populated Students
collections of which each student has one course in its Courses
collection. It's almost impossible to keep track of what exactly happens, esp. because debugging also triggers lazy loading, but for sure, the line...
DbContextUSB.Students.AddRange(students);
also marks their nested courses and their students as Added
as far as they ended up being different instances. The end result in this case is that one more student instance is added to the cache. Also, a number of junction records was created but not necessarily the correct ones.
The conclusion is that EF is a great tool for cloning object graphs, but the graph must be populated correctly, the right relationships and no duplicates, and should be added in one go.