
Finding items with exact subitems in two lists using LINQ


I have two lists of students who take courses in a school. My classes look like this:

public class Student
{
   public Guid Id { get; set; }
   public string Name { get; set; }
   public List<Course> Courses { get; set; } = new();
}

And the Course looks like this:

public class Course
{
   public Guid Id { get; set; }
   public string Name { get; set; }
}

I need to find students from both List A and List B who take the exact same courses.

Say I have John in List A, who takes English, Algebra and Chemistry. I want to find all students in List B who also take English, Algebra and Chemistry. If a student in List B, say James, takes these three courses plus an additional course, he would NOT qualify. The courses they take must match exactly.

And I'll be doing this for every student in List A.

I'm not sure how to handle the exact match of courses using LINQ. I thought about using Contains, but I need to make sure that ALL the courses the student from List A is taking match the ones the student from List B is taking, with no extras on either side.

I'm open to using nested foreach loops, but I think LINQ could handle it too and possibly perform better.

I'd appreciate some pointers on this.


Solution

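  • It seems that you are looking for GroupJoin: for each student from listA we want to get the group of students from listB with the same courses. The join key is the student's course ids, sorted and joined into a single string, so two students share a key only when they take exactly the same courses: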

    var result = listA
      .GroupJoin(listB,
         a => string.Join(",", a.Courses.Select(course => course.Id).OrderBy(g => g)),
         b => string.Join(",", b.Courses.Select(course => course.Id).OrderBy(g => g)),
        (a, bs) => (student: a, list: bs.ToList()))
      .Where(group => group.list.Count > 0)
      .ToList(); // materialize the result as a list
    

    For instance, for

      Course algebra = new Course() { Id = Guid.NewGuid(), Name = "Algebra" };
      Course geometry = new Course() { Id = Guid.NewGuid(), Name = "Geometry" };
      Course arithmetics = new Course() { Id = Guid.NewGuid(), Name = "Arithmetics" };
    
      List<Student> listA = new List<Student>() {
        new Student() { Id = Guid.NewGuid(), Name = "John", Courses = new List<Course>() { algebra, geometry} },
        new Student() { Id = Guid.NewGuid(), Name = "Anna", Courses = new List<Course>() { algebra } }
      };
    
      List<Student> listB = new List<Student>() {
        new Student() { Id = Guid.NewGuid(), Name = "Jack", Courses = new List<Course>() { algebra, geometry} },
        new Student() { Id = Guid.NewGuid(), Name = "Jill", Courses = new List<Course>() { algebra, geometry} },
        new Student() { Id = Guid.NewGuid(), Name = "Natasha", Courses = new List<Course>() { algebra, geometry, arithmetics} }
      };
    

    If we run the query above and print the result with

    var report = result
      .Select(group => $"{group.student.Name} : {string.Join(", ", group.list.Select(s => s.Name))}");
    
    Console.Write(string.Join(Environment.NewLine, report));
    

    We'll get

    John : Jack, Jill
    

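As a possible variation on the string key, the same GroupJoin can compare course ids as sets. The sketch below assumes .NET Core 2.0 or later (for ToHashSet) and uses HashSet<Guid>.CreateSetComparer() as the key comparer. The ExactCourseMatch class and its Find and CourseIds methods are hypothetical names introduced for illustration, not part of the question. Note one difference: a set ignores duplicates, so a student who lists the same course twice is treated like one who lists it once, whereas the string key would tell them apart.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Course
{
    public Guid Id { get; set; }
    public string Name { get; set; } = "";
}

public class Student
{
    public Guid Id { get; set; }
    public string Name { get; set; } = "";
    public List<Course> Courses { get; set; } = new();
}

// Hypothetical helper class, named here only for illustration.
public static class ExactCourseMatch
{
    // For each student in listA, collects the students in listB whose
    // set of course ids is equal; students without a match are dropped.
    public static List<(Student student, List<Student> list)> Find(
        List<Student> listA, List<Student> listB) =>
        listA
            .GroupJoin(listB,
                a => CourseIds(a),
                b => CourseIds(b),
                (a, bs) => (student: a, list: bs.ToList()),
                // Set comparer: equality and hash code ignore element order.
                HashSet<Guid>.CreateSetComparer())
            .Where(group => group.list.Count > 0)
            .ToList();

    // The join key: the student's course ids as a set.
    private static HashSet<Guid> CourseIds(Student s) =>
        s.Courses.Select(course => course.Id).ToHashSet();
}
```

Calling ExactCourseMatch.Find(listA, listB) with the sample lists above should again pair John with Jack and Jill, and leave out Anna and Natasha.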