Remove duplicates from list based on multiple fields or columns


I have a list of MyClass objects:

public class MyClass
{
   public string prop1 { get; set; }
   public int prop2 { get; set; }
   public string prop3 { get; set; }
   public int prop4 { get; set; }
   public string prop5 { get; set; }
   public string prop6 { get; set; }
   // ....
}

This list will contain duplicates. I want to find and remove the items where prop1, prop2 and prop3 all match those of another item. It doesn't matter whether the other properties match.

This is what I have tried, but it is not working:

List<MyClass> noDups = myClassList.GroupBy(d => new {d.prop1,d.prop2,d.prop3} ).Where(g => g.Count() > 1).Select(g=> g.Key);

I don't want to use any third-party tools for this, only pure LINQ.


Solution

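  • Your attempt keeps only the groups that occur more than once (Count() > 1) and then selects the group keys, which are anonymous objects rather than MyClass instances. The following instead returns one item for each distinct combination of prop1, prop2 and prop3, like a Distinct (so if you have A, A, B, C it will return A, B, C):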

    List<MyClass> noDups = myClassList.GroupBy(d => new {d.prop1,d.prop2,d.prop3} )
                                      .Select(d => d.First())
                                      .ToList();
    

    If you want only the elements that don't have a duplicate (so if you have A, A, B, C it will return B, C):

    List<MyClass> noDups = myClassList.GroupBy(d => new {d.prop1,d.prop2,d.prop3} )
                                      .Where(d => d.Count() == 1)
                                      .Select(d => d.First())
                                      .ToList();
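
To see the two queries above side by side, here is a minimal, self-contained sketch. The `Dedup` helper class, its method names, and the sample data are made up for illustration and are not part of the original question; `MyClass` is trimmed to four properties. Both queries rely on anonymous types comparing by value, so two keys are equal when their prop1, prop2 and prop3 are equal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public string prop1 { get; set; } = "";
    public int prop2 { get; set; }
    public string prop3 { get; set; } = "";
    public int prop4 { get; set; }   // not part of the duplicate key
}

// Hypothetical helper class, introduced only for this example.
public static class Dedup
{
    // One item per distinct (prop1, prop2, prop3); keeps the first occurrence.
    public static List<MyClass> FirstOfEach(IEnumerable<MyClass> source) =>
        source.GroupBy(d => new { d.prop1, d.prop2, d.prop3 })
              .Select(g => g.First())
              .ToList();

    // Only the items whose (prop1, prop2, prop3) occurs exactly once.
    public static List<MyClass> UniqueOnly(IEnumerable<MyClass> source) =>
        source.GroupBy(d => new { d.prop1, d.prop2, d.prop3 })
              .Where(g => g.Count() == 1)
              .Select(g => g.First())
              .ToList();
}

public static class Program
{
    public static void Main()
    {
        // Sample data: the first two items share prop1, prop2 and prop3.
        var myClassList = new List<MyClass>
        {
            new MyClass { prop1 = "A", prop2 = 1, prop3 = "x", prop4 = 10 },
            new MyClass { prop1 = "A", prop2 = 1, prop3 = "x", prop4 = 20 },
            new MyClass { prop1 = "B", prop2 = 2, prop3 = "y", prop4 = 30 },
            new MyClass { prop1 = "C", prop2 = 3, prop3 = "z", prop4 = 40 },
        };

        var firstOfEach = Dedup.FirstOfEach(myClassList);
        var uniqueOnly = Dedup.UniqueOnly(myClassList);

        // prints "A:10, B:30, C:40"
        Console.WriteLine(string.Join(", ", firstOfEach.Select(d => d.prop1 + ":" + d.prop4)));
        // prints "B:30, C:40"
        Console.WriteLine(string.Join(", ", uniqueOnly.Select(d => d.prop1 + ":" + d.prop4)));
    }
}
```

If you are on .NET 6 or later, `myClassList.DistinctBy(d => new { d.prop1, d.prop2, d.prop3 }).ToList()` should give the same result as the first query, since `DistinctBy` also keeps the first element seen for each key.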