
Distinct() method does not work?


I tried to use Distinct() to filter my collection and prevent duplicates, but my LINQ query still adds the same values to the list.

Thanks in advance.

    public ObservableCollection<string> CollectTopicsFromXml()
    {
        ObservableCollection<string> oc = new ObservableCollection<string>();
        XDocument xDoc = XDocument.Load(path);
        var topicColl = xDoc.Descendants("topic").Distinct();

        foreach (var topic in topicColl)
        {
            oc.Add(topic.Value);
        }

        return oc;
    }

Solution

  • By default, Distinct compares items with the default equality comparer, which falls back to reference equality unless the item type overrides Equals (and GetHashCode). XElement does not override Equals, so every element counts as "distinct" regardless of its contents.

    If you want distinct elements by Name or some other property (or combination of properties) you have a few options:

    • Project the elements to an anonymous type which does implement value equality by default:

      var topicColl = xDoc.Descendants("topic")
                          .Select(e => new {e.Name, e.Value})
                          .Distinct();
      
    • Use GroupBy, which takes a key selector expression, and keep the first element of each group.

    • Create a class that implements IEqualityComparer<XElement> in the way that you want and pass an instance of it to Distinct.
    • Use DistinctBy from MoreLINQ (also built into System.Linq since .NET 6), which takes a key selector expression.
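
The GroupBy and IEqualityComparer<XElement> options listed above could look roughly like the sketch below. It assumes "duplicate" means "same text content" (the element's Value). The inline sample XML and the ElementValueComparer and Demo names are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical comparer: two elements are equal when their text content matches.
sealed class ElementValueComparer : IEqualityComparer<XElement>
{
    public bool Equals(XElement x, XElement y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Value == y.Value;
    }

    // Must be consistent with Equals: equal values give equal hash codes.
    public int GetHashCode(XElement obj) => obj.Value.GetHashCode();
}

static class Demo
{
    static void Main()
    {
        // Made-up sample document; the question loads its XML from a file instead.
        XDocument xDoc = XDocument.Parse(
            "<root><topic>A</topic><topic>B</topic><topic>A</topic></root>");

        // Plain Distinct compares references, so all three elements survive.
        int byReference = xDoc.Descendants("topic").Distinct().Count();

        // GroupBy on the text content, then keep the first element of each group.
        List<string> viaGroupBy = xDoc.Descendants("topic")
            .GroupBy(e => e.Value)
            .Select(g => g.First().Value)
            .ToList();

        // Distinct with the custom comparer.
        List<string> viaComparer = xDoc.Descendants("topic")
            .Distinct(new ElementValueComparer())
            .Select(e => e.Value)
            .ToList();

        Console.WriteLine(byReference);        // 3
        Console.WriteLine(viaGroupBy.Count);   // 2
        Console.WriteLine(viaComparer.Count);  // 2
    }
}
```

Both variants keep working with XElement instances, which helps if you need more than the text later on. The comparer is more code, but it can be reused wherever the same notion of equality applies.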