Search code examples
c#arrayslinqexcept

Using Linq Except with two lists of int arrays


Is it possible to use except with two lists of int arrays, like so:

List<int[]> a = new List<int[]>(){ new int[]{3,4,5}, new int[]{7,8,9}, new int[]{10,11,12}  };

List<int[]> b = new List<int[]>(){ new int[]{6,7,9}, new int[]{3,4,5}, new int[]{10,41,12}  };

var c = a.Except(b);

and exepecting {3,4,5} to be absent of the enumerable c? Of course I tried and this one is not working. Is there a solution as efficient as Except? Or even better, faster?


Solution

  • In .NET, arrays are only equal to another if they are the exact same array object. So two distinct arrays which have the same contents are not considered equal:

    int[] x = new int[] { 1, 2 };
    int[] y = new int[] { 1, 2 };
    Console.WriteLine(x == y); // false
    

    In order to check the equality based on the contents, you can use Enumerable.SequenceEqual:

    Console.WriteLine(x.SequenceEqual(y)); // true
    

    Of course that doesn’t help you directly when trying to use Enumerable.Except, since by default that will use the default equality comparer which only checks for equality (and since every array is inequal to every other array except itself…).

    So the solution would be to use the other overload, and provide a custom IEqualityComparer which compares the arrays based on their content.

    public class IntArrayEqualityComparer : IEqualityComparer<int[]>
    {
        public bool Equals(int[] a, int[] b)
        {
            return a.SequenceEqual(b);
        }
    
        public int GetHashCode(int[] a)
        {
            return a.Sum();
        }
    }
    

    Unfortunately, just delegating to SequenceEqual is not enough. We also have to provide a GetHashCode implementation for this to work. As a simple solution, we can use the sum of the numbers in the array here. Usually, we would want to provide a strong hash function, which tells a lot about the contents, but since we are only using this hash function for the Except call, we can use something simple here. (In general, we would also want to avoid creating a hash value from a mutable object)

    And when using that equality comparer, we correctly filter out the duplicate arrays:

    var c = a.Except(b, new IntArrayEqualityComparer());