c#, linq, lambda, linqpad

How to compare two lists on a combination of two properties and select the rows that mismatch on a third property?


I want to compare the following two lists on a combination of two properties (Country and City).

When compared, India-Chennai is present in both lists with the same value (1). Similarly, UK-London is present in both lists with the same value (3). However, although USA-New York is present in both lists, its values do not match (2 in list1 and 5 in list2).

Please help me write the shortest possible LINQ expression that selects only '2-USA-New York' from list1, because its value does not match the one in list2 ('5-USA-New York').

void Main()
{
    List<A> list1 = new List<A> {
        new A { Country = "India", City = "Chennai", Value = 1 },
        new A { Country = "USA", City = "New York", Value = 2 },
        new A { Country = "UK", City = "London", Value = 3 }
    };

    List<A> list2 = new List<A> {
        new A { Country = "India", City = "Chennai", Value = 1 },
        new A { Country = "USA", City = "New York", Value = 5 },
        new A { Country = "UK", City = "London", Value = 3 }
    };

    list1.Dump();
    list2.Dump();
}

class A
{
    public int Value { get; set; }
    public string Country { get; set; }
    public string City { get; set; }
}

Solution

  • Assuming there are no duplicated { Country, City } pairs in your lists:

    var list1Mismatched = list1
        .Join(list2, 
              left => new { left.Country, left.City },
              right => new { right.Country, right.City }, 
              (left, right) => new { left, right })
        .Where(x => x.left.Value != x.right.Value)
        .Select(x => x.left)
        .ToList();
    

    This works because, in leftList.Join(rightList, leftMatchBy, rightMatchBy, matchedPairResultSelector), the match key is an instance of an anonymous type. Anonymous types get compiler-generated Equals and GetHashCode overrides that compare property values, so new { Foo = 1 } and new { Foo = 1 } are equal and have the same hash code even though they are two different instances. (The == operator still compares references, but Join relies on Equals and GetHashCode.)
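To make the equality behaviour concrete, here is a minimal sketch written as a plain console program rather than a LINQPad script. The class and method names (AnonymousKeyDemo, KeysMatch) are made up for illustration and are not part of the answer above.

```csharp
using System;

public static class AnonymousKeyDemo
{
    // Hypothetical helper: builds two separate anonymous objects with the
    // same property names and values, then checks that they are equal by
    // value, hash alike, and are nevertheless distinct instances.
    public static bool KeysMatch(string country, string city)
    {
        var a = new { Country = country, City = city };
        var b = new { Country = country, City = city };

        return a.Equals(b)
            && a.GetHashCode() == b.GetHashCode()
            && !ReferenceEquals(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(KeysMatch("USA", "New York")); // prints True
    }
}
```

Note that both key selectors in the Join must produce the same property names, in the same order, for the two keys to be of the same anonymous type. In the query above, new { left.Country, left.City } and new { right.Country, right.City } both infer the names Country and City, so they match.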

    Join builds a hash-based lookup of list2 keyed by the match key and probes it once per item of list1, which gives roughly linear complexity, O(n), in contrast with the Where(Any()) solution, which is quadratic, O(n^2). If you are interested, I recently wrote a small performance test comparing these two approaches.
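For comparison, the sketch below puts the two approaches side by side on the question's data. The Where(Any()) solution is not shown in this answer, so the ViaWhereAny method is only an assumption about what such a query might look like. The names MismatchDemo, ViaJoin, ViaWhereAny and SampleList are hypothetical as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A
{
    public int Value { get; set; }
    public string Country { get; set; }
    public string City { get; set; }
}

public static class MismatchDemo
{
    // Hash-based: Join indexes list2 by (Country, City) once,
    // then looks up each list1 item in that index.
    public static List<A> ViaJoin(List<A> list1, List<A> list2) =>
        list1.Join(list2,
                   left => new { left.Country, left.City },
                   right => new { right.Country, right.City },
                   (left, right) => new { left, right })
             .Where(x => x.left.Value != x.right.Value)
             .Select(x => x.left)
             .ToList();

    // Assumed shape of the Where(Any()) alternative: for every list1 item,
    // scan list2 from the start, hence the quadratic cost.
    public static List<A> ViaWhereAny(List<A> list1, List<A> list2) =>
        list1.Where(l => list2.Any(r => r.Country == l.Country
                                     && r.City == l.City
                                     && r.Value != l.Value))
             .ToList();

    // Builds the question's sample data; only the New York value varies.
    public static List<A> SampleList(int newYorkValue) => new List<A>
    {
        new A { Country = "India", City = "Chennai", Value = 1 },
        new A { Country = "USA", City = "New York", Value = newYorkValue },
        new A { Country = "UK", City = "London", Value = 3 }
    };

    public static void Main()
    {
        var list1 = SampleList(2);
        var list2 = SampleList(5);

        foreach (var a in ViaJoin(list1, list2))
            Console.WriteLine($"{a.Value}-{a.Country}-{a.City}"); // 2-USA-New York

        foreach (var a in ViaWhereAny(list1, list2))
            Console.WriteLine($"{a.Value}-{a.Country}-{a.City}"); // 2-USA-New York
    }
}
```

The two queries agree here because each { Country, City } pair occurs at most once per list. If list2 contained duplicate pairs, the Join version would return a list1 item once per mismatching match, whereas the Where(Any()) version would return it at most once.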