Search code examples
c#.netlinqouter-joinfull-outer-join

LINQ - Full Outer Join


I have a list of people's ID and their first name, and a list of people's ID and their surname. Some people don't have a first name and some don't have a surname; I'd like to do a full outer join on the two lists.

So the following lists:

ID  FirstName
--  ---------
 1  John
 2  Sue

ID  LastName
--  --------
 1  Doe
 3  Smith

Should produce:

ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue
 3             Smith

I have found quite a few solutions for 'LINQ Outer Joins' which all look quite similar, but really seem to be left outer joins.

My attempts so far go something like this:

private void OuterJoinTest()
{
    List<FirstName> firstNames = new List<FirstName>();
    firstNames.Add(new FirstName { ID = 1, Name = "John" });
    firstNames.Add(new FirstName { ID = 2, Name = "Sue" });

    List<LastName> lastNames = new List<LastName>();
    lastNames.Add(new LastName { ID = 1, Name = "Doe" });
    lastNames.Add(new LastName { ID = 3, Name = "Smith" });

    var outerJoin = from first in firstNames
        join last in lastNames
        on first.ID equals last.ID
        into temp
        from last in temp.DefaultIfEmpty()
        select new
        {
            id = first != null ? first.ID : last.ID,
            firstname = first != null ? first.Name : string.Empty,
            surname = last != null ? last.Name : string.Empty
        };
    }
}

public class FirstName
{
    public int ID;
    
    public string Name;
}
    
public class LastName
{
    public int ID;
    
    public string Name;
}

But this returns:

ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue

What am I doing wrong?


Solution

  • I don't know if this covers all cases, logically it seems correct. The idea is to take a left outer join and right outer join then take the union of the results.

    var firstNames = new[]
    {
        new { ID = 1, Name = "John" },
        new { ID = 2, Name = "Sue" },
    };
    var lastNames = new[]
    {
        new { ID = 1, Name = "Doe" },
        new { ID = 3, Name = "Smith" },
    };
    var leftOuterJoin =
        from first in firstNames
        join last in lastNames on first.ID equals last.ID into temp
        from last in temp.DefaultIfEmpty()
        select new
        {
            first.ID,
            FirstName = first.Name,
            LastName = last?.Name,
        };
    var rightOuterJoin =
        from last in lastNames
        join first in firstNames on last.ID equals first.ID into temp
        from first in temp.DefaultIfEmpty()
        select new
        {
            last.ID,
            FirstName = first?.Name,
            LastName = last.Name,
        };
    var fullOuterJoin = leftOuterJoin.Union(rightOuterJoin);
    

    This works as written since it is in LINQ to Objects. If LINQ to SQL or other, the query processor might not support safe navigation or other operations. You'd have to use the conditional operator to conditionally get the values.

    i.e.,

    var leftOuterJoin =
        from first in firstNames
        join last in lastNames on first.ID equals last.ID into temp
        from last in temp.DefaultIfEmpty()
        select new
        {
            first.ID,
            FirstName = first.Name,
            LastName = last != null ? last.Name : default,
        };