Search code examples
c#linq

How can I do Left Outer Join with Linq when updating an object?


I'm trying to enhance a person object by doing a join. I want to do a left outer join on firstname and lastname. Then I want to update the person object with a method (this method returns a person object back) I have available and select from there.

The issue I have here is that there is a null reference exception and I'm not sure how to prevent that. If I remove the DefaultIfEmpty I get an inner join but obviously I want the left outer. I know typically you would want to handle nulls, but I'm unsure how to properly do that in this situation.

person = from p in person
    join epd in enhancedPersonData
        on new { p.firstname, p.lastname} equals new { epd.firstname, epd.lastname} into epdJoin
    from epdj in epdJoin.DefaultIfEmpty()
    select p.UpdatePerson(epdj);

I've tried the below but it is just an inner join:

person = from p in person
    join epd in enhancedPersonData
        on new { p.firstname, p.lastname} equals new { epd.firstname, epd.lastname}
    select p.UpdatePerson(epd);

Person:

public class Person
{
    public string Firstname{ get; set; }
    public string Lastname{ get; set; }
    public string Address{ get; set; }
    
    public Person(string firstname, string lastname){
        Firstname = firstname;
        Lastname = lastname;
    }
}

UpdatePerson:

public static Person UpdatePerson (this Person schema,
    EnhancedPersonData enhancedPersonData)
{
    if (!string.IsNullOrEmpty(enhancedPersonData.Address)) 
    {
        schema.Address = enhancedPersonData.Address;
    }
}

Solution:

UpdatePerson:

public static Person UpdatePerson (this Person schema,
    EnhancedPersonData enhancedPersonData)
{
    if (!string.IsNullOrEmpty(enhancedPersonData?.Address)) 
    {
        schema.Address = enhancedPersonData.Address;
    }
}

Join Code:

person = from p in person
    join epd in enhancedPersonData
        on new { p.firstname, p.lastname} equals new { epd.firstname, epd.lastname} into epdJoin
    from epdj in epdJoin.DefaultIfEmpty()
    select p.UpdatePerson(epdj);

Solution

  • If you are using a left join, then epdj will be null if there is no matching person in enhancedPersonData.

    What is the point in updating a person with null: p.UpdatePerson(null);?

    Since a class is a reference type, the original person collection will automatically contain the updated persons. It makes no sense to re-assign the returned persons to it. You can safely discard the result of the query, then the left join is no more necessary

    _ = from p in person
        join epd in enhancedPersonData
            on new { p.firstname, p.lastname} equals new { epd.firstname, epd.lastname}
        select p.UpdatePerson(epd);
    

    Note that in C# _ is a discard.


    Solutions

    1. Remove null reference exception

      After you have posted your code, I see where your null reference exception occurs. It is in the line

      if (!string.IsNullOrEmpty(enhancedPersonData.Address)) 
      

      If you use left join, then enhancedPersonData can be null. Therefore change the line to

      if (!String.IsNullOrEmpty(enhancedPersonData?.Address)) 
      
    2. Convert IEnumerable to List

      Other possible solution: Since person is an IEnumerable<Person>, it is unknown to me what the real data source is. Is it a collection or the result of a database query? In the latter case my approach (no left join, assign to _) won't work. However, you can make it work by declaring person as a List<Person> instead. Because otherwise, the real data source will be required every time you enumerate the IEnumerable<Person> person.

    Note that an IEnumerable<T> is not a collection. It provides a means to enumerate a collection or a data source.


    Also, the assignment in UpdatePerson should be

    schema.Address = enhancedPersonData.Address; // With a lower case enhancedPersonData
    

    instead of

    schema.Address = EnhancedPersonData.Address;
    

    otherwise you might be accessing a static property, which of course would have the same value for all objects.