Search code examples
sql-serverlinqlinq-to-sqlouter-joinlinqpad

Linq query where First() appears to return null when results are not found


I was presented with the following (this has been simplified for the question):

int programId = 3;
int refugeeId = 5;

var q = (  from st in Students
           join cr in Class_Rosters on st.StudentId equals cr.StudentId
           join pp in Student_Program_Part on cr.StudentId equals pp.StudentId
           from refg in (Student_Program_Participation_Values
                         .Where(rudf => rudf.ProgramParticipationId == pp.ProgramParticipationId 
                                     && rudf.UDFId == refugeeId)).DefaultIfEmpty()
          where cr.ClassId == 22898 
             && pp.ProgramId == programId
         select new
         {
             StudentId = st.StudentId,

              Refugees = refg.Value ?? "IT WAS NULL",
              Refugee = Student_Program_Participation_Values
                       .Where(rudf => rudf.ProgramParticipationId == pp.ProgramParticipationId 
                                   && rudf.RefugeeId == refugeeId)
                       .Select(rudf => (rudf.Value == null ? "IT WAS NULL" : "NOT NULL!"))
                       .First() ?? "First Returned NULL!",
          });
q.Dump();

In the above query the Student_Program_Participation_Values table does not have records for all students. The Refugees value properly returns the Value or "IT WAS NULL" when there is a missing record in Student_Program_Participation_Values. However, the Refugee column returns either "NOT NULL!" or "First Returned NULL!".

My question is, why is "First Returned NULL!" being seen since, in my experience with Linq, calling First() on an empty set should throw an exception, but in this query it appears to be doing something completely different. Note that refg.Value is never null in the database (it is either a valid value, or there is no record).

Note also that this is Linq to SQL and we are running this query in Linqpad.

To clarify, here is some sample output:

StudentId   Refugees          Refugee
22122       True              NOT NULL!
2332        IT WAS NULL       First Returned NULL!

In the above when Refugees returns "IT WAS NULL" there was no record in the Student_Program_Participation_Values table, so I expected First() to throw an exception, but instead it was null so Refugee shows "First Returned NULL!".

Any ideas?

Update: Enigmativity pushed me in the right direction by pointing out that I was stuck on the First() call when being a IQueryable the First() wasn't really a function call at all, but simply translated into "TOP 1" in the query. It was obvious when I looked at the generated SQL in LINQPad. Below is the important part of the generated SQL that makes it clear what is happening and why. I won't paste the entire thing since it's enormous and not germane to the discussion.

...
COALESCE((
    SELECT TOP (1) [t12].[value]
    FROM (
        SELECT 
            (CASE 
                WHEN 0 = 1 THEN 'IT WAS NULL'
                ELSE CONVERT(NVarChar(11), 'NOT NULL!')
             END) AS [value], [t11].[ProgramParticipationId], [t11].[UDFId]
        FROM [p_Student_Program_Participation_UDF_Values] AS [t11]
        ) AS [t12]
    WHERE ([t12].[ProgramParticipationId] = [t3].[ProgramParticipationId]) AND ([t12].[UDFId] = @p8)
), 'First Returned NULL!') AS [value3]
...

So, here you can clearly see that Linq converted the First() into TOP (1) and also determined that "IT WAS NULL" could never happen (thus the 0 = 1) since the whole thing is based on an outer join and the entire query simply coalesces into 'First Returned NULL!'.

So, it was all a perception mistake on my part not separating in my mind that Linq To SQL (and LINQ to Entities for that matter) is very different from calling the same-named methods on Lists and the like.

I hope my mistake is useful to someone else.


Solution

  • Without having your database I couldn't test this code, but try it anyway and see if it works.

    var q =
    (
        from st in Students
        join cr in Class_Rosters on st.StudentId equals cr.StudentId
        where cr.ClassId == 22898
        join pp in Student_Program_Part on cr.StudentId equals pp.StudentId
        where pp.ProgramId == programId
        select new
        {
            StudentId = st.StudentId,
            refg =
                Student_Program_Participation_Values
                    .Where(rudf =>
                        rudf.ProgramParticipationId == pp.ProgramParticipationId
                            && rudf.UDFId == refugeeId)
                    .ToArray()
        }
    ).ToArray();
    
    var q2 =
        from x in q
        from refg in x.refg.DefaultIfEmpty()
        select new
        {
            StudentId = x.StudentId,
            Refugees = refg.Value ?? "IT WAS NULL",
            Refugee = refg
                .Select(rudf => (rudf.Value == null ? "IT WAS NULL" : "NOT NULL!"))
                .First() ?? "First Returned NULL!",
        };
    
    q2.Dump();
    

    Basically the idea is to capture the records cleanly from the database, bring them in to memory, and then do all the null stuff. If this works then it is because of the failure to translate the LINQ into the same SQL. The translated SQL can sometimes be a little off so you don't get the results you expect. It's like translating English into French - you might not get the correct translation.