Search code examples
sql-serverentity-frameworklinqlinq-to-entitiesquery-performance

How can Entity Framework be forced to generate a SQL Inner Join on a nullable Foreign Key?


UPDATE: There is a request about this on the EF Team's UserVoice site. Vote it up

This thread about the Include statement is also related.

Why it's important

Querying with EF6 and Linq rocks! But if it involves a couple of Join Tables, and a nullable Foreign Key, it bogs down with 1000 lines of T-Sql.

If Inner Joins could be forced, it would perform with only 10 lines

For example, an EF6 project references a SQL database. There's a Student table and a Tutor table. Not every Student has a Tutor, so Student.TutorId can be null.

All Student-Tutor info is easily found with T-SQL:

SELECT s.Name, t.Name FROM Student s JOIN Tutor t ON s.TutorId = t.Id

The Linq is like so:

var result = context.Students
.Where(s => s.TutorId != null)
.Select(s => new { StudentName = s.Name, TutorName = s.Tutor.Name })
.ToList();

But EF6 generates this SQL:

SELECT [Extent1].[Name], 
    CASE WHEN ([Extent2].[FirstName] IS NULL)
       THEN N'' 
    ELSE
       [Extent2].[Name] 
    END AS [C1]
FROM  [dbo].[Student] AS [Extent1]
LEFT OUTER JOIN [dbo].[Tutor] AS [Extent2] ON [Extent1].[TutorId] = [Extent2].[Id]
WHERE [Extent1].[TutorId] IS NOT NULL

Thanks to Peter for asking about this many years ago. Hopefully there is a better answer now than to abandon Linq.

This GitHub Repository has source code for experimentation.


Solution

  • The only reliable way is if you can construct the LINQ query in such a way that the relationships are "navigated" from the required end to optional end through SelectMany, which I guess makes it not generally applicable.

    For demonstration purposes, if you write the sample query like this

    var result = db.Tutors
        .SelectMany(t => t.Students, (t, s) => new { StudentName = s.Name, TutorName = t.Name })
        .ToList();
    

    the generated SQL will be something like this

    SELECT
        [Extent1].[Id] AS [Id],
        [Extent2].[Name] AS [Name],
        [Extent1].[Name] AS [Name1]
        FROM  [dbo].[Tutors] AS [Extent1]
        INNER JOIN [dbo].[Students] AS [Extent2] ON [Extent1].[Id] = [Extent2].[TutorId]