Search code examples
c#sql-serverlinqentity-framework-coreselect-n-plus-1

EF Core nested Linq select results in N + 1 SQL queries


I have a data model where a 'Top' object has between 0 and N 'Sub' objects. In SQL this is achieved with a foreign key dbo.Sub.TopId.

var query = context.Top
    //.Include(t => t.Sub) Doesn't seem to do anything
    .Select(t => new {
        prop1 = t.C1,
        prop2 = t.Sub.Select(s => new {
            prop21 = s.C3 //C3 is a column in the table 'Sub'
        })
        //.ToArray() results in N + 1 queries
    });
var res = query.ToArray();

In Entity Framework 6 (with lazy-loading off) this Linq query would be converted to a single SQL query. The result would be fully loaded, so res[0].prop2 would be an IEnumerable<SomeAnonymousType> which is already filled.

When using EntityFrameworkCore (NuGet v1.1.0) however the sub-collection is not yet loaded and is of type:

System.Linq.Enumerable.WhereSelectEnumerableIterator<Microsoft.EntityFrameworkCore.Storage.ValueBuffer, <>f__AnonymousType1<string>>.

The data will not be loaded until you iterate over it, resulting in N + 1 queries. When i add .ToArray() to the query (as shown in comments) the data gets fully loaded into var res, using a SQL profiler however shows this isn't achieved in 1 SQL query anymore. For each 'Top' object a query on the 'Sub' table is executed.

First specifying .Include(t => t.Sub) doesn't seem to change anything. The use of anonymous types doesn't seem to be the problem either, replacing the new { ... } blocks with new MyPocoClass { ... } doesn't change anything.

My question is: Is there a way to get behavior similar to EF6, where all data is loaded immediately?


Note: i realize that in this example the problem can be fixed by creating the anonymous objects in memory after executing the query like so:

var query2 = context.Top
    .Include(t => t.Sub)
    .ToArray()
    .Select(t => new //... select what is needed, fill anonymous types

However this is just an example, i do actually need the creation of objects to be part of the Linq query as AutoMapper uses this to fill DTOs in my project


Update: Tested with the new EF Core 2.0, issue is stil present. (21-08-2017)

Issue is tracked on aspnet/EntityFrameworkCore GitHub repo: Issue 4007

Update: A year later, this issue has been fixed in version 2.1.0-preview1-final. (2018-03-01)

Update: EF version 2.1 has been released, it includes a fix. see my answer below. (2018-05-31)


Solution

  • The GitHub issue #4007 has been marked as closed-fixed for milestone 2.1.0-preview1. And now the 2.1 preview1 has been made available on NuGet as discussed in this .NET Blog post.

    Version 2.1 proper is also released, install it with the following command:

    Install-Package Microsoft.EntityFrameworkCore.SqlServer -Version 2.1.0
    

    Then use .ToList() on the nested .Select(x => ...) to indicate the result should be fetched immediately. For my original question this looks like this:

    var query = context.Top
        .Select(t => new {
            prop1 = t.C1,
            prop2 = t.Sub.Select(s => new {
                prop21 = s.C3
            })
            .ToList() // <-- Add this
        });
    var res = query.ToArray(); // Execute the Linq query
    

    This results in 2 SQL queries being run on the database (instead of N + 1); First a plain SELECT FROM the 'Top' table and then a SELECT FROM the 'Sub' table with an INNER JOIN FROM the 'Top' table, based on Key-ForeignKey relation [Sub].[TopId] = [Top].[Id]. The results of these queries are then combined in memory.

    The result is exactly what you would expect and very similar to what EF6 would have returned: An array of anonymous type 'a which has properties prop1 and prop2 where prop2 is a List of anonymous type 'b which has a property prop21. Most importantly all of this is fully loaded after the .ToArray() call!