Search code examples
c#sqlentity-framework-corecommon-table-expressionwindow-functions

Join a Sub-Query in EF Core


Goal: I want to select the latest versioned records of a given table:

id version name
1 0 Alek
1 1 Alex
2 0 Bob

Note: the temporal data feature is a no-go.

Ideally this is what I want EF Core to generate using CTE's:

WITH cte (Id, Version) AS 
(
    SELECT 
        Id, MAX(Version)
    FROM 
        Table
    GROUP BY 
        Id
)
SELECT a.*
FROM Table a
JOIN cte ON a.Id = cte.Id AND cte.Version = e.Version;

Note: here is an alternative but equivalent query if the above is impossible in EF Core:

SELECT a.*
FROM (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Version DESC) rn
    FROM Table a
    WHERE redacted
) a
WHERE a.rn = 1;

C# LINQ query:

var query = (from a in context.Test
            join j in (from m in context.Test
                        group m by m.Id
                        into g
                        select new { g.Key, Version = g.Max(x => x.Version) })
            on new { a.Id, a.Version } equals new { Id = j.Key, j.Version } into gj
            select a).ToQueryString();

ISSUE: the actual generated SQL query completely misses the sub-query:

SELECT [a].[Id], [a].[Version]
FROM [Employer] AS [e]

Update 2024 based on https://stackoverflow.com/users/14868997/charlieface answer

Working solution:

using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Metadata.Builders;

using var context = new MyDbContext()

var query = context.Entities
    .GroupBy(t => t.Id)
    .Select(g => g.OrderByDescending(t => t.Version).FirstOrDefault())
    .ToQueryString();

Console.WriteLine(query);

public class Entity {
    public required int Id { get; set; }
    public required int Version { get; set; }
    public required string Name { get; set; }
}

public class EntityTypeConfiguration : IEntityTypeConfiguration<Entity>
{
    public void Configure(EntityTypeBuilder<Entity> builder)
    {
        builder
            .HasKey(b => new { b.Id, b.Version });

        builder.Property(b => b.Id)
            .ValueGeneratedOnAdd();

        builder
            .Property(b => b.Name)
            .IsRequired();

        builder.Property(b => b.Version)
            .HasDefaultValue(0);
    }
}

public class MyDbContext : DbContext
{
    public DbSet<Entity> Entities { get; set; }

    public MyDbContext() : base() {}

    public MyDbContext(DbContextOptions<MyDbContext> options) : base(options) {}

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseSqlServer(@"Server=tcp:efcore-stackoverflow.database.windows.net,1433;Initial Catalog=efcore-stackoverflow;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;Authentication=""Active Directory Default"";");
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.ApplyConfigurationsFromAssembly(typeof(EntityTypeConfiguration).Assembly);
    }
}

The standard output is as follows:

SELECT [e3].[Id], [e3].[Version], [e3].[Name]
FROM (
    SELECT [e].[Id]
    FROM [Entities] AS [e]
    GROUP BY [e].[Id]
) AS [e1]
LEFT JOIN (
    SELECT [e2].[Id], [e2].[Version], [e2].[Name]
    FROM (
        SELECT [e0].[Id], [e0].[Version], [e0].[Name], ROW_NUMBER() OVER(PARTITION BY [e0].[Id] ORDER BY [e0].[Version] DESC) AS [row]
        FROM [Entities] AS [e0]
    ) AS [e2]
    WHERE [e2].[row] <= 1
) AS [e3] ON [e1].[Id] = [e3].[Id]

Solution

  • The most correct way to do a row-number query in EF Core is to let it work that out for itself

    var query = 
        context.Test
        .GroupBy(t => t.Id)
        .Select(g => g.OrderByDescending(t => t.Version).First());
    

    I'm not sure why the joined query wasn't working, but it's not very efficient anyway.