Search code examples
linqentity-framework-coreef-core-7.0

Generated SQL from Linq GroupJoin is ignoring all conditions


I'm trying to query the number of tickets open per hour in a single day using LINQ to SQL query syntax. I'm getting the right results however the generated SQL is very different from what I would expect.

My query on LinqPad Is:

int[] hours = Enumerable.Range(0, 24).ToArray();
DateTime Nov = new DateTime(2023, 11, 14);
var q = (
            from h in hours
            join t in Tickets on h equals t.CreatedAt.Hour into ht
            from lht in ht.Where(tk => (tk.CreatedAt.Date == Nov.Date) && (tk.OrganizationId == 1) && (!tk.IsDeleted) ).DefaultIfEmpty()
            group lht by h into ftable
            select new {id = ftable.Key, count  = ftable.Count(f => f != null)}
        ).Dump();

and for some reason, the generated SQL query is a simple selection of the entire tickets table:

SELECT `t`.`Id`, `t`.`ClosedAt`, `t`.`CreatedAt`, `t`.`CreatedBy`, `t`.`isDeleted`, `t`.`LastUpdatedAt`, `t`.`LastUpdatedBy`, `t`.`LinkedSessionId`, `t`.`OrganizationId`, `t`.`Priority`, `t`.`PropertyId`, `t`.`PublicId`, `t`.`RecipientId`, `t`.`Source`, `t`.`Status`, `t`.`Subject`, `t`.`TicketId`
FROM `tickets` AS `t`

I'm not sure what is happening here but it looks like the line join t in Tickets on h equals t.CreatedAt.Hour into ht is simply fetching the whole table and then doing the rest of the grouping and where conditions in memory, which is hugely inefficient.

Has anyone encountered something like this before or can explain this behavior, and how can I force it to generate a normal LEFT JOIN with WHERE and GROUP BY clauses in the generated sql.


Solution

  • If you start query from local collection, it means you will use Enumerable extensions which means loading whole tables into memory and no filters will be applied. I would suggest to group existing data on the server and later enrich data with missed hours:

    DateTime Nov = new DateTime(2023, 11, 14);
    var startDate = Nov;
    var endDate = startDate.AddDays(1);
    
    var query =
        from t in Tickets
        where t.CreatedAt >= startDate && t.CreatedAt < endDate
        group t by t.CreatedAt.Hour into g
        select new 
        {
            id = g.Key, 
            count = g.Count()
        };
    
    int[] hours = Enumerable.Range(0, 24).ToArray();
    
    var enriched = 
        from h in hours
        join d in query on h equals d.id into gj
        from d in gj.DefaultIfEmpty()
        select new
        {
            id = h,
            count = d == null ? 0 : d.count
        };
    
    enriched.Dump();