Search code examples
c#sql-serverlinqlinq-to-entities

Groupby inside groupby and All() performance


I am having a query which groups data twice:

var query = (from a in Context.SetA()
from b in Context.SetB().Where(x => x.aId == a.Id) // 1-to-many
from c in Context.SetC().Where(x => x.bId == b.Id) // 1-to-many
group new {...} by new {...} into g
select new
{
    g.Key.X,
    g.Key.Y,
    g.Sum(x => x....), // etc, lots of Sums
});

from (q in query
group q by true
select new 
{
    g.Key.Z,
    g.Sum(x => x.....), // etc, lots of Sums
});

This all works well, and performance is acceptable. Now when I add All (or Any, makes no difference) in both groupings

g.All(x => x.Flag)

performance drops drastically. It becomes like 10x slower, and goes from 5s to 50s. Numbers are not accurate, its just to get the idea.

I am doing grouping twice, since I have three levels of 1-to-many relationships, and I do summary on all levels (Sum of parent, sum of children and sum of grandchildren values).

Any ideas what can I do to improve performance?


Solution

  • The problem is that there are not natural SQL GROUP BY aggregates which can be mapped to All / Any applied to the grouping, so the EF SQL translation is inefficient.

    The solution is to use their aggregate equivalents.

    So instead of

    g.All(x => x.Flag)
    

    you could use

    g.Min(x => x.Flag ? 1 : 0) == 1
    

    and respectively

    g.Any(x => x.Flag)
    

    can be replaced with

    g.Max(x => x.Flag ? 1 : 0) == 1
    

    Update: Strangely, EF generates 2 MIN / MAX calls for the above expressions. Which is fixed by adding at the end (after == 1) the counterintuitive ? true : false