I want to build an expression for IQueryable GroupBy. While at the moment I'm just simplifying the problem to try and get it working, the eventual final implementation will involve the creation of quite complex expression trees so I want to build a complete expression that can then be integrated into other expressions.
I specifically want to build an expression of this overload:
public static System.Linq.IQueryable<TResult> GroupBy<TSource,TKey,TResult> (
    this System.Linq.IQueryable<TSource> source,
    System.Linq.Expressions.Expression<Func<TSource,TKey>> keySelector,
    System.Linq.Expressions.Expression<Func<TKey,System.Collections.Generic.IEnumerable<TSource>,TResult>> resultSelector);
... my problem is in the implementation of the resultSelector and its IEnumerable<TSource> parameter.
I have a table of Customers (just dummy data for the purposes of working out this problem). This is stored in an SQL DB and I specifically want to use IQueryable to access the data.
public class Customer
{
    public int Id { get; set; }
    public string? FirstName { get; set; }
    public string? LastName { get; set; }
    public int Age { get; set; }
}
I also have a GroupResult class used to hold the results of the GroupBy (it has several constructors, which I've been using in my testing to work out where my problem is occurring):
internal class GroupResult
{
    public string? Name { get; set; }
    public int NumRecords { get; set; }
    public decimal AverageAge { get; set; }
    public int TotalAge { get; set; }

    public GroupResult() { }

    public GroupResult(string name)
    {
        Name = name;
    }

    public GroupResult(IEnumerable<Customer> customers)
    {
        Name = Guid.NewGuid().ToString();
        NumRecords = customers.Count();
    }

    public GroupResult(string name, IEnumerable<Customer> customers)
    {
        Name = name;
        NumRecords = customers.Count();
    }
}
The main static class displays prompts to select the column to group on, creates the relevant expression tree and executes it:
internal static class SimpleGroupByCustomer
{
    internal static DataContext db;

    internal static void Execute()
    {
        using (db = new DataContext())
        {
            //get input
            Console.WriteLine();
            Console.WriteLine("Simple Customer GroupBy");
            Console.WriteLine("=======================");
            Console.WriteLine("Simple GroupBy on the Customer Table");
            Console.WriteLine();
            Console.WriteLine("Select the property that you want to group by.");
            Console.WriteLine();
            var dbSet = db.Set<Customer>();
            var query = dbSet.AsQueryable();
            //for this example we're just prompting for a column in the customer table
            //GetColumnName is a helper function that lists the available columns and allows
            //one to be selected
            string colName = Wrapper.GetColumnName("Customer");
            MethodInfo? method = typeof(SimpleGroupByCustomer).GetMethod("GetGroupBy",
                BindingFlags.Static | BindingFlags.NonPublic);
            if (method != null)
            {
                method = method.MakeGenericMethod(new Type[] { typeof(String), query.ElementType });
                method.Invoke(null, new object[] { query, colName });
            }
        }
    }

    internal static void GetGroupBy<T, TTable>(IQueryable query, string colName)
    {
        Type TTmp = typeof(TTable);
        var param = Expression.Parameter(TTmp, "c");
        var prop = Expression.PropertyOrField(param, colName);
        LambdaExpression keySelector = Expression.Lambda<Func<TTable, T>>(prop, param);
        var param1 = Expression.Parameter(typeof(T), "Key");
        var param2 = Expression.Parameter(typeof(IEnumerable<TTable>), "Customers");
        var ci = typeof(GroupResult).GetConstructor(new[] { typeof(T), typeof(IEnumerable<TTable>) });
        //var ci = typeof(GroupResult).GetConstructor(new[] { typeof(T) });
        //var ci = typeof(GroupResult).GetConstructor(new[] { typeof(IEnumerable<TTable>) });
        if (ci == null)
            return;
        var pExp = new ParameterExpression[] { param1, param2 };
        var methodExpression = Expression.Lambda<Func<T, IEnumerable<TTable>, GroupResult>>(
            Expression.New(ci, new Expression[] { param1, param2 }), //<--- ERROR HERE
            pExp
        );
        Type[] typeArgs = new Type[] { typeof(TTable), typeof(T), typeof(GroupResult) };
        Expression[] methodParams = new Expression[] { query.Expression, keySelector, methodExpression };
        var resultExpression = Expression.Call(typeof(Queryable), "GroupBy", typeArgs, methodParams);
        IQueryable dbQuery = query.Provider.CreateQuery(resultExpression);
        if (dbQuery is IQueryable<GroupResult> results)
        {
            foreach (var result in results)
            {
                Console.WriteLine("{0,-15}\t{1}", result.Name, result.NumRecords.ToString());
            }
        }
    }
}
When I run this and try to iterate through the results, I get the following exception:
System.InvalidOperationException: 'variable 'Customers' of type 'System.Collections.Generic.IEnumerable`1[ExpressionTrees3.Data.Customer]' referenced from scope '', but it is not defined'
which is being caused by the param2 ParameterExpression marked above.
If I use the GroupResult constructor that just takes the key value
var ci = typeof(GroupResult).GetConstructor(new[] { typeof(T) });
and omit the param2 from the Lambda body definition the code works as expected and I get a collection of GroupResult records containing the distinct key values in the Name field (but obviously no summary value).
I've tried everything I can think of and just can't get past this error - it's as though the GroupBy is not actually producing the IEnumerable grouping of Customers for each key.
I suspect I'm missing something really obvious here, but just can't see it. Any help would be very much appreciated.
Please note that I am after answers to this specific issue, I'm not looking for alternative ways of doing a GroupBy (unless there's a fundamental reason why this shouldn't work) - this will be rolled into a much larger solution for building queries and I want to use the same process throughout.
Thanks Svyatoslav - as I thought, it was me being especially dumb!
Your comments, as well as a discussion with a friend who has a lot of SQL knowledge, pointed me in the right direction.
I had been thinking that the GroupBy expression was going to return an Enumerable for each key value and was trying to pass that into a function ... it always felt wrong, but I just ignored that and kept going.
It's obvious now that I need to tell the GroupBy what to calculate and return (i.e. your comment about aggregation).
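To make the idea concrete, here is a minimal sketch of what the result selector looks like when written as an ordinary lambda: it aggregates the group (here with Count) rather than handing the group itself to the constructor. The in-memory list, the sample rows and the trimmed-down Customer and GroupResult classes are assumptions for illustration only; the real code runs against a DbSet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the Customers table.
IQueryable<Customer> customers = new List<Customer>
{
    new Customer { Id = 1, LastName = "Smith", Age = 30 },
    new Customer { Id = 2, LastName = "Smith", Age = 40 },
    new Customer { Id = 3, LastName = "Jones", Age = 50 },
}.AsQueryable();

// The result selector reduces each group to a scalar (Count) and passes
// only the key and that scalar to the GroupResult constructor.
List<GroupResult> results = customers
    .GroupBy(c => c.LastName, (key, grp) => new GroupResult(key, grp.Count()))
    .ToList();

foreach (var r in results)
    Console.WriteLine($"{r.Name}: {r.NumRecords}");

public class Customer
{
    public int Id { get; set; }
    public string LastName { get; set; } = "";
    public int Age { get; set; }
}

public class GroupResult
{
    public string Name { get; }
    public int NumRecords { get; }

    public GroupResult(string name, int count)
    {
        Name = name;
        NumRecords = count;
    }
}
```

The expression tree built by hand needs to have this same shape: a call to Count on the group parameter inside the result selector body.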
So for this easy example, the solution is very simple:
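// Note: ci must now be resolved for the (string, int) constructor, e.g.
// typeof(GroupResult).GetConstructor(new[] { typeof(T), typeof(int) })
var pExp = new ParameterExpression[] { param1, param2 };
var countTypes = new Type[] { typeof(TTable) };
var countParams = new Expression[] { param2 };
// Enumerable.Count<TTable>(Customers): aggregate the group instead of passing it on
var countExp = Expression.Call(typeof(Enumerable), "Count", countTypes, countParams);
var methodExpression = Expression.Lambda<Func<T, IEnumerable<TTable>, GroupResult>>(
    Expression.New(ci, new Expression[] { param1, countExp }),
    pExp
);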
Just by adding the 'Count' expression to the result selector passed to the GroupBy call, it works!
... along with a new constructor for GroupResult:
public GroupResult(string name, int count)
{
    Name = name;
    NumRecords = count;
}
(yep, I feel a bit stupid!)
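For anyone who wants to try the fix in isolation, the following is a self-contained sketch of the same approach. It is not the original program: GroupByBuilder and CountBy are hypothetical names, the key type is fixed to string, and an in-memory list wrapped with AsQueryable stands in for the DbSet, so it only demonstrates that the tree is well formed, not how a particular database provider translates it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical in-memory stand-in for the Customers table.
IQueryable<Customer> source = new List<Customer>
{
    new Customer { Id = 1, LastName = "Smith", Age = 30 },
    new Customer { Id = 2, LastName = "Smith", Age = 40 },
    new Customer { Id = 3, LastName = "Jones", Age = 50 },
}.AsQueryable();

List<GroupResult> results = GroupByBuilder.CountBy(source, "LastName").ToList();
foreach (var r in results)
    Console.WriteLine($"{r.Name}: {r.NumRecords}");

public static class GroupByBuilder
{
    // Builds: source.GroupBy(c => c.<colName>,
    //             (Key, Customers) => new GroupResult(Key, Customers.Count()))
    public static IQueryable<GroupResult> CountBy<TTable>(IQueryable<TTable> source, string colName)
    {
        // Key selector: c => c.<colName> (assumed to be a string column)
        var c = Expression.Parameter(typeof(TTable), "c");
        var keySelector = Expression.Lambda<Func<TTable, string>>(
            Expression.PropertyOrField(c, colName), c);

        // Result selector parameters: the key and the group of rows for that key
        var key = Expression.Parameter(typeof(string), "Key");
        var grp = Expression.Parameter(typeof(IEnumerable<TTable>), "Customers");

        // Enumerable.Count<TTable>(Customers): the aggregate computed per group
        var count = Expression.Call(typeof(Enumerable), "Count", new[] { typeof(TTable) }, grp);

        var ci = typeof(GroupResult).GetConstructor(new[] { typeof(string), typeof(int) })
                 ?? throw new InvalidOperationException("GroupResult(string, int) not found");

        var resultSelector = Expression.Lambda<Func<string, IEnumerable<TTable>, GroupResult>>(
            Expression.New(ci, key, count), key, grp);

        // Queryable.GroupBy<TTable, string, GroupResult>(source, keySelector, resultSelector)
        var call = Expression.Call(typeof(Queryable), "GroupBy",
            new[] { typeof(TTable), typeof(string), typeof(GroupResult) },
            source.Expression, keySelector, resultSelector);

        return source.Provider.CreateQuery<GroupResult>(call);
    }
}

public class Customer
{
    public int Id { get; set; }
    public string LastName { get; set; } = "";
    public int Age { get; set; }
}

public class GroupResult
{
    public string Name { get; }
    public int NumRecords { get; }

    public GroupResult(string name, int count)
    {
        Name = name;
        NumRecords = count;
    }
}
```

With the sample rows above this prints "Smith: 2" followed by "Jones: 1". Other aggregates for the same group parameter could presumably be added in the same way, each as its own Expression.Call fed into a matching constructor argument.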