Search code examples
c#lambdagroup-byexpressionexpression-trees

Build relational grouping expression tree c#


Context:

Using Ag-Grid, users should be able to drag-drop columns they want to group on.

enter image description here

Let's say I have the following model and group by function:

List<OrderModel> orders = new List<OrderModel>()
{
    new OrderModel()
    {
        OrderId = 184214,
        Contact = new ContactModel()
        {
            ContactId = 1000
        }
    }
};

var queryOrders = orders.AsQueryable();

Edit: So people have made me realize that in below question, I was actually focusing on dynamically Select the correct items (which is one of the requirements), I missed out on actually doing the grouping. Therefore some edits have been made to reflect both issues: Grouping and selecting, strongly typed.

In a type-defined way:

Single column

IQueryable<OrderModel> resultQueryable = queryOrders
    .GroupBy(x => x.ExclPrice)
    .Select(x => new OrderModel() { ExclPrice = x.Key.ExclPrice});

Multiple columns

 IQueryable<OrderModel> resultQueryable = queryOrders
            .GroupBy(x => new OrderModel() { Contact = new ContactModel(){ ContactId = x.Contact.ContactId }, ExclPrice = x.ExclPrice})
            .Select(x => new OrderModel() {Contact = new ContactModel() {ContactId = x.Key.Contact.ContactId}, ExclPrice = x.Key.ExclPrice});

However, the last one doesn't work, defining an OrderModel within the GroupBy apparently gives issues when translating it to SQL.

How do I build this GroupBy/ Select using Expressions?

Currently, I have got so far to select the correct items, but no grouping is done yet.

public static IQueryable<TModel> GroupByExpression(List<string> propertyNames, IQueryable<TModel> sequence)
{
    var param = Expression.Parameter(typeof(TModel), "item");
    var body = Expression.New(typeof(TModel).GetConstructors()[0]);
    var bindings = new List<MemberAssignment>();
    foreach (var property in propertyNames)
    {
        var fieldValue = typeof(TModel).GetProperty(property, BindingFlags.Public | BindingFlags.Static | BindingFlags.Instance | BindingFlags.IgnoreCase);

        var fieldValueOriginal = Expression.Property(param, fieldValue ?? throw new InvalidOperationException());

        var memberAssignment = Expression.Bind(fieldValue, fieldValueOriginal);
        bindings.Add(memberAssignment);
    }
    var result = sequence.Select(Expression.Lambda<Func<TModel, TModel>>(Expression.MemberInit(body, bindings), param));
    return result;
}

This works fine until I want to introduce a relationship, so in my example, item.Contact.ContactId.

I have tried to do it this way:

public static IQueryable<TModel> GroupByExpression(List<string> propertyNames, IQueryable<TModel> sequence)
{
    var param = Expression.Parameter(typeof(TModel), "item");
    Expression propertyExp = param;
    var body = Expression.New(typeof(TModel).GetConstructors()[0]);
    var bindings = new List<MemberAssignment>();
    foreach (var property in propertyNames)
    {
        if (property.Contains("."))
        {
            //support nested, relation grouping
            string[] childProperties = property.Split('.');
            var prop = typeof(TModel).GetProperty(childProperties[0], BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.IgnoreCase);
            propertyExp = Expression.MakeMemberAccess(param, prop);
            //loop over the rest of the childs until we have reached the correct property
            for (int i = 1; i < childProperties.Length; i++)
            {
                prop = prop.PropertyType.GetProperty(childProperties[i],
                    BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.IgnoreCase);
                propertyExp = Expression.MakeMemberAccess(propertyExp, prop);

                if (i == childProperties.Length - 1)//last item, this would be the grouping field item
                {
                    var memberAssignment = Expression.Bind(prop, propertyExp);
                    bindings.Add(memberAssignment);
                }
            }
        }
        else
        {
            var fieldValue = typeof(TModel).GetProperty(property, BindingFlags.Public | BindingFlags.Static | BindingFlags.Instance | BindingFlags.IgnoreCase);

            var fieldValueOriginal = Expression.Property(param, fieldValue ?? throw new InvalidOperationException());

            var memberAssignment = Expression.Bind(fieldValue, fieldValueOriginal);
            bindings.Add(memberAssignment);
        }


    }
    var memInitExpress = Expression.MemberInit(body, bindings);
    var result = sequence.Select(Expression.Lambda<Func<TModel, TModel>>(memInitExpress, param));
    return result;
}

Might look promising, but unfortunately, it throws an error at var memInitExpress = Expression.MemberInit(body, bindings);

ArgumentException ''ContactId' is not a member of type 'OrderModel''

So this is how the expression looks like when grouping on multiple columns:

Result of Expression.MemberInit(body, bindings) is: {new OrderModel() {TotalInclPrice = item.TotalInclPrice, OrderId = item.OrderId}}

So the entire expression is: {item => new OrderModel() {TotalInclPrice = item.TotalInclPrice, OrderId = item.OrderId}}

So now it is not so difficult to understand why I get the exception I mentioned, simply because it is using the OrderModel to Select the properties, and ContactId is not in that model. However I am limited and required to stick to IQueryable<OrderModel>, so the question now is how to create the expression to group by ContactId using the same model. I would guess I would actually need to have a expression with this:

Result of Expression.MemberInit(body, bindings) would need to be: {new OrderModel() { Contact = new ContactModel() { ContactId = item.Contact.ContactId} , OrderId = item.OrderId}}. Something like this?

So, I thought let's go back to the basics and do it step by step. Eventually, the for-loop creates the following expression. See my answer how I solve this part, Ivan's answer seems to have solved this in a generic way but I did not test that code yet. However, this does not do the grouping yet, so after applying grouping, these answers might not work anymore.

FYI: The AgGrid can find property relationships by just supplying the column field contact.contactId. So when the data is loaded, it just tries to find that property. I think when above expression is created, it would work within the Grid. I am trying myself now as well how to create sub-MemberInit's, because I think that is the solution in order to successfully do it.


Solution

  • There are two parts in this answer:

    1. Create a GroupBy expression and make sure the same return type is used.
    2. Create a Select expression from the result of the GroupBy expression

    SELECT & GROUPING - non-generic

    So, the complete solution is below, but to give you an idea on how this works, see this piece of code, this is written in a non-generic version. The code for grouping is almost the same, the tiny difference is that a Key. property is added to the beginning.

    public static IQueryable<TModel> GroupByExpression(List<string> propertyNames, IQueryable<TModel> sequence)
    {
        var param = Expression.Parameter(typeof(TModel), "item");
        Expression propertyExp = param;
        var body = Expression.New(typeof(TModel).GetConstructors()[0]);
        var bindings = new List<MemberAssignment>();
        var queryOrders = orders.AsQueryable();
        var orderBindings = new List<MemberAssignment>();
    
        //..more code was here, see question
    
        var orderParam = Expression.Parameter(typeof(OrderModel), "item");
        Expression orderPropertyExp = orderParam;
        var orderPropContact = typeof(OrderModel).GetProperty("Contact");
        orderPropertyExp = Expression.MakeMemberAccess(orderPropertyExp, orderPropContact);
        var orderPropContactId = orderPropContact.PropertyType.GetProperty("ContactId");
        orderPropertyExp = Expression.MakeMemberAccess(orderPropertyExp, orderPropContactId);
    
        var contactBody = Expression.New(typeof(ContactModel).GetConstructors()[0]);
        var contactMemerAssignment = Expression.Bind(orderPropContactId, propertyExp);
        orderBindings.Add(contactMemerAssignment);
        var contactMemberInit = Expression.MemberInit(Expression.New(contactBody, orderBindings);
    
        var orderContactMemberAssignment = Expression.Bind(orderPropContact, contactMemberInit);
    
        var orderMemberInit = Expression.MemberInit(Expression.New(typeof(OrderModel).GetConstructors()[0]), new List<MemberAssignment>() {orderContactMemberAssignment});
    
        //during debugging with the same model, I know TModel is OrderModel, so I can cast it
        //of course this is just a quick hack to verify it is working correctly in AgGrid, and it is!
        return (IQueryable<TModel>)queryOrders.Select(Expression.Lambda<Func<OrderModel, OrderModel>>(orderMemberInit, param));
    }
    

    So now we need to do that in a generic way.

    Grouping:

    To do the grouping in a generic way, I found this amazing post, he deserves a ton of credit to develop that part. However I had to modify it to make sure it also supports sub-relationships. In my example: Order.Contact.ContactId.

    I first wrote this recursive method to correctly get the MemberAssignment bindings.

        /// <summary>
        /// Recursive get the MemberAssignment
        /// </summary>
        /// <param name="param">The initial paramter expression: var param =  Expression.Parameter(typeof(T), "item");</param>
        /// <param name="baseType">The type of the model that is being used</param>
        /// <param name="propEx">Can be equal to 'param' or when already started with the first property, use:  Expression.MakeMemberAccess(param, prop);</param>
        /// <param name="properties">The child properties, so not all the properties in the object, but the sub-properties of one property.</param>
        /// <param name="index">Index to start at</param>
        /// <returns></returns>
        public static MemberAssignment RecursiveSelectBindings(ParameterExpression param, Type baseType, Expression propEx, string[] properties, int index)
        {
            //Get the first property from the list.
            var prop = baseType.GetProperty(properties[index], BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.IgnoreCase);
            var leftProperty = prop;
            Expression selectPropEx = Expression.MakeMemberAccess(propEx, prop);
            //If this is the last property, then bind it and return that Member assignment
            if (properties.Length - 1 == index)
            {
                var memberAssignment = Expression.Bind(prop, selectPropEx);
                return memberAssignment;
            }
    
            //If we have more sub-properties, make sure the sub-properties are correctly generated.
            //Generate a "new Model() { }"
            NewExpression selectSubBody = Expression.New(leftProperty.PropertyType.GetConstructors()[0]);
            //Get the binding of the next property (recursive)
            var getBinding = RecursiveSelectBindings(param, prop.PropertyType, selectPropEx, properties, index + 1);
    
            MemberInitExpression selectSubMemberInit =
                Expression.MemberInit(selectSubBody, new List<MemberAssignment>() { getBinding });
    
            //Finish the binding by generating "new Model() { Property = item.Property.Property } 
            //During debugging the code, it will become clear what is what.
            MemberAssignment selectSubMemberAssignment = Expression.Bind(leftProperty, selectSubMemberInit);
    
            return selectSubMemberAssignment;
        }
    

    Then thereafter, I could modify the Select<T> method of the post I mentioned:

        static Expression Select<T>(this IQueryable<T> source, string[] fields)
        {
            var itemType = typeof(T);
            var groupType = itemType; //itemType.Derive();
            var itemParam = Expression.Parameter(itemType, "x");
    
    
            List<MemberAssignment> bindings = new List<MemberAssignment>();
            foreach (var property in fields)
            {
                Expression propertyExp;
                if (property.Contains("."))
                {
                    string[] childProperties = property.Split('.');
                    var binding = RecursiveSelectBindings(itemParam, itemType, itemParam, childProperties, 0);
                    bindings.Add(binding);
                }
                else
                {
                    var fieldValue = groupType.GetProperty(property, BindingFlags.Public | BindingFlags.Static | BindingFlags.Instance | BindingFlags.IgnoreCase);
                    var fieldValueOriginal = Expression.Property(itemParam, fieldValue ?? throw new InvalidOperationException());
    
                    var memberAssignment = Expression.Bind(fieldValue, fieldValueOriginal);
                    bindings.Add(memberAssignment);
                }
            }
    
            var selector = Expression.MemberInit(Expression.New(groupType), bindings.ToArray());
            return Expression.Lambda(selector, itemParam);
        }
    

    This above code is called by below code (which I didn't modify), but you can see it returns IQueryable<IGrouping<T,T>> type.

        static IQueryable<IGrouping<T, T>> GroupEntitiesBy<T>(this IQueryable<T> source, string[] fields)
        {
            var itemType = typeof(T);
            var method = typeof(Queryable).GetMethods()
                         .Where(m => m.Name == "GroupBy")
                         .Single(m => m.GetParameters().Length == 2)
                         .MakeGenericMethod(itemType, itemType);
    
            var result = method.Invoke(null, new object[] { source, source.Select(fields) });
            return (IQueryable<IGrouping<T, T>>)result;
        }
    

    SELECT

    So we have now done the GroupBy expression, what we now need to do is the Select expression. As I said before it is almost equal to the GroupBy, the only difference is that we have to add Key. infront of each property. This is because the Key is the result of the GroupBy, hence you need to start with this.

        public static IQueryable<TModel> GroupByExpression(List<string> propertyNames, IQueryable<TModel> sequence)
        {
           var grouping = sequence.GroupBy(propertyNames.ToArray());
    
            var selectParam = Expression.Parameter(grouping.ElementType, "item");
            Expression selectPropEx = selectParam;
            var selectBody = Expression.New(typeof(TModel).GetConstructors()[0]);
            var selectBindings = new List<MemberAssignment>();
            foreach (var property in propertyNames)
            {
                var keyProp = "Key." + property;
                //support nested, relation grouping
                string[] childProperties = keyProp.Split('.');
                var prop = grouping.ElementType.GetProperty(childProperties[0], BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.IgnoreCase);
                selectPropEx = Expression.MakeMemberAccess(selectParam, prop);
    
                var binding = PropertyGrouping.RecursiveSelectBindings(selectParam, prop.PropertyType, selectPropEx, childProperties, 1);
                selectBindings.Add(binding);
            }
    
            MemberInitExpression selectMemberInit = Expression.MemberInit(selectBody, selectBindings);
    
            var queryable = grouping
                .Select(Expression.Lambda<Func<IGrouping<TModel, TModel>, TModel>>(selectMemberInit, selectParam));
            return queryable;
    
        }
    

    GetHashCode()

    Unfortunately, this still didn't work, up until I started implementing GetHasCode() and Equals() in each model that is used. During Count() or executing the query by doing .ToList() it will compare all objects to make sure objects are equal (or not) to each other. If they are equal: Same group. But because we generated those models on the fly, it does not have a way to correctly compare those objects based on memory location (by default).

    Luckily, you can generate those 2 methods very easily:

    https://learn.microsoft.com/en-us/visualstudio/ide/reference/generate-equals-gethashcode-methods?view=vs-2019

    Make sure atleast all properties are included that you will use in the table (and can be grouped by).