Search code examples
linqasp.net-coreentity-framework-coreexpression-treesdynamic-expression

Building dynamic expression to compare between two table


I have two tables, table A and table B which have the same scheme.

I want to compare both tables from a different contexts. If the record in table B does not exist in table A, then it will insert it into the A context.

What I have to figure out the single type of table look alike.

var a = context.Set<T>().AsEnumerable();
var b = context2.Set<T>().AsEnumerable();

var NotExistsOnA = b.Where(bRecord => 
                          !a.Any(aRecord =>
                                 bRecord.xxxId.Equals(aRecord.xxxId))).ToList();

How can I create dynamic expression to make the code reusable because there a few table will be used to compare. I am stuck on the how to handle the xxxId part as different table will have different primary key name.

Any help is greatly appreciated.


Solution

  • EF Core metadata services can be used to obtain information about primary key of an entity, like

    var entityType = context.Model.FindEntityType(typeof(T));
    var primaryKey = entityType.FindPrimaryKey();
    

    Then this information can be used to build dynamically equality comparer:

    public static Expression<Func<T, T, bool>> GetPrimaryKeyCompareExpression<T>(this DbContext context)
    {
        var entityType = context.Model.FindEntityType(typeof(T));
        var primaryKey = entityType.FindPrimaryKey();
        var first = Expression.Parameter(typeof(T), "first");
        var second = Expression.Parameter(typeof(T), "second");
        var body = primaryKey.Properties
            .Select(p => Expression.Equal(
                Expression.Property(first, p.PropertyInfo),
                Expression.Property(second, p.PropertyInfo)))
            .Aggregate(Expression.AndAlso); // handles composite PKs
        return Expression.Lambda<Func<T, T, bool>>(body, first, second);
    }
    
    

    which in turn can be compiled to delegate and used inside the LINQ to Objects (since you are using IEnumerables) query, e.g.

    var setA = context.Set<T>().AsEnumerable();
    var setB = context2.Set<T>().AsEnumerable();
    
    var comparePKs = context.GetPrimaryKeyCompareExpression<T>().Compile();
    var notExistsOnA = setB
        .Where(b => !setA.Any(a => comparePKs(a, b)))
        .ToList();
    

    But note that in LINQ to Objects, !Any(...) inside query criteria is inefficient, as it it linear search operation, so the resulting time complexity is quadratic (O(Na * Nb), so you'll have performance issues with bigger data sets.

    So in general it would be better to use join operator. But instead of comparison, it needs key selector, which also can be built similar to the above, but this time emitting Tuple instances (in order to handle composite PKs), e.g.

    public static Expression<Func<T, object>> GetPrimaryKeySelector<T>(this DbContext context)
    {
        var entityType = context.Model.FindEntityType(typeof(T));
        var primaryKey = entityType.FindPrimaryKey();
        var item = Expression.Parameter(typeof(T), "item");
        var body = Expression.Call(
            typeof(Tuple), nameof(Tuple.Create),
            primaryKey.Properties.Select(p => p.ClrType).ToArray(),
            primaryKey.Properties.Select(p => Expression.Property(item, p.PropertyInfo)).ToArray()
        );
        return Expression.Lambda<Func<T, object>>(body, item);
    }
    
    

    and could be used as follows

    var selectPK = context.GetPrimaryKeySelector<T>().Compile();
    var notExistsOnA = setB
        .GroupJoin(setA, selectPK, selectPK, (b, As) => (b, As))
        .Where(r => !r.As.Any())
        .Select(r => r.b)
        .ToList();