Search code examples
c#sql-serverlistentity-framework-coreexcept

Use ExceptBy set operation with list of Keyselectors instead of single selector in C#


I am working on a project where I have an Excel sheet with some data. I am converting it into list and mapping it into another list which is derived from table in a database using entity framework core. I want to compare the two lists and store the records present in first list (database) but not in second list (Excel) into another list object. But the problem here is I don't have single field to relay on i.e. I don't have primary key so I created a candidate key with 4 fields (this number may change based on the list I send). I used ExceptBy but it works with one primary key as keyselector, I have to change it to accept multiple key selectors. And number of records in my list vary from 1,00,000 to 1,000,000. I don't know how to pass candidate key to exceptby function. This is what I tried.

public static IEnumerable<TSource> ExceptBy<TSource>(
    this IEnumerable<TSource> source,
    IEnumerable<TSource> other,
    List<string> keyNames,
    IEqualityComparer<string> comparer = null)
{
    comparer = comparer ?? EqualityComparer<string>.Default;

    var sets = keyNames.Select(keyName => new HashSet<string>(other.Select(item => 
    GetPropertyValue(item, keyName)), comparer)).ToList();

    foreach (var item in source)
    {
        var keys = keyNames.Select(keyName => GetPropertyValue(item, 
        keyName)).ToList();

        if (sets.All(set => set.Add(keys[sets.IndexOf(set)])))
        {
            yield return item;
        }
    }
}

private static string GetPropertyValue<TSource>(TSource source, string propertyName)
{
    var property = typeof(TSource).GetProperty(propertyName);
    return property != null ? property.GetValue(source)?.ToString() : null;
}

And I use it like this

List primaryKeyName = new List();

var valuestobedeleted = mappedvalue.ExceptBy(users, primaryKeyName, StringComparer.OrdinalIgnoreCase);

I don't get proper output here. How to solve this. Is there any other way.

Thank you!!!


Solution

  • This is simple solution, which works using concatenation of key fields with | separator and string.Format method. Concatenation made using compiled function, which should be fast enough.

    public static class CollectionUtils
    {
        public static IEnumerable<TSource> ExceptBy<TSource>(
            this IEnumerable<TSource>  source,
            IEnumerable<TSource>       other,
            IEnumerable<string>        keyNames,
            IEqualityComparer<string>? comparer = null)
        {
            var keyExtractor = GenerateKeyFunction<TSource>(keyNames);
    
            var query = source
                .GroupJoin(other, s => keyExtractor(s), o => keyExtractor(o), (s, gj) => new { s, gj }, comparer)
                .Where(t => !t.gj.Any())
                .Select(t => t.s);
    
            return query;
        }
    
        static MethodInfo _formatMethodInfo =
            typeof(string).GetMethod(nameof(string.Format), new Type[] { typeof(string), typeof(object[]) }) ??
            throw new InvalidOperationException("Format method not found");
    
        static Func<T, string> GenerateKeyFunction<T>(IEnumerable<string> propertyNames)
        {
            var entityParam = Expression.Parameter(typeof(T), "e");
    
            var args = propertyNames.Select(p => (Expression)Expression.PropertyOrField(entityParam, p))
                .ToList();
    
            var formatStr = string.Join("|", args.Select((_, idx) => $"{{{idx}}}"));
    
            var argsParam = Expression.NewArrayInit(typeof(object), args);
    
            var body     = Expression.Call(_formatMethodInfo, Expression.Constant(formatStr), argsParam);
            var lambda   = Expression.Lambda<Func<T, string>>(body, entityParam);
            var compiled = lambda.Compile();
    
            return compiled;
        }
    }