I have two versions of grouping a list of items. The first uses DistinctBy:

    List<m_addtlallowsetup> xlist_distincted = xlist_addtlallowsetups
        .DistinctBy(p => new { p.setupcode, p.allowcode })
        .OrderBy(y => y.setupcode)
        .ThenBy(z => z.allowcode)
        .ToList();

and the second uses GroupBy:

    List<m_addtlallowsetup> grouped = xlist_addtlallowsetups
        .GroupBy(p => new { p.setupcode, p.allowcode })
        .Select(grp => grp.First())
        .OrderBy(y => y.setupcode)
        .ThenBy(z => z.allowcode)
        .ToList();

These two seem to me to produce the same result, but there must be a layman's explanation of how they differ, including their performance and disadvantages.
Let's review the MoreLinq APIs first. The following is the code for DistinctBy:
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

        return _();

        IEnumerable<TSource> _()
        {
            var knownKeys = new HashSet<TKey>(comparer);
            foreach (var element in source)
            {
                if (knownKeys.Add(keySelector(element)))
                    yield return element;
            }
        }
    }
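DistinctBy uses a HashSet<T> internally. It checks only for the first match and returns the first element of type T matching the key. All later elements with that key are ignored, since the key has already been added to the HashSet. The key for each element is the value returned by the Func<TSource, TKey> keySelector.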
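To make the "first match wins" behaviour concrete, here is a minimal sketch of the same loop on a small list of strings. FirstWinsDemo and FirstPerKey are hypothetical names made up for this illustration and are not part of MoreLinq; only HashSet<T>.Add, which returns false when the item is already present, is the real API being relied on.

```csharp
using System;
using System.Collections.Generic;

public static class FirstWinsDemo
{
    // Hypothetical helper mirroring the DistinctBy loop: keep an element
    // only when its key has not been seen before.
    public static List<string> FirstPerKey(IEnumerable<string> source, Func<string, char> keySelector)
    {
        var knownKeys = new HashSet<char>();
        var result = new List<string>();
        foreach (var element in source)
        {
            // HashSet<T>.Add returns false when the key is already present,
            // so later elements with the same key are skipped.
            if (knownKeys.Add(keySelector(element)))
                result.Add(element);
        }
        return result;
    }

    public static void Main()
    {
        var words = new[] { "apple", "avocado", "banana", "blueberry", "cherry" };

        // Key = first letter; only the first word per letter survives.
        var firsts = FirstPerKey(words, w => w[0]);

        Console.WriteLine(string.Join(", ", firsts)); // apple, banana, cherry
    }
}
```

Note that nothing is stored per key except the key itself, which is why this approach needs less memory than building full groups.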
The following is the code for Enumerable.GroupBy:

    public static IEnumerable<IGrouping<TKey, TElement>> GroupBy<TSource, TKey, TElement>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        Func<TSource, TElement> elementSelector)
    {
        return new GroupedEnumerable<TSource, TKey, TElement>(source, keySelector, elementSelector, null);
    }

    internal class GroupedEnumerable<TSource, TKey, TElement> : IEnumerable<IGrouping<TKey, TElement>>
    {
        IEnumerable<TSource> source;
        Func<TSource, TKey> keySelector;
        Func<TSource, TElement> elementSelector;
        IEqualityComparer<TKey> comparer;

        public GroupedEnumerable(IEnumerable<TSource> source, Func<TSource, TKey> keySelector,
            Func<TSource, TElement> elementSelector, IEqualityComparer<TKey> comparer)
        {
            if (source == null) throw Error.ArgumentNull("source");
            if (keySelector == null) throw Error.ArgumentNull("keySelector");
            if (elementSelector == null) throw Error.ArgumentNull("elementSelector");
            this.source = source;
            this.keySelector = keySelector;
            this.elementSelector = elementSelector;
            this.comparer = comparer;
        }

        public IEnumerator<IGrouping<TKey, TElement>> GetEnumerator()
        {
            return Lookup<TKey, TElement>.Create<TSource>(source, keySelector, elementSelector, comparer).GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
GroupBy internally uses a Lookup data structure to group all the data for a given key.

MoreLinq's DistinctBy achieves a small subset of what Enumerable.GroupBy can achieve. If your use case is that specific, use the MoreLinq API.

MoreLinq's DistinctBy would be faster since, unlike Enumerable.GroupBy, DistinctBy doesn't first aggregate all the data and then select the first element for each unique key. The MoreLinq API simply ignores the data beyond the first record for each key.

For a use case like this one, MoreLinq is the better choice.

This is a classic case in LINQ, where more than one API can provide the same result but we need to be wary of the cost, since GroupBy is designed for a much wider task than what you are expecting from DistinctBy.
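The difference can be observed with a rough sketch that counts how many source elements each approach pulls before it can hand back its first result. Everything named here is an assumption for illustration: Setup is a hypothetical stand-in for m_addtlallowsetup, DistinctBySketch is a simplified copy of the DistinctBy loop (not the MoreLinq method itself), and Counted is a made-up wrapper that reports each element read. It shows buffering behaviour only; actual timings would need measuring on your own data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctVsGroupDemo
{
    // Hypothetical stand-in for the question's m_addtlallowsetup type.
    public sealed class Setup
    {
        public string SetupCode { get; }
        public string AllowCode { get; }
        public Setup(string setupCode, string allowCode)
        {
            SetupCode = setupCode;
            AllowCode = allowCode;
        }
    }

    // Simplified stand-in for DistinctBy: streams elements, remembering only keys.
    public static IEnumerable<T> DistinctBySketch<T, TKey>(IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var knownKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            if (knownKeys.Add(keySelector(element)))
                yield return element;
        }
    }

    // Passes elements through unchanged, invoking onPull for each one read.
    public static IEnumerable<T> Counted<T>(IEnumerable<T> source, Action onPull)
    {
        foreach (var item in source)
        {
            onPull();
            yield return item;
        }
    }

    // Number of source elements read before the first distinct element is available.
    public static int PulledByDistinct(IEnumerable<Setup> rows)
    {
        int pulled = 0;
        DistinctBySketch(Counted(rows, () => pulled++), r => new { r.SetupCode, r.AllowCode }).First();
        return pulled;
    }

    // Number of source elements read before the first group's first element is available.
    public static int PulledByGroupBy(IEnumerable<Setup> rows)
    {
        int pulled = 0;
        Counted(rows, () => pulled++)
            .GroupBy(r => new { r.SetupCode, r.AllowCode })
            .Select(g => g.First())
            .First();
        return pulled;
    }

    public static void Main()
    {
        var rows = new List<Setup>
        {
            new Setup("A", "1"), new Setup("A", "1"),
            new Setup("B", "2"), new Setup("A", "3"),
        };

        Console.WriteLine(PulledByDistinct(rows)); // 1: the first element is yielded immediately
        Console.WriteLine(PulledByGroupBy(rows));  // 4: the whole source is grouped first
    }
}
```

In the question's code the trailing OrderBy forces both pipelines to read the full source anyway, so the practical difference there is the extra per-group storage that GroupBy builds and then mostly discards.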