c#, linq, grouping, lazy-evaluation, lazy-sequences

How to get empty groups, lazily


I want to group objects by a boolean value, and I need to always get two groups (one for true, one for false), no matter if there are any elements in them.

The usual approach using GroupBy does not work, as it will only generate nonempty groups. Take e.g. this code:

var list = new List<(string, bool)>();
list.Add(("hello", true));
list.Add(("world", false));
var grouping = list.GroupBy(i => i.Item2);
// GroupBy yields groups in order of first key occurrence: here true, then false.
var allTrue = grouping.First();
var allFalse = grouping.Last();

This only works if there is at least one element per boolean value. If we remove one of the Add lines, allTrue and allFalse both end up holding the one group that still exists, so one of them is wrong. If we remove both lines, Last() even throws an InvalidOperationException at runtime ("Sequence contains no elements").
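The failure modes described above can be reproduced with a short sketch. The class name GroupByPitfallDemo and the helper CountGroups are made up for this illustration and are not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, for illustration only.
public static class GroupByPitfallDemo
{
    // Returns the number of groups GroupBy produces for the given input.
    public static int CountGroups(IEnumerable<(string, bool)> items)
        => items.GroupBy(i => i.Item2).Count();

    public static void Main()
    {
        var onlyTrue = new List<(string, bool)> { ("hello", true) };
        var empty = new List<(string, bool)>();

        Console.WriteLine(CountGroups(onlyTrue)); // 1: there is no group for false
        Console.WriteLine(CountGroups(empty));    // 0: no groups at all

        // With a single group, First() and Last() return a group with the same key.
        var grouping = onlyTrue.GroupBy(i => i.Item2);
        Console.WriteLine(grouping.First().Key == grouping.Last().Key); // True

        // With no groups, Last() throws.
        try
        {
            empty.GroupBy(i => i.Item2).Last();
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Last() threw InvalidOperationException");
        }
    }
}
```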

Note: I want to do this lazily. (Not: Create two empty collections, iterate over the input, fill the collections.)


Solution

  • The .NET platform does not contain a built-in way to produce empty IGroupings. There is no publicly accessible class that implements this interface, so we will have to create one manually:

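    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;

    // A grouping that carries a key but never yields any elements.
    class EmptyGrouping<TKey, TElement> : IGrouping<TKey, TElement>
    {
        public TKey Key { get; }

        public EmptyGrouping(TKey key) => Key = key;

        public IEnumerator<TElement> GetEnumerator()
            => Enumerable.Empty<TElement>().GetEnumerator();

        // The non-generic enumerator forwards to the generic one.
        IEnumerator IEnumerable.GetEnumerator()
            => GetEnumerator();
    }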
    

    In order to check if all required groupings are available, we will need a way to compare them based on their Key. Below is a simple IEqualityComparer implementation for IGroupings:

    public class GroupingComparerByKey<TKey, TElement>
        : IEqualityComparer<IGrouping<TKey, TElement>>
    {
        public bool Equals(IGrouping<TKey, TElement> x, IGrouping<TKey, TElement> y)
            => EqualityComparer<TKey>.Default.Equals(x.Key, y.Key);
    
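        // Hash with the same comparer that Equals uses; unlike obj.Key.GetHashCode(),
        // this also works when the key is null.
        public int GetHashCode(IGrouping<TKey, TElement> obj)
            => EqualityComparer<TKey>.Default.GetHashCode(obj.Key);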
    }
    

    With this infrastructure in place, we can now create a lazy LINQ operator that appends any missing groupings to an enumerable. Let's call it EnsureContains:

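    // Extension methods must be declared in a static class; the class name is an arbitrary choice.
    public static class GroupingExtensions
    {
        public static IEnumerable<IGrouping<TKey, TElement>> EnsureContains<TKey, TElement>(
            this IEnumerable<IGrouping<TKey, TElement>> source, params TKey[] keys)
        {
            // Union is deferred and keeps the first element seen per key, so an
            // existing grouping always wins over its empty placeholder.
            return source
                .Union(keys.Select(key => new EmptyGrouping<TKey, TElement>(key)),
                    new GroupingComparerByKey<TKey, TElement>());
        }
    }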
    

    Usage example:

    var groups = list
        .GroupBy(i => i.Item2)
        .EnsureContains(true, false);
    

    Note: The enumerable produced by the GroupBy operator is lazy, so it is re-evaluated every time it is enumerated. Evaluating this operator is relatively expensive, so it is a good idea to enumerate the result only once, for example by storing it with ToList().
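To tie the pieces together, here is a minimal end-to-end sketch of how the two groups might be retrieved from the result. The three types from the answer are repeated in condensed form so that the example compiles on its own. The EnsureContainsDemo class and its Split helper are hypothetical names introduced only for this illustration:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Repeated from the answer above so that this sketch compiles on its own.
class EmptyGrouping<TKey, TElement> : IGrouping<TKey, TElement>
{
    public TKey Key { get; }
    public EmptyGrouping(TKey key) => Key = key;
    public IEnumerator<TElement> GetEnumerator()
        => Enumerable.Empty<TElement>().GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

class GroupingComparerByKey<TKey, TElement>
    : IEqualityComparer<IGrouping<TKey, TElement>>
{
    public bool Equals(IGrouping<TKey, TElement> x, IGrouping<TKey, TElement> y)
        => EqualityComparer<TKey>.Default.Equals(x.Key, y.Key);
    public int GetHashCode(IGrouping<TKey, TElement> obj)
        => EqualityComparer<TKey>.Default.GetHashCode(obj.Key);
}

static class GroupingExtensions
{
    public static IEnumerable<IGrouping<TKey, TElement>> EnsureContains<TKey, TElement>(
        this IEnumerable<IGrouping<TKey, TElement>> source, params TKey[] keys)
        => source.Union(
            keys.Select(key => new EmptyGrouping<TKey, TElement>(key)),
            new GroupingComparerByKey<TKey, TElement>());
}

// Hypothetical demo class, for illustration only.
static class EnsureContainsDemo
{
    // Splits the input into the strings flagged true and those flagged false,
    // enumerating the grouped sequence exactly once.
    public static (List<string> AllTrue, List<string> AllFalse) Split(
        IEnumerable<(string, bool)> items)
    {
        var groups = items
            .GroupBy(i => i.Item2)
            .EnsureContains(true, false)
            .ToList(); // materialize once to avoid re-running GroupBy

        // Select by key: the order of the groups depends on the input,
        // so First()/Last() would not be reliable here.
        var allTrue = groups.Single(g => g.Key).Select(i => i.Item1).ToList();
        var allFalse = groups.Single(g => !g.Key).Select(i => i.Item1).ToList();
        return (allTrue, allFalse);
    }

    static void Main()
    {
        var (t0, f0) = Split(new List<(string, bool)>());
        Console.WriteLine($"{t0.Count} / {f0.Count}"); // 0 / 0

        var (t1, f1) = Split(new List<(string, bool)> { ("hello", true) });
        Console.WriteLine($"{t1.Count} / {f1.Count}"); // 1 / 0
    }
}
```

Picking the groups with Single(g => g.Key) rather than by position is a deliberate choice in this sketch: existing groups appear in the order GroupBy produced them, and the placeholders for missing keys are appended afterwards, so the position of the true group varies with the input.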