Search code examples
c#linqfunctional-programmingeither

What signature should my type's SelectMany method have?


I have a simple Either type:

public class Left<L, R> : Either<L, R>
{
    public override string ToString()
    {
        return Left.ToString();
    }
    
    public Left(L left)
    {
        Left = left;
        IsLeft = true;
    }
}

public class Right<L, R> : Either<L, R>
{
    public override string ToString()
    {
        return Right.ToString();
    }
    
    public Right(R right)
    {
        Right = right;
        IsRight = true;
    }
}

public class Either<L, R>
{
    public bool IsLeft {get; protected set;}
    public bool IsRight {get; protected set;}
    
    public L Left {get; protected set;}
    public R Right {get; protected set;}
    
    public Either(L left)
    {
        Left = left;
        IsLeft = true;
    }
    
    public Either(R right)
    {
        Right = right;
        IsRight = true;
    }
    
    public Either()
    {
    }
}

I'd like to use this type with LINQ query syntax. I can implement Select, but I can't quite figure out the SelectMany method signature required. More accurately, this works:

var l = new Either<string, int>("no, this is a left");
var r = new Either<string, int>(2);
    
var ret5 =
    from x in l
    from y in r
    select y;

// returns ret5 as a Left, with the expected message

if I use the following implentation:

public static Either<L, RR> SelectMany<L, R, RR>(
    this Either<L, R> either,     // the source "collection"
    Func<R, Either<L, R>> g,      // why R here? why not L, R. can only be 1 argument. Func<TSource, IEnumerable<TCollection>> collectionSelector,
    Func<L, R, RR> f              // a transform that takes the contained items and converts to the transformed right
)
{
    if (either.IsLeft) return new Left<L, RR>(either.Left);
    
    var inner = g(either.Right);
    
    if (inner.IsLeft) return new Left<L, RR>(inner.Left);
    
    return new Right<L, RR>(f(inner.Left, inner.Right));
}

but I can't mentally correlate it with the documented signature (here from Jon Skeet's Edulinq post)

public static IEnumerable<TResult> SelectMany<TSource, TCollection, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, IEnumerable<TCollection>> collectionSelector,
    Func<TSource, TCollection, TResult> resultSelector)

Is there a general way of understanding what the signature of your SelectMany method should be for classes with multiple generic parameters?

EDIT: The LaYumba library has a similar signature for a Transition example class so .. are the complete set of SelectMany overloads documented somewhere - including those for multi-parameter types?

EDIT: very delayed edit, but the reference source details the signatures for IEnumerable<>.SelectMany, but doesn't tell you why they're needed. You can dig through the compiler code to get there but it's a painful journey.


Solution

  • The suggested SelectMany overload only compiles because the only test case doesn't change the types. That is, in

    var ret5 =
        from x in l
        from y in r
        select y;
    

    both l and r have the same type: Either<string, int>. Thus, while being too constrained, the suggested SelectMany overload is actually enough to make the code compile. It's not flexible enough, however, to map between different types of Eithers.

    TL;DR;

    The correct type of SelectMany is:

    public static Either<L, RR> SelectMany<L, R, T, RR>(
        this Either<L, R> source,
        Func<R, Either<L, T>> k,
        Func<R, T, RR> s)
    

    It may help to forget this particular overload for a little while, since it's a C#-specific weirdness that has no equivalent in other languages that I'm aware of (F# and Haskell).

    Bind

    The standard SelectMany method in C# corresponds to what is usually known as monadic bind. For the Either type in the OP, it'd look like this:

    public static Either<L, RR> SelectMany<L, R, RR>(
        this Either<L, R> source,
        Func<R, Either<L, RR>> selector)
    {
        if (source.IsLeft)
            return new Left<L, RR>(source.Left);
    
        return selector(source.Right);
    }
    

    This overload is much easier to deal with, since it doesn't require that weird extra-step function that the other overload takes.

    To be clear, however, the other overload is required to make query syntax light up, so I'll get back to that later.

    If you can implement a SelectMany method like this one, the type forms a monad. Notice that it only maps the right-hand side, though. This may be easier to see if we also add a Select method.

    Functor

    All monads are also functors. You can always implement Select if you have a lawful SelectMany (monadic bind), and the implementation is entirely automatable:

    public static Either<L, RR> Select<L, R, RR>(
        this Either<L, R> source,
        Func<R, RR> selector)
    {
        return source.SelectMany(r => new Right<L, RR>(selector(r)));
    }
    

    Notice that Select only maps R to RR while L stays fixed. Either is actually not just a single functor, but a family of functors - one for each L. For example, Either<string, R> gives rise to a different functor than Either<int, R>. Either is, however, a bifunctor.

    There's only one type to map: R.

    Query syntax

    These two method are enough if you only need method call syntax, but if you also need query syntax, you're going to need that other SelectMany overload. As far as I can tell, if you already have Select and SelectMany, you can always(?) implement the other overload with the same nested lambda express:

    public static Either<L, RR> SelectMany<L, R, T, RR>(
        this Either<L, R> source,
        Func<R, Either<L, T>> k,
        Func<R, T, RR> s)
    {
        return source.SelectMany(x => k(x).Select(y => s(x, y)));
    }
    

    I actually copy-pasted that expression from a SelectMany method for an entirely different monad (State).

    Notice again that L is fixed. It doesn't partake in any mappings, so it's as though it wasn't there. The 'intermediary' type is here called T.

    Monomorphism

    The reason that the method suggested in the OP compiled is that the C# compiler is actually quite forgiving. If you make the generic types less generic than you have to, it still compiles.

    For example, much to my surprise, I last year discovered that you can also use a more constrained Select method to implement monomorphic functors.

    Encapsulation

    All this said, you should consider adding some encapsulation to the types in question. As the code stands, anyone can add a new class that inherits from Either<L, R> and behaves completely erratically.

    You may, for example, instead consider a Church-encoded Either.