Search code examples
scalarecursionfunctional-programminglambda-calculusy-combinator

Understanding the implementation of Y-Combinator


I would like to understand in mint detail please how we managed to get from the lambda calculus expression of Y-combinator :

Y = λf.(λx.f (x x)) (λx.f (x x))

to the following implementation (in Scala) :

def Y[A, B](f: (A => B) => A => B): A => B = (x: A) => f(Y(f))(x)

I am pretty new to functional programming but I have a decent understanding of lambda calculus and how the substitution process works. I am having however some hard time understanding how did we get from the formal expression to the implementation.

Besides, I would love to know how to tell the type and number of arguments of my function and its return type of whatever lambda?


Solution

  • Note that what you wrote is not an implementation of the Y combinator. The "Y combinator" is a specific "fixed-point combinator" in the λ-calculus. A "fixed-point" of a term g is just a point x such that,

    g(x) = x 
    

    A "fixed-point combinator" F is a term that can be used to "produce" fix points. That is, such that,

    g(F(g)) = F(g)
    

    The term Y = λf.(λx.f (x x)) (λx.f (x x)) is one among many that obeys that equation, i.e. it is such that g(Y(g)) = Y(g) in the sense that one term can be reduced to the other. This property means such terms, including Y can be used to "encode recursion" in the calculus.

    Regarding typing note that the Y combinator cannot be typed in the simply typed λ-calculus. Not even in polymorphic calculus of system F. If you try to type it, you get a type of "infinite depth". To type it, you need recursion at the type level. So if you want to extend e.g. simply typed λ-calculus to a small functional programming language you provide Y as a primitive.

    You're not working with λ-calculus though, and your language already comes with recursion. So what you wrote is a straightforward definition for fixed-point "combinator" in Scala. Straightforward because being a fixed-point follows (almost) immediately from the definition.

    Y(f)(x) = f(Y(f))(x)
    

    is true for all x (and it is a pure function) therefore,

    Y(f) = f(Y(f))
    

    which is the equation for fixed-points. Regarding the inference for the type of Y consider the equation Y(f)(x) = f(Y(f))(x) and let,

    f : A => B
    Y : C => D 
    

    since Y : C => D takes f : A => B as an input then,

    C = A => B
    

    since Y f : D is an input of f : A => B then

    D = A
    

    and since the output Y f : D is the same as that of f(Y(f)) : B then

    D = B
    

    thus far we have,

    Y : (A => A) => A 
    

    since Y(f) is applied to x, Y(f) is a function, so

    A = A1 => B1 
    

    for some types A1 and B1 and thus,

    Y : ((A1 => B1) => (A1 => B1)) => A1 => B1