Search code examples
haskelltypesfoldlambda-calculus

Haskell dependent, independent variables in lambda function as applied to foldr


Given

> foldr (+) 5 [1,2,3,4]
15

this second version

foldr (\x n -> x + n) 5 [1,2,3,4]

also returns 15. The first thing I don't understand about the second version is how foldr knows which variable is associated with the accumulator-seed 5 and which with the list variable's elements [1,2,3,4]. In the lambda calculus way, x would seem to be the dependent variable and n the independent variable. So if this

foldr            :: (a -> b -> b) -> b -> [a] -> b
foldr _ z []     =  z
foldr f z (x:xs) =  f x (foldr f z xs)

is foldr and these

:type foldr
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b

:t +d foldr
foldr :: (a -> b -> b) -> b -> [a] -> b

its type declarations, can I glean, deduce the answer to "which is dependent and which is independent" from the type declaration itself? It would seem both examples of foldr above must be doing this

(+) 1 ((+) 2 ((+) 3 ((+) 4 ((+) 5 0))))

I simply guessed the second, lambda function version above, but I don't really understand how it works, whereas the first version with (+) breaks down as shown directly above.

Another example would be this

length' = foldr (const (1+)) 0

where, again, const seems to know to "throw out" the incoming list elements and simply increment, starting with the initial accumulator value. This is the same as

length' = foldr (\_ acc -> 1 + acc) 0

where, again, Haskell knows which of foldr's second and third arguments -- accumulator and list -- to treat as the dependent and independent variable, seemingly by magic. But no, I'm sure the answer lies in the type declaration (which I can't decipher, hence, this post), as well as the lore of lambda calculus, of which I'm a beginner.

Update

I've found this

reverse = foldl (flip (:)) []

and then applying to a list

foldl (flip (:)) [] [1,2,3]
foldl (flip (:)) (1:[]) [2,3]
foldl (flip (:)) (2:1:[]) [3]
foldl (flip (:)) (3:2:1:[]) []
. . .

Here it's obvious that the order is "accumulator" and then list, and flip is flipping the first and second variables, then subjecting them to (:). Again, this

reverse = foldl (\acc x -> x : acc) []
foldl (\acc x -> x : acc) [] [1,2,3]
foldl (\acc x -> x : acc) (1:[]) [1,2,3]
. . .

seems also to rely on order, but in the example from further above

length' = foldr (\_ acc -> 1 + acc) 0
foldr (\_ acc -> 1 + acc) 0 [1,2,3]

how does it know 0 is the accumulator and is bound to acc and not the first (ghost) variable? So as I understand (the first five pages of) lambda calculus, any variable that is "lambda'd," e.g., \x is a dependent variable, and all other non-lambda'd variables are independent. Above, the \_ is associated with [1,2,3] and the acc, ostensibly the independent variable, is 0; hence, order is not dictating assignment. It's as if acc was some keyword that when used always binds to the accumulator, while x is always talking about the incoming list members.

Also, what is the "algebra" in the type definition where t a is transformed to [a]? Is this something from category theory? I see

Data.Foldable.toList :: t a -> [a]

in the Foldable definition. Is that all it is?


Solution

  • By "dependent" you most probably mean bound variable.

    By "independent" you most probably mean free (i.e. not bound) variable.

    There are no free variables in (\x n -> x + n). Both x and n appear to the left of the arrow, ->, so they are named parameters of this lambda function, bound inside its body, to the right of the arrow. Being bound means that each reference to n, say, in the function's body is replaced with the reference to the corresponding argument when this lambda function is indeed applied to its argument(s).

    Similarly both _ and acc are bound in (\_ acc -> 1 + acc)'s body. The fact that the wildcard is used here, is immaterial. We could just have written _we_dont_care_ all the same.

    The parameters in lambda function definition get "assigned" (also called "bound") the values of the arguments in an application, purely positionally. The first argument will be bound / assigned to the first parameter, the second argument - to the second parameter. Then the lambda function's body will be entered and further reduced according to the rules.

    This can be seen a bit differently stating that actually in lambda calculus all functions have only one parameter, and multi-parameter functions are actually nested uni-parameter lambda functions; and that the application is left-associative i.e. nested to the left.

    What this actually means is quite simply

     (\ x n -> x + n)  5  0
    =
     (\ x -> (\ n -> x + n))  5  0
    =
    ((\ x -> (\ n -> x + n))  5)  0
    =
             (\ n -> 5 + n)      0
    =
                     5 + 0
    

    As to how Haskell knows which is which from the type signatures, again, the type variables in the functional types are also positional, with first type variable corresponding to the type of the first expected argument, the second type variable to the second expected argument's type, and so on.

    It is all purely positional.

    Thus, as a matter of purely mechanical and careful substitution, since by the definition of foldr it holds that foldr g 0 [1,2,3] = g 1 (foldr g 0 [2,3]) = ... = g 1 (g 2 (g 3 0)), we have

    foldr (\x n -> x + n) 0 [1,2,3]
    =
          (\x n -> x + n) 1 ( (\x n -> x + n) 2 ( (\x n -> x + n) 3 0 ))
    =
    (\x -> (\n -> x + n)) 1 ( (\x n -> x + n) 2 ( (\x n -> x + n) 3 0 ))
    =
           (\n -> 1 + n)    ( (\x n -> x + n) 2 ( (\x n -> x + n) 3 0 ))
    =
                  1 +      ( (\x n -> x + n)  2 ( (\x n -> x + n) 3 0 ))
    =
                  1 +    ( (\x (\n -> x + n)) 2 ( (\x n -> x + n) 3 0 ))
    =
                  1 +          (\n -> 2 + n)    ( (\x n -> x + n) 3 0 )
    =
                  1 +                (2 +        (\x n -> x + n)  3 0 )
    =
                  1 +                (2 +   (\x -> (\n -> x + n)) 3 0 )
    =
                  1 +                (2 +          (\n -> 3 + n)    0 )
    =
                  1 +                (2 +               ( 3 +       0))
    

    In other words, there is absolutely no difference between (\x n -> x + n) and (+).

    As for that t in foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b, what that means is that given a certain type T a, if instance Foldable T exists, then the type becomes foldr :: (a -> b -> b) -> b -> T a -> b, when it's used with a value of type T a.

    One example is Maybe a and thus foldr (g :: a -> b -> b) (z :: b) :: Maybe a -> b.

    Another example is [] a and thus foldr (g :: a -> b -> b) (z :: b) :: [a] -> b.


    (edit:) So let's focus on lists. What does it mean for a function foo to have that type,

    foo :: (a -> b -> b) -> b -> [a] -> b
    

    ? It means that it expects an argument of type a -> b -> b, i.e. a function, let's call it g, so that

    foo :: (a -> b -> b) -> b -> [a] -> b
    g   ::  a -> b -> b
    -------------------------------------
    foo    g             :: b -> [a] -> b
    

    which is itself a function, expecting of some argument z of type b, so that

    foo :: (a -> b -> b) -> b -> [a] -> b
    g   ::  a -> b -> b
    z   ::                  b
    -------------------------------------
    foo    g                z :: [a] -> b
    

    which is itself a function, expecting of some argument xs of type [a], so that

    foo :: (a -> b -> b) -> b -> [a] -> b
    g   ::  a -> b -> b
    z   ::                  b
    xs  ::                       [a]
    -------------------------------------
    foo    g                z     xs :: b
    

    And what could such function foo g z do, given a list, say, [x] (i.e. x :: a, [x] :: [a])?

    foo g z [x] = b where
    

    We need to produce a b value, but how? Well, g :: a -> b -> b produces a function b -> b given an value of type a. Wait, we have that!

       f = g x     -- f :: b -> b
    

    and what does it help us? Well, we have z :: b, so

       b = f z
    

    And what if it's [] we're given? We don't have any as then at all, but we have a b type value, z -- so instead of the above we'd just define

       b = z
    

    And what if it's [x,y] we're given? We'll do the same f-building trick, twice:

       f1 = g x     -- f1 :: b -> b
       f2 = g y     -- f2 :: b -> b
    

    and to produce b we have many options now: it's z! or maybe, it's f1 z!? or f2 z? But the most general thing we can do, making use of all the data we have access to, is

       b = f1 (f2 z)
    

    for a right-fold (...... or,

       b = f2 (f1 z)
    

    for a left).

    And if we substitute and simplify, we get

    foldr g z [] = z
    foldr g z [x] = g x z         --  =  g x (foldr g z [])
    foldr g z [x,y] = g x (g y z)     --  =  g x (foldr g z [y])
    foldr g z [x,y,w] = g x (g y (g w z))  --  =  g x (foldr g z [y,w])
    

    A pattern emerges.

    Etc., etc., etc.


    A sidenote: b is a bad naming choice, as is usual in Haskell. r would be much much better -- a mnemonic for "recursive result".

    Another mnemonic is the order of g's arguments: a -> r -> r suggests, nay dictates, that a list's element a comes as a first argument; r the recursive result comes second (the Result of Recursively processing the Rest of the input list -- recursively, thus in the same manner); and the overall result is then produced by this "step"-function, g.

    And that's the essence of recursion: recursively process self-similar sub-part(s) of the input structure, and complete the processing by a simple single step:

               a                         a
                 :                         `g`
                   [a]                         r
              -------------             -------------
                       [a]                         r
    
          [a]
         a   [a]
        --------
        (x : xs)       -> r 
             xs -> r
       ----------------------
       ( x  ,      r ) -> r        --- or, equivalently,   x -> r -> r