function haskell recursion arguments operator-precedence

Haskell: arrow precedence with function arguments

I'm a relatively experienced Haskell programmer with a few hours of experience, so the answer might be obvious.

After watching A taste of Haskell, I got lost when Simon explained how the append (++) function really works with its arguments.

So, here's the part where he talks about this.

First, he says that (++) :: [a] -> [a] -> [a] can be understood as a function which gets two lists as arguments, and returns a list after the last arrow). However, he adds that actually, something like this happens: (++) :: [a] -> ([a] -> [a]), the function takes only one argument and returns a function.

I'm not sure to understand how the returned function closure gets the first list as it expects one argument as well.

On the next slide of the presentation, we have the following implementation:

(++) :: [a] -> [a] -> [a]

[]     ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)

If I think that (++) receives two arguments and return a list, this piece of code along with the recursion is clear enough.

If we consider that (++) receives only one argument and returns a list, where does ys come from? Where is the returned function ?

Solution

The trick to understanding this is that all haskell functions only take 1 argument at most, it's just that the implicit parentheses in the type signature and syntax sugar make it appear as if there are more arguments. To use ++ as an example, the following transformations are all equivalent

xs ++ ys = ...
(++) xs ys = ...
(++) xs = \ys -> ...
(++) = \xs -> (\ys -> ...)
(++) = \xs ys -> ...

Another quick example:

doubleList :: [Int] -> [Int]
doubleList = map (*2)

Here we have a function of one argument doubleList without any explicit arguments. It would have been equivalent to write

doubleList x = map (*2) x

Or any of the following

doubleList = \x -> map (*2) x
doubleList = \x -> map (\y -> y * 2) x
doubleList x = map (\y -> y * 2) x
doubleList = map (\y -> y * 2)

The first definition of doubleList is written in what is commonly called point-free notation, so called because in the mathematical theory backing it the arguments are referred to as "points", so point-free is "without arguments".

A more complex example:

func = \x y z -> x * y + z
func = \x -> \y z -> x * y + z
func x = \y z -> x * y + z
func x = \y -> \z -> x * y + z
func x y = \z -> x * y + z
func x y z = x * y + z

Now if we wanted to completely remove all references to the arguments we can make use of the . operator which performs function composition:

func x y z = (+) (x * y) z    -- Make the + prefix
func x y = (+) (x * y)        -- Now z becomes implicit
func x y = (+) ((*) x y)      -- Make the * prefix
func x y = ((+) . ((*) x)) y  -- Rewrite using composition
func x = (+) . ((*) x)        -- Now y becomes implicit
func x = (.) (+) ((*) x)      -- Make the . prefix
func x = ((.) (+)) ((*) x)    -- Make implicit parens explicit
func x = (((.) (+)) . (*)) x  -- Rewrite using composition
func = ((.) (+)) . (*)        -- Now x becomes implicit
func = (.) ((.) (+)) (*)      -- Make the . prefix

So as you can see there are lots of different ways to write a particular function with a varying number of explicit "arguments", some of which are very readable (i.e. func x y z = x * y + z) and some which are just a jumble of symbols with little meaning (i.e. func = (.) ((.) (+)) (*))