
Point-free implementation of "any" using folds: why do some strategies work and others not?


tl;dr:

There is an exercise in "Haskell Programming from First Principles" that asks the reader to re-implement the any function as myAny :: (a -> Bool) -> [a] -> Bool using folds, and in a point-free style (where possible).

I was stuck on this and was going to ask a question, but in the process of typing up my original question I came up with the following solution:

myAny = curry $ foldr (||) False . uncurry map

I'm struggling to understand the greater concept behind why this works, and why my other attempts failed.

I want to note that I do understand that all functions in Haskell are curried; I may use phrases like "function x has applied all its arguments" loosely.

Specifically, my questions are:

Question 1:

Why can

myAny = foldr (||) False . map

not somehow be told to "delay composition until map has all arguments fully applied" without the use of the curry/uncurry pattern (in the sense of why Haskell/GHC/computation-in-general does not support this)?

Question 2:

Does the curry/uncurry pattern have a name? It seems like this might have something to do with "combinators", but I am only vaguely familiar with them. Additional resources would be helpful as I continue my journey through HPFFP!

Question 3:

A second strategy I took was to place the predicate of "any", f :: a -> Bool, within the folding function. The furthest I got was

myAny f = foldr ((||) . f) False

I am not certain that I can proceed any further towards point-free with this strategy; I don't think there is any way to "pass" in the argument so that I can do something like

-- This does not work; ((||) . ) parses as an operator section, but the expression does not type-check
myAny = foldr ((||) . ) False

that actually works. I've tried a few different combinations of curry/uncurrying both the foldr function and ((.) (||)), but I can't quite seem to get the types to match up.

I thought for a moment that if I had a way to "curry" the type signature of foldr itself, so it became

foldr' :: Foldable t => ((a, b) -> b) -> b -> t a -> b
foldr' f acc l = foldr (curry f) acc l

that there might have been a path forward, but I was not successful. It seems that there is a hangup with trying to tell foldr that "the function I'm passing you needs more arguments before it is ready".

Is there a way to make this particular strategy work, or am I running into a fundamental limitation somewhere?

Question 4:

It seems like both of these problems boil down to a statement along the lines of

There is no way to indicate that a function should not be passed to its caller until "all of its arguments are applied",

noting that, in Haskell and the lambda calculus, all functions take a single argument. Currying/decurrying seems to alleviate that in some cases, but not others. In particular, the first strategy worked because it

  • passed a function to transform the data
  • composed with another function that was "ready to go"

while the second strategy tried to pass a partially applied higher-order function to a function that expected a higher order function (albeit of a different order and type), and so could not distinguish whether the function was "ready to go" or not.

Is this intuition correct? Are there any tools (either conceptual or for Haskell itself) that I can use to help me (quickly) see when I'll run up against this sort of struggle? If it is indeed the case that my second strategy will not pan out, I am failing to see a pattern in the strategies that give me some a priori indication that what I am trying to do is impossible.


Long Version

Some background about myself: I am quite new to functional programming, and especially Haskell. I have a decent background in mathematics, some imperative/object-oriented programming experience in python, and a small amount of general computer science knowledge. I am picking up some category theory and feel fairly comfortable with the very basics.

I am working through "Haskell Programming from First Principles". Up until this point (chapter 10 in the version I have), it has covered the basics of the lambda calculus, types, typeclasses, syntax, recursion, list methods, and now folds. I feel fairly comfortable with these both theoretically and in practice (via the exercises).

In the chapter on folds, there is a set of exercises to rewrite standard functions via folds, and to do so in a point-free style where possible. The book demonstrates a number of intermediate versions that should help to converge on a final point-free version using folds. The example given is for the "and" function:

myAnd :: [Bool] -> Bool

-- VERSION 1: 
-- direct recursion, not using (&&)
myAnd [] = True
myAnd (x:xs) =
  if x == False
  then False
  else myAnd xs

-- VERSION 2
-- direct recursion, using (&&)
myAnd [] = True
myAnd (x:xs) = x && myAnd xs

-- VERSION 3:
-- fold, not point-free
-- in the folding function
myAnd = foldr
        (\a b ->
          if a == False
          then False
          else b) True

-- VERSION 4
-- fold, both myAnd and the folding
-- function are point-free now
myAnd = foldr (&&) True

The question I am stuck on is doing the same for the "any" function. I've gotten the first 2 versions down:

-- VERSION 1:
-- direct recursion
myAny :: (a -> Bool) -> [a] -> Bool
myAny f [] = False
myAny f (x:xs) =
  if f x == True
  then True
  else myAny f xs

-- Version 2:
-- direct recursion with (||)
myAny f [] = False
myAny f (x:xs) = f x || myAny f xs

But I am stuck with the point-free versions. I see at least two strategies.

The first is applying the a -> Bool function of "myAny" within the fold function. I have four working versions, each getting closer to point-free:

myAny1 :: (a -> Bool) -> [a] -> Bool

-- All of these appear to work
myAny1 f l = foldr (\x y -> f x || y) False l
myAny1 f = foldr (\x y -> f x || y) False
myAny1 f = foldr ((||) . f) False
myAny1 = \f -> foldr ((||) . f) False

But I am struggling to eliminate the f argument from the lambda of the last version.

I have not yet convinced myself whether or not this is even possible due to currying, etc. My intuition tells me that it is not; I cannot simply pass a partially applied function such as g = (.) (||) to foldr, because the arguments do not "fall out" from g. I.e.,

(foldr ((.) (||)) False) f l

is not saying that f should first be applied to the "partial composition" as ((.) (||) f); the above expression actually does not even type-check, because foldr has already been applied to two arguments and expects only a list. Thus, I think that this strategy requires a lambda term to "pass the argument in" and define the folding function, and I don't see a way to avoid that.

After experimenting with the type system and currying/uncurrying, I have a hunch that there is a fundamental blockage in that the type signature for foldr is itself curried. I.e., it is

foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b 

rather than

foldr' :: Foldable t => ((a, b) -> b) -> b -> t a -> b
foldr' f acc l = foldr (curry f) acc l

but I am fairly certain that even using foldr' instead of foldr still prevents me from successfully making this point-free -- I still have no way of letting foldr "know" that the function I'm passing it is only "partially partially applied", and I have a feeling that no amount of currying or uncurrying will rectify that.


My second strategy is to map the function f over the list and then apply or, or to keep in the spirit of the exercises, re-implement or via a fold.

My first attempt was:

-- First map `f` over `l`, then apply the fold
-- This works; makes sense so far
myAny2 f l = foldr (||) False $ map f l

My second attempt was:

-- Try to eliminate the argument
-- Does not work; makes sense why
myAny2 f = foldr (||) False $ map f

This doesn't work, because map f is taken as the last argument to foldr. But map f is not a list: it is a function that takes a list and returns a list. The fix is to change the $ to a .:

-- Works:
myAny2 f = foldr (||) False . map f

Now, I would like to be able to just drop the f from both sides of the above definition, as in

myAny2 = foldr (||) False . map

The rationale being that map consumes our two arguments (f and the list); not passing them in means that the function composition operator . is not evaluated, since function application is higher precedence than composition. Thus, when two arguments are passed to the above version of myAny2, they are first applied to map, evaluated, and then fed into the foldr.

But this understanding is incorrect, because the following error is thrown by GHC:

Ch10ex.hs:28:10: error:
    • Couldn't match type ‘Bool’ with ‘[a] -> Bool’
      Expected type: (a -> Bool) -> [a] -> Bool
        Actual type: (a -> Bool) -> Bool
    • Possible cause: ‘(.)’ is applied to too many arguments
      In the expression: foldr (||) False . map
      In an equation for ‘myAny2’: myAny2 = foldr (||) False . map
    • Relevant bindings include
        myAny2 :: (a -> Bool) -> [a] -> Bool (bound at Ch10ex.hs:28:1)
   |
28 | myAny2 = foldr (||) False . map 
   |          ^^^^^^^^^^^^^^^^^^^^^^

Ch10ex.hs:28:29: error:
    • Couldn't match type ‘[Bool]’ with ‘Bool’
      Expected type: (a -> Bool) -> [a] -> Bool
        Actual type: (a -> Bool) -> [a] -> [Bool]
    • In the second argument of ‘(.)’, namely ‘map’
      In the expression: foldr (||) False . map
      In an equation for ‘myAny2’: myAny2 = foldr (||) False . map
   |
28 | myAny2 = foldr (||) False . map 
   |                             ^^^

Looking at the types and desugaring, I think I can see why this doesn't work. We have:

map :: (a -> b) -> [a] -> [b]
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
foldr (||) False :: Foldable t => t Bool -> Bool
(.) :: (b -> c) -> (a -> b) -> a -> c

So the above actually becomes

(.) (foldr (||) False) map

which means that, in the type signature (.) :: (b -> c) -> (a -> b) -> a -> c, we must have that b :: t Bool, c :: Bool, and a ::... well, a doesn't work. It's curried, so it doesn't fit the signature. I think this is why the first error shows up; the currying on "map" means that only a single argument gets applied, and then it tries to compose the function with the partially-applied fold function, and obviously cannot. I think the second error comes from trying to propagate this type checking to the map function; the type checker realizes that if map had the type signature map :: (a -> Bool) -> [a] -> Bool, then everything could work (maybe?).
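
One way to probe this reading of the types is the sketch below. It pins (.) to the fold as its first argument and writes map's type with explicit parentheses. The names afterOr, mapExplicit and myAnyViaAfterOr are hypothetical, introduced only for illustration, and the fold is specialised to lists.

```haskell
-- (.) with its first argument fixed to the fold. Whatever is composed
-- on the right must produce a [Bool] after exactly ONE application.
afterOr :: (a -> [Bool]) -> a -> Bool
afterOr = (.) (foldr (||) False)

-- map with the implicit parentheses written out: one application
-- yields a function, not a list.
mapExplicit :: (x -> y) -> ([x] -> [y])
mapExplicit = map

-- `afterOr mapExplicit` is rejected by the type checker, because the
-- result type [x] -> [y] cannot be unified with [Bool].
-- Supplying the predicate first leaves a one-argument function that fits:
myAnyViaAfterOr :: (a -> Bool) -> [a] -> Bool
myAnyViaAfterOr f = afterOr (map f)
```

Under this view the composition operator is never "waiting" for map to collect its arguments; it only ever sees the result of applying map once.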

Nonetheless, if we uncurry map, then we can make the expression type check on its own:

foldr (||) False . uncurry map :: (a -> Bool, [a]) -> Bool

But this is not quite the expected type signature for myAny. So we curry again, and everything works:

-- Final product, working:
myAny2 = curry $ foldr (||) False . uncurry map

All well and good.

Does this curry/uncurry pattern have a name? On a practical level, it seems that what the "uncurry" accomplished was to transform the "2-parameter" function map into a function that took a single 2-tuple as a parameter (so that it could be composed with a function that needed a "single parameter"), and then re-curried to make the given type signature fit.


Solution

  • The reason your first example doesn’t work is that it ends up trying to pass map f as an argument to foldr:

    myAny = foldr (||) False . map
    =
    myAny f = (foldr (||) False . map) f
    =
    myAny f = foldr (||) False (map f)
    

    You need another level of composition to “map under” the extra argument. I prefer to write this with fmap to evoke that mnemonic, but it’s equivalent to (.). (This is using instance Functor ((->) x).)

    myAny = fmap (foldr (||) False) . map
    =
    myAny f = (fmap (foldr (||) False) . map) f
    =
    myAny f = fmap (foldr (||) False) (map f)
    =
    myAny f = foldr (||) False . map f
    =
    myAny f x = (foldr (||) False . map f) x
    =
    myAny f x = foldr (||) False (map f x)
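
As a rough check of the derivation above, the following sketch writes the same definition twice, once with fmap and once with a (.) section. The names myAnyFmap and myAnyDot are hypothetical, and the type signatures specialise the fold to lists.

```haskell
-- fmap on functions (instance Functor ((->) x)) is post-composition,
-- so it applies the fold to the RESULT of `map f`.
myAnyFmap :: (a -> Bool) -> [a] -> Bool
myAnyFmap = fmap (foldr (||) False) . map

-- The same definition with fmap replaced by an operator section of (.).
myAnyDot :: (a -> Bool) -> [a] -> Bool
myAnyDot = (foldr (||) False .) . map
```

Both should agree with the Prelude's any on finite lists; the fmap spelling is only a readability choice.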
    

    The curry/uncurry pattern doesn’t really have a name that I know of, but the general strategy is related to the principles of programming with Arrows. In this case, instead of using fmap to “dip under” an argument, you’re using uncurry to group two arguments together into one so you can make a linear pipeline with (.), then curry ungroups them again:

                                      map  ::  (a -> b) ->    [a]  -> [b]
                              uncurry map  :: ( a -> b,       [a]) -> [b]
           foldr (||) False                ::                         [Bool] -> Bool
           foldr (||) False . uncurry map  :: ( a -> Bool,    [a])           -> Bool
    curry (foldr (||) False . uncurry map) ::  (a -> Bool) -> [a]            -> Bool
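
The type table above can also be read as three named stages. This is a minimal sketch; the names mapPair, anyPair and myAnyCurried are hypothetical and not part of the original answer.

```haskell
-- Stage 1: group the predicate and the list into one pair argument.
mapPair :: (a -> Bool, [a]) -> [Bool]
mapPair = uncurry map

-- Stage 2: with a single (paired) argument, plain (.) composes directly.
anyPair :: (a -> Bool, [a]) -> Bool
anyPair = foldr (||) False . mapPair

-- Stage 3: ungroup the pair to recover the usual two-argument shape.
myAnyCurried :: (a -> Bool) -> [a] -> Bool
myAnyCurried = curry anyPair
```

Naming each stage makes the pairing and unpairing visible without changing what is computed.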
    

    You can eliminate the f in this code:

    myAny f = foldr ((||) . f) False
    

    By rewriting to place the f in a position where you can eta-reduce:

    myAny f = foldr ((||) . f) False
    =
    myAny f = flip foldr False ((||) . f)
    =
    myAny f = flip foldr False ((.) (||) f)
    =
    myAny f = flip foldr False (fmap (||) f)
    =
    myAny f = (flip foldr False . fmap (||)) f
    =
    myAny = flip foldr False . fmap (||)
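
For reference, a sketch of the final line of that derivation with an explicit signature, next to the equivalent spelling that uses a (.) section instead of fmap. The names myAnyFlip and myAnyFlipSection are hypothetical.

```haskell
-- flip moves the folding function into last position, so the
-- predicate can be threaded in through ordinary composition.
myAnyFlip :: (a -> Bool) -> [a] -> Bool
myAnyFlip = flip foldr False . fmap (||)

-- fmap on functions is (.), so this is the same definition.
myAnyFlipSection :: (a -> Bool) -> [a] -> Bool
myAnyFlipSection = flip foldr False . ((||) .)
```

Because (||) is lazy in its second argument, this fold can stop at the first match; for example, myAnyFlip (> 3) [1 ..] evaluates to True even though the list is infinite.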
    

    "I am failing to see a pattern in the strategies that give me some a priori indication that what I am trying to do is impossible."

    It’s always possible to write an expression in point-free form. pointfree.io will give you a bad pointfree version of anything.

    If you want something you wouldn’t be embarrassed to be seen with in public, then you have to consider the dataflow structure, use higher-level combinators like those in Control.Arrow, and ideally use lots of intermediate functions with clear names. The mantra is “name code, not data”. The point of pointfree code (ha) is to make code easier to follow, by forcing you to fix the “spaghetti dataflow” that variables allow you to write (by letting you hook up anything to anything else). If you just throw away the variable names without improving the structure, the result will be harder to read.
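
As one possible illustration of "name code, not data", the sketch below gives the fold and the two-argument composition their own names. The operator (.:) and the names myOr and myAnyNamed are assumptions made for this example; none of them comes from the Prelude or from the original post.

```haskell
-- Hypothetical helper: run a one-argument function after a
-- two-argument function, i.e. (f .: g) x y = f (g x y).
(.:) :: (c -> d) -> (a -> b -> c) -> a -> b -> d
(.:) = (.) . (.)
infixr 8 .:

-- The fold gets a name that says what it does.
myOr :: [Bool] -> Bool
myOr = foldr (||) False

-- The dataflow now reads directly: map the predicate, then take the "or".
myAnyNamed :: (a -> Bool) -> [a] -> Bool
myAnyNamed = myOr .: map
```

Whether this reads better than the pointful myAny f xs = myOr (map f xs) is a judgment call; the intent is only to show that the structure, not the missing variables, carries the meaning.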