Search code examples
haskelliofoldcontinuation-passing

When to use foldr with a continuation as an accumulation function?


There is a technique I've seen a few times with foldr. It involves using a function in place of the accumulator in a foldr. I'm wondering when it is necessary to do this, as opposed to using an accumulator that is just a regular value.

Most people have seen this technique before when using foldr to define foldl:

myFoldl :: forall a b. (b -> a -> b) -> b -> [a] -> b
myFoldl accum nil as = foldr f id as nil
  where
    f :: a -> (b -> b) -> b -> b
    f a continuation b = continuation $ accum b a

Here, the type of the combining function f is not just a -> b -> b like normal, but a -> (b -> b) -> b -> b. It takes not only an a and b, but a continuation (b -> b) that we need to pass the b to in order to get the final b.

I most recently saw an example of using this trick in the book Parallel and Concurrent Programming in Haskell. Here is a link to the source code of the example using this trick. Here is a link to the chapter of the book explaining this example.

I've taken the liberty of simplifying the source code into a similar (but shorter) example. Below is a function that takes a list of Strings, prints out whether each string's length is greater than five, then prints the full list of only the Strings that have a length greater than five:

import Text.Printf

stringsOver5 :: [String] -> IO ()
stringsOver5 strings = foldr f (print . reverse) strings []
  where
    f :: String -> ([String] -> IO ()) -> [String] -> IO ()
    f str continuation strs = do
      let isGreaterThan5 = length str > 5
      printf "Working on \"%s\", greater than 5? %s\n" str (show isGreaterThan5)
      if isGreaterThan5
        then continuation $ str : strs
        else continuation strs

Here's an example of using it from GHCi:

> stringsOver5 ["subdirectory", "bye", "cat", "function"]
Working on "subdirectory", greater than 5? True
Working on "bye", greater than 5? False
Working on "cat", greater than 5? False
Working on "function", greater than 5? True
["subdirectory","function"]

Just like in the myFoldl example, you can see that the combining function f is using the same trick.

However, it occurred to me that this stringsOver5 function could probably be written without this trick:

stringsOver5PlainFoldr :: [String] -> IO ()
stringsOver5PlainFoldr strings = foldr f (pure []) strings >>= print
  where
    f :: String -> IO [String] -> IO [String]
    f str ioStrs = do
      let isGreaterThan5 = length str > 5
      printf "Working on \"%s\", greater than 5? %s\n" str (show isGreaterThan5)
      if isGreaterThan5
        then fmap (str :) ioStrs
        else ioStrs

(Although maybe you could make the argument that IO [String] is a continuation?)


I have two questions regarding this:

  • Is it every absolutely necessary to use this trick of passing a continuation to foldr, instead of using foldr with a normal value as an accumulator? Is there an example of a function that absolutely can't be written using foldr with a normal value? (Aside from foldl and functions like that, of course.)
  • When would I want to use this trick in my own code? Is there any example of a function that can be significantly simplified by using this trick?
  • Is there any sort of performance considerations to take into account when using this trick? (Or, well, when not using this trick?)

Solution

  • I have two questions regarding this:

    For some large value of "two" :-P

    • Is it every absolutely necessary to use this trick of passing a continuation to foldr? Is there an example of a function that absolutely can't be written without this trick? (Aside from foldl and functions like that, of course.)

    No, never. Each foldr invocation can always be replaced by explicit recursion.

    One should use foldr and other well-known library functions when they make the code simpler. When they do not, one should not shoehorn the code so that it fits the foldr pattern.

    There is no shame in using plain recursion, when there is no obvious replacement.

    Compare your code with this, for instance:

    stringsOver5 :: [String] -> IO ()
    stringsOver5 strings = go strings []
      where
      go :: [String] -> [String] -> IO ()
      go []     acc = print (reverse acc)
      go (s:ss) acc = do
          let isGreaterThan5 = length str > 5
          printf "Working on \"%s\", greater than 5? %s\n" str (show isGreaterThan5)
          if isGreaterThan5
            then go ss (s:acc)
            else go ss acc
    
    • When would I want to use this trick in my own code? Is there any example of a function that can be significantly simplified by using this trick?

    In my humble opinion, almost never.

    Personally, I find "calling foldr with four (or more) arguments" to be an anti-pattern in most cases. This is because it is not that shorter than using explicit recursion, and has the potential to be much less readable.

    I would argue that this "idiom" is quite puzzling to any Haskeller who has not seen it before. It is a sort-of an acquired taste, so to speak.

    Perhaps, it could be a good idea to use this style when the continuation functions are meaningful on their own. E.g., when representing lists as difference lists, the concatenation of a regular-list of difference-lists can be quite elegant

    foldr (.) id listOfDLists []
    

    is beautiful, even if the last [] might be puzzling at first.

    • Is there any sort of performance considerations to take into account when using this trick? (Or, well, when not using this trick?)

    Performance should be essentially the same as using explicit recursion. GHC could even generate the exact same code.

    Perhaps using foldr could help GHC fire some fold/build optimization rules, but I'm unsure about the need to do that when using continuations.