Search code examples
haskellfoldunordered

Is the distinction between foldr and foldl important for maps and sets?


(This question applies more generally than to Haskell, but that's the language I'll use to state it.)

The distinction between foldl and foldr seems to depend on the fact that lists are ordered. That is, foldl and foldl collapse a list by applying a function to a starting value and the first or last element, then to the result of the first application and the second or second-to-last element, then the result of the second application to the third or third-to-last element, etc.

But the Data.Set and Data.Map libraries for Haskell define their own versions of foldl and foldr. For instance, for maps they are these:

foldr :: (a -> b -> b) -> b -> Map k a -> b
foldl :: (b -> a -> b) -> b -> Map k a -> b -- I've swapped `a` and `b`
  -- in the second type signature to clarify the correspondence.

Maps and sets aren't ordered. Should I expect performance differences between the versions of foldl and foldr defined for sets and maps, or does foldr f do exactly the same thing as foldl (flip f)?


Solution

  • Maps and sets aren't ordered. Should I expect performance differences between the versions of foldl and foldr defined for sets and maps

    If you refers the source code of Data.Set or Data.Map, you will find that their elements are organized in binary tree:

    data Map k a  = Bin !Size !k a !(Map k a) !(Map k a)
                  | Tip 
    
    data Set a    = Bin !Size !a !(Set a) !(Set a)
                  | Tip
    

    and the foldr of Set:

    foldr f z = go z
      where
        go z' Tip           = z'
        go z' (Bin _ x l r) = go (f x (go z' r)) l
    

    traverse the tree using depth-first search with order right, current, left, so when foldr (+) 0 apply to follow tree:

                       1
                      / \
                     4   2
                          \
                           3
    

    gives,

    4 + (1 + (2 + (3 + 0)))
    

    and foldl

    foldl f z = go z
      where
        go z' Tip           = z'
        go z' (Bin _ x l r) = go (f (go z' l) x) r
    

    with order left, current, right when apply foldl (+) 0 to above tree, give:

    ((((0 + 4) + 1) + 2) + 3)
    

    It show that foldr and foldl of Set equivalent to that apply to the list as:

    foldr (+) 0 [4, 1, 2, 3] = 4 + (1 + (2 + (3 + 0)))
    foldl (+) 0 [4, 1, 2, 3] = ((((0 + 4) + 1) + 2) + 3)
    

    similar situation as Data.Map and don't repeat here.

    Moreover, as we knew, foldr can apply to infinite list (but foldl cannot), for example:

    take 10 $ foldr ((:) . sum) [] $ chunksOf 3 [1..] = [6,15,24,33,42,51,60,69,78,87]
    

    (here chunksOf group the list like [[1,2,3], [4,5,6]...])

    But how about when the tree has a path is infinite like:

                       1
                      / \
                     4   2
                          \
                           3
                            \
                             ...  <- infinite path
    

    Does the foldr of Set behave as list as mentioned above? (I guess the answer is yes, you can check it for youself)

    does foldr f do exactly the same thing as foldl (flip f)?

    No, As the source code as shown above:

    foldr          = ... go (f x (go z' r)) l 
    

    and

    foldl (flip f) = ... go (f x (go z' l)) r
    

    The traversing order of tree is different, but the generic relation between foldr and foldl can be found in this post: Defining foldl in terms of foldr