Search code examples
haskelllazy-evaluationfoldstrict

Haskell - strict vs non-strict with foldl


I have a question concerning the definition of strict vs non-strict. The Haskell wiki-book for Laziness (http://en.wikibooks.org/wiki/Haskell/Laziness), under the section "Black-box strictness analysis", makes the following assertion:

[Assuming a function f which takes a single parameter.] The function f is a strict function if, and only if, f undefined results in an error being printed and the halting of our program.

The wiki contrasts const with id, showing a non-strict and strict function respectively.

My question is that I was under the impression that foldl was evaluated in a non-strict fashion, causing undesirable space-leaks, while foldl' was strict.

However, the above test seems to assert that both foldl and foldl' are strict. That is both functions produce undefined if any of their parameters are undefined:

> Data.List.foldl (+) undefined [1,2,3,4]
    Prelude.undefined
> Data.List.foldl' (+) 0 undefined
    Prelude.undefined
> Data.List.foldl' (+) undefined [1,2,3,4]
    Prelude.undefined
> Data.List.foldl (+) 0 undefined
    Prelude.undefined

Could someone please explain what I'm missing?

Thanks!


Solution

  • A better definition is: A function is said to be strict in an argument if it is undefined in the case that this argument is undefined.

    Let us look at the following definition of foldl:

     foldl f s [] = s
     foldl f s (x:xs) = foldl f (f s x) xs
    

    From this the following can be deduced:

    1. It is not strict in f, because fs value is immaterial in the first equation.
    2. It is not strict in s either, since the list could be non-empty and f could be non-strict in it's first argument (think flip const).
    3. It is strict in the list argument, however, because, no matter what fand s are, this argument must be evaluated to so called weak head normal form. A value of algebraic data type is in WHNF if you can tell the outermost constructor. Put differently, there is no way that foldl can avoid looking if the list argument is an empty list or not. And hence, it must evaluate the list value at least so far. But if that value is undefined, none of the 2 equations is applicable, hence this application of foldl is itself undefined.

    Furthermore, from the fact that foldl is not strict in the accumulator s we can also learn why it is in many cases a bad idea to use foldl: The system sees no reason to actually evaluate the term f s x, since it is passed to a function that is not strict in the corresponding argument. Indeed, it would be a violation of non-strictness principles if it did evaluate it. Hence, depending on the length of the list, a huge thunk is accumulated in the accumulator:

    ((((s `f` x_1) `f` x_2) `f` x_3) ....) `f` x_n)
    

    Now, if f itself is strict in its left argument, like (+), any attempt to evaluate the result of foldl with necessity needs stack space proportional to the length of the list. For, evaluation of f ... x_n, where ... stands for the left hand side must first evaluate that left hand side, and so on, down to f s x_1 and back.

    Hence we have it that

    foldl (flip const) 0 [1..n]
    

    is n, while

    foldl const 0 [1..n]
    

    will stack overflow for large enough n. This is a sure indicator that here are cases where humans are better at computing than machines, as in the second example the length and content of the list are completely irrelevant and most people will see immediately that the result must be 0.