Search code examples
haskellmonadslazy-evaluation

Laziness of infinite lists under monadic context in Haskell


I'm currently learning haskell for the first time, and I'm having lots of trouble understanding lazy evaluation.

Main problem is that among following scenarios, some behaves lazily and some doesn't, and I can't explain why any of them do so.

Monadic iterate, or iterateM function I'll be using is defined as below.

iterateM :: (Monad m) => (a -> m a) -> a -> m [a]
iterateM f v = f v >>= go
  where go v' = (v' :) <$> iterateM f v'
  • Scenario 1

    something :: IO [Int]
    something = return [1 ..]
    take 5 <$> something -- Fine
    

    Above works with any monads (At least from my experimentation).

  • Scenario 2

    something :: Product [Int]
    something = iterateM return 5
    take 5 <$> something -- Fine
    

    Above works with Identity, Sum, etc.

  • Scenario 3

    something :: IO [Int]
    something = iterateM return 5
    take 5 <$> something -- Infinite Loop !
    

    Above loops infinitely with List, Maybe, etc.

I did some other experiments involving unsafePerformIO to print when certain value is computed, and ended up suspecting that fmap implementation of some functors require right hand side to be computed fully, which breaks laziness. However then why would Scenario 1 not loop infinitely when I clearly applies fmap with take 5 ? And why would that even be a case for IO monad ?

So, why does above scenarios act in such way ?

-------------- Edited --------------

Thanks to all helpful answers, now I understand what is happening. My understanding is actually approached in very conceptual and mathematical way. For any future references I will write down how I understood it.

---------- My Explanation ----------

IO type is equivalent to IO a = World -> (World, a). This can roughly be considered as two separate function, transformation t :: (World -> World), and product p :: (World -> a) just combined into tuple.

Let's think about IO functor, or fmap function for IO type. This has type signature of (a -> b) -> IO a -> IO b. If you interpret this as function taking a function to be fmapped f, a world transformation t, product function p and returns some world transformation, and new product function, it becomes kinda easy to implement. You simply leave transformation same, and composite f with p for product.

fmap :: (a -> b) -> IO a -> World -> (World, b)
fmap f x world =
  let newT = t
      newP = f . p in
    (newT world, newP world)
  where
    t = fst . x
    p = snd . x

Now consider IO monad's μ transform, or join function. It has type signature of IO (IO a) -> IO a, which is equivalent to (World -> (World, IO a)) -> (World -> (World, a)). This looks intimidating, but if you think tuple as two separate functions, you have t1 :: World -> World, p1 :: World -> IO a, and return value of p1 contains two functions t2 :: World -> World, p2 :: World -> (World, a). Now it becomes obvious how join would be implemented for IO monad.

join :: IO (IO a) -> World -> (World, a)
join x world =
  let newT = t2 . t1
      newP = p2 . t1 in
    (newT world, newP world)
  where
    t1 :: World -> World
    t2 :: World -> World
    p2 :: World -> a
    t1 = fst . x
    p1 = snd . x
    t2 = fst . p1
    p2 = snd . p1

With join defined, bind function >>= can be defined as,

=<< :: (a -> IO b) -> IO a -> IO b
=<< = join . fmap

>>= :: IO a -> (a -> IO b) -> IO b
>>= = flip (=<<)

This means it's transformation function is composition of first argument's and return value's of second argument.

We're almost there, let's think about η transform, or return function. It's very simple,

return :: a -> World -> (World, a)
return x world =
  let newT = id
      newP = const x
    (newT world, newP world)

Now consult scenario 3 again.

iterateM :: (Monad m) => (a -> m a) -> a -> m [a]
iterateM f v = f v >>= go
  where go v' = (v' :) <$> iterateM f v'

something :: IO [Int]
something = iterateM return 5
take 5 <$> something

What would be transformation function for something ? Okay, since it's doing bind function, we can think it as composition of t1 from f v, which is id, and t2 from return value of go. Then what is transformation function of return value of go ? Well, we fmap something, so it will be same as right hand side... and there's the infinite loop. Essentially it is infinite composition of ids.

Note that the product function also is infinite compositions of (5 :) and id. It won't be a problem to be infinitely composed towards left side, such as ... . (5 :) . (5 :), but the product is also infinitely composed towards right side with ids.

I think this is what Willem meant by,

IO is a monad that also is exactly defined to run things in a specific order ...

that functions are composed in specific order to ensure IO are performed in specific order. And if you iterate infinitely, it will get composed infinitely.

Additionally, Scenario 3 not working with List or Maybe is rather simple. join function for Maybe monad requires knowing if right hand side is Just or Nothing. It's actually even documented as laziness breakers in Haskell Wiki because of such property. List monad is just me being stupid. List fmap requires to know nth element of right hand side in order to deduce the return value's nth element. And looking closely to the code, it can't deduce any element.

Scenario 1 would of course work because IO monad itself is not breaking laziness of inner value magically, but compositing product and transformation function does.

Of course above code is not how actual haskell's IO is implemented because, you can't really capture World and apply operations in real world. But conceptually it explains.


Solution

  • The main problem is the >>=, depending on what it does. For Product, the Monad instance is implemented as:

    instance Monad Product where
        m >>= k  = k (getProduct m)
    

    thus

    iterateM f v = go (getProduct (Product v))
      where go v' = (v' :) <$> iterateM f v'
    

    We can remote the Product and getProduct functions since these just wrap and unwrap it from the data constructor.

    iterateM f v = go v
      where go v' = (v' :) <$> iterateM f v'
    

    or thus:

    iterateM f v = go
      where go = (v' :) <$> go
    

    it thus will prepend a v' to the list wrapped in the go. For a Product it does not matter what the right hand side is: we know it can only be a product, and we don't need to evaluate the value wrapped in the Product either, since we just leave the function call (here (v' :)) to a value that might still need to be evaluated.

    This thus means that for a Product with an undefined value wrapped in it, or even an undefined product, we get:

    (take 1 . (0:) <$> (undefined :: Product [Int]))
    Product {getProduct = [0]}
    

    Now for IO that is not the case, IO is a monad that also is exactly defined to run things in a specific order: it thus prevents writing a second byte to a file before the first one, because the lazy evaluation somehow evaluated the second one first: it's main function is to guarantee that the order of evaluation, for IO operations only though, remains intact.

    But for IO this, thus means that if you have f v >>= go, it will thus first run f v and then go, and since go eventually is another f v >>= go, you thus keep producing operations to perform before the computation ends, and thus it gets stuck in an infinite loop.