I'm currently learning haskell for the first time, and I'm having lots of trouble understanding lazy evaluation.
Main problem is that among following scenarios, some behaves lazily and some doesn't, and I can't explain why any of them do so.
Monadic iterate, or iterateM
function I'll be using is defined as below.
iterateM :: (Monad m) => (a -> m a) -> a -> m [a]
iterateM f v = f v >>= go
where go v' = (v' :) <$> iterateM f v'
Scenario 1
something :: IO [Int]
something = return [1 ..]
take 5 <$> something -- Fine
Above works with any monads (At least from my experimentation).
Scenario 2
something :: Product [Int]
something = iterateM return 5
take 5 <$> something -- Fine
Above works with Identity, Sum, etc.
Scenario 3
something :: IO [Int]
something = iterateM return 5
take 5 <$> something -- Infinite Loop !
Above loops infinitely with List, Maybe, etc.
I did some other experiments involving unsafePerformIO
to print when certain value is computed, and ended up suspecting that fmap
implementation of some functors require right hand side to be computed fully, which breaks laziness. However then why would Scenario 1 not loop infinitely when I clearly applies fmap
with take 5
? And why would that even be a case for IO
monad ?
So, why does above scenarios act in such way ?
-------------- Edited --------------
Thanks to all helpful answers, now I understand what is happening. My understanding is actually approached in very conceptual and mathematical way. For any future references I will write down how I understood it.
---------- My Explanation ----------
IO type is equivalent to IO a = World -> (World, a)
. This can roughly be considered as two separate function, transformation t :: (World -> World)
, and product p :: (World -> a)
just combined into tuple.
Let's think about IO
functor, or fmap
function for IO
type. This has type signature of (a -> b) -> IO a -> IO b
. If you interpret this as function taking a function to be fmapped f
, a world transformation t
, product function p
and returns some world transformation, and new product function, it becomes kinda easy to implement. You simply leave transformation same, and composite f
with p
for product.
fmap :: (a -> b) -> IO a -> World -> (World, b)
fmap f x world =
let newT = t
newP = f . p in
(newT world, newP world)
where
t = fst . x
p = snd . x
Now consider IO
monad's μ
transform, or join
function. It has type signature of IO (IO a) -> IO a
, which is equivalent to (World -> (World, IO a)) -> (World -> (World, a))
. This looks intimidating, but if you think tuple as two separate functions, you have t1 :: World -> World
, p1 :: World -> IO a
, and return value of p1
contains two functions t2 :: World -> World
, p2 :: World -> (World, a)
. Now it becomes obvious how join
would be implemented for IO
monad.
join :: IO (IO a) -> World -> (World, a)
join x world =
let newT = t2 . t1
newP = p2 . t1 in
(newT world, newP world)
where
t1 :: World -> World
t2 :: World -> World
p2 :: World -> a
t1 = fst . x
p1 = snd . x
t2 = fst . p1
p2 = snd . p1
With join
defined, bind function >>=
can be defined as,
=<< :: (a -> IO b) -> IO a -> IO b
=<< = join . fmap
>>= :: IO a -> (a -> IO b) -> IO b
>>= = flip (=<<)
This means it's transformation function is composition of first argument's and return value's of second argument.
We're almost there, let's think about η
transform, or return
function. It's very simple,
return :: a -> World -> (World, a)
return x world =
let newT = id
newP = const x
(newT world, newP world)
Now consult scenario 3 again.
iterateM :: (Monad m) => (a -> m a) -> a -> m [a]
iterateM f v = f v >>= go
where go v' = (v' :) <$> iterateM f v'
something :: IO [Int]
something = iterateM return 5
take 5 <$> something
What would be transformation function for something
? Okay, since it's doing bind function, we can think it as composition of t1
from f v
, which is id
, and t2
from return value of go
. Then what is transformation function of return value of go
? Well, we fmap
something, so it will be same as right hand side... and there's the infinite loop. Essentially it is infinite composition of id
s.
Note that the product function also is infinite compositions of (5 :)
and id
. It won't be a problem to be infinitely composed towards left side, such as ... . (5 :) . (5 :)
, but the product is also infinitely composed towards right side with id
s.
I think this is what Willem meant by,
IO
is a monad that also is exactly defined to run things in a specific order ...
that functions are composed in specific order to ensure IO are performed in specific order. And if you iterate infinitely, it will get composed infinitely.
Additionally, Scenario 3 not working with List
or Maybe
is rather simple. join
function for Maybe
monad requires knowing if right hand side is Just
or Nothing
. It's actually even documented as laziness breakers in Haskell Wiki because of such property. List
monad is just me being stupid. List
fmap
requires to know nth element of right hand side in order to deduce the return value's nth element. And looking closely to the code, it can't deduce any element.
Scenario 1 would of course work because IO
monad itself is not breaking laziness of inner value magically, but compositing product and transformation function does.
Of course above code is not how actual haskell's IO
is implemented because, you can't really capture World
and apply operations in real world. But conceptually it explains.
The main problem is the >>=
, depending on what it does. For Product
, the Monad
instance is implemented as:
instance Monad Product where
m >>= k = k (getProduct m)
thus
iterateM f v = go (getProduct (Product v))
where go v' = (v' :) <$> iterateM f v'
We can remote the Product
and getProduct
functions since these just wrap and unwrap it from the data constructor.
iterateM f v = go v
where go v' = (v' :) <$> iterateM f v'
or thus:
iterateM f v = go
where go = (v' :) <$> go
it thus will prepend a v'
to the list wrapped in the go
. For a Product
it does not matter what the right hand side is: we know it can only be a product, and we don't need to evaluate the value wrapped in the Product
either, since we just leave the function call (here (v' :)
) to a value that might still need to be evaluated.
This thus means that for a Product
with an undefined value wrapped in it, or even an undefined
product, we get:
(take 1 . (0:) <$> (undefined :: Product [Int]))
Product {getProduct = [0]}
Now for IO
that is not the case, IO
is a monad that also is exactly defined to run things in a specific order: it thus prevents writing a second byte to a file before the first one, because the lazy evaluation somehow evaluated the second one first: it's main function is to guarantee that the order of evaluation, for IO operations only though, remains intact.
But for IO
this, thus means that if you have f v >>= go
, it will thus first run f v
and then go
, and since go
eventually is another f v >>= go
, you thus keep producing operations to perform before the computation ends, and thus it gets stuck in an infinite loop.