haskell monads composition monad-transformers

Why are monads not closed under composition?

While I was learning Composing Types chapter from Haskell Book, I was given tasks to write Functor and Applicative instances for the following type.

newtype Compose f g a = Compose { getCompose :: f (g a) }

I wrote the following definitions

Functor:

fmap f (Compose fga) = Compose $ (fmap . fmap) f fga

Applicative:

(Compose f) <*> (Compose a) = Compose $ (<*>) <$> f <*> a

I learned that composing two Functors or Applicatives gives Functor and Applicative respectively.

The author also explained it is not possible to compose two Monads the same way. So we use Monad Transformers. I just do not want to read Monad Transformers unless I'm clear with why Monads do not compose.

So far I tried to write bind function like this:

Monad:

(>>=) :: Compose f g a -> (a -> Compose f g b) -> Compose f g b
(Compose fga) >>= h = (fmap.fmap) h fga

and of course got this error from GHC

Expected type: Compose f g b

Actual type: f (g (Compose f g b))

If I can strip the outermost f g somehow, the composition gives us a monad right? (I still couldn't figure out how to strip that though)

I tried reading answers from other Stack Overflow questions like this, but all answers are more theoretical or some Math. I still haven't learned why Monads do not compose. Can somebody explain me without using Math?

Solution

I think this is easiest to understand by looking at the join operator:

join :: Monad m => m (m a) -> m a

join is an alternative to >>= for defining a Monad, and is a little easier to reason about. (But now you have an exercise to do: show how to implement >>= from join, and how to implement join from >>=!)

Let's try to make a join operation for Composed f g and see what goes wrong. Our input is essentially a value of type f (g (f (g a))), and we want to produce a value of type f (g a). We also know that we have join for f and g individually, so if we could get a value of type f (f (g (g a))), then we could hit it with fmap join . join to get the f (g a) we wanted.

Now, f (f (g (g a))) isn't so far from f (g (f (g a))). All we really need is a function like this: distribute :: g (f a) -> f (g a). Then we could implement join like this:

join = Compose . fmap join . join . fmap (distribute . fmap getCompose) . getCompose

Note: there are some laws that we would want distribute to satisfy, in order to make sure that the join we get here is lawful.

Ok, so that shows how we can compose two monads if we have a distributive law distribute :: (Monad f, Monad g) => g (f a) -> f (g a). Now, it could be true that every pair of monads has a distributive law. Maybe we just have to think really hard about how to write one down?

Unfortunately there are pairs of monads that don't have a distributive law. So we can answer your original question by producing two monads that definitely don't have a way of turning a g (f a) into an f (g a). These two monads will witness to the fact that monads don't compose in general.

I claim that g = IO and f = Maybe do not have a distributive law

-- Impossible!
distribute :: IO (Maybe a) -> Maybe (IO a)

Let's think about why such a thing should be impossible. The input to this function is an IO action that goes out into the real world and eventually produces Nothing or a Just x. The output of this function is either Nothing, or Just an IO action that, when run, eventually produces x. To produce the Maybe (IO a), we would have to peek into the future and predict what the IO (Maybe a) action is going to do!

In summary:

Monads can compose if there is a distributive law g (f a) -> f (g a). (but see the addendum below)
There are some monads that don't have such a distributive law.
Some monads can compose with each other, but not every pair of monads can compose.

Addendum: "if", but what about "only if"? If all three of F, G, and FG are monads, then you can construct a natural transformation δ : ∀X. GFX -> FGX as the composition of GFη_X : GFX -> GFGX followed by η_{GFGX} : GFGX -> FGFGX and then by μ_X : FGFGX -> FGX. In Haskellese (with explicit type applications for clarity), that would be

delta :: forall f g x. (Monad f, Monad g, Monad (Compose f g))
      => g (f x) -> f (g x)
delta = join' . pure @f . fmap @g (fmap @f (pure @g)) 
  where
    -- join for (f . g), via the `Monad (Compose f g)` instance
    join' :: f (g (f (g x))) -> f (g x)
    join' = getCompose . join @(Compose f g) . fmap Compose . Compose

So if the composition FG is a monad, then you can get a natural transformation with the right shape to be a distributive law. However, there are some extra constraints that fall out of making sure your distributive law satisfies the correct properties, vaguely alluded to above. As always, the n-Category Cafe has the gory details.