How does the state monad work? (without code explanation)

For the last number of months I've been taking some spare time here and there to read up on Monads. I haven't worked with a functional language since my University days. So I don't really remember Haskell, and certainly have no idea about Scalaz.

It was a trying time learning about onions, burritos, and sandwiches while trying to correlate these foods to wads of Haskell code. Luckily I stumbled upon two key writeups which gave me the a-ha moment: monads in pictures, and another guy from an imperative coding background who simply wrote the equivalent of:

a -> b becomes m[a] -> m[b] (functor)
a -> m[b] becomes m[a] -> m[b] (monad)
and applicative is just a "function with context"

Together with the simplest sentence describing bind's purpose, many categories of endofunctors exploded and fizzled into simplicity.

Identity, List, Maybe and others suddenly made sense. Fueled with this knowledge I started to look toward trying some TMP in C++ using monadic types to see how it would pan out.

Very soon I wanted to propagate state. I think I've now read more articles on the state monad and monad transformers than I did for monads to begin with but for the life of me I can't figure them. Many keyboards have worn out I'm sure with all the words typed on these topics answering people like me.

But I ask you to suffer one more.

a -> s -> (b, s)

Function takes a value and some state, returns a new value and (possibly) modified state. Bind would clearly need to propagate both the return value and state into the next function.

My issue is this: monad's work because they impose a structure. Sticking to the structure and obeying the laws leads to funtimes and rainbows. However a -> s -> (b, s) does not look like the monadic a -> m[b].

Even if 'a' is applied, it's still s -> (b, s). The difference being that the latter function takes a (monadic?) STATE and returns state WITH a value. Whereas the regular form is: takes a VALUE and returns a WRAPPED value.

Both the parameter(s) going in vary as well as the form of the return type. Yet many articles are saying this clearly looks like a monad. It sure doesn't to me.

If I were to relax the view I've been told has to be held: Instead of bind applying m[a] to a -> m[b], it takes whatever parameter makes sense for the monad and applies it to whatever function signature makes sense... then I could see it working.

As in that case I could simply say "oh this monad binds State[s] AND 'a' to a function like a -> State[s] -> (State[s], b)". It could then expect a tuple return and unpack that into the arguments for the next function.

But somehow I suspect there IS a way of making the state monad work like all the others - including expecting the form a -> m[b] somehow (but then how does it thread the state through?). I suspect this can be done because I believe writing Monad Transformers will rely on these function signatures (forms?) being consistent.

Even if you don't have time to type out a big response, a link to an article more from an imperative programmers perspective would be a godsend.

(And apologies for any incorrect use of terminology - I'm kind of picking that up along the way too)

Solution

I think that the monad instance under discussion does preserve State structure and that the tuple is throwing you for a loop when it should not.

If we take a step back and ask what signature should bind have in order to match up with our monad expectations, we get this:

(>>=) :: State s a -> (a -> State s b) -> State s b
someState >>= someFuncOfA = ...

and I'll put extra brackets around the monadic value constructor, which remember is State s and not merely State since a concrete type cannot be made an instance of Monad, only a type constructor of one argument:

(>>=) :: [State s] a -> (a -> [State s] b) -> [State s] b
someState >>= someFuncOfA = ...

where [State s] will be our special m from all the monad stuff.

Since the newtype for State is

newtype State s a = State s -> (a, s)

we know that whatever it means to value-construct a State value, the parameter that the value takes is a function s -> (a, s).

So back up in our incomplete bind definition:

(>>=) :: [State s] a -> (a -> [State s] b) -> [State s] b
someState >>= someFuncOfA = ...

we know that the ... is going to have to be State $ (someNewFunc) cause that's how State gets value-constructed.

First observation: the tuple doesn't have much to do with this. It is the function someNewFunc that, so long as it's type matches what's needed to value-construct State, maintains our desired monadic structure. The fact that that function happens to need to have type s -> (a, s) is pretty much just an implementation choice. The signature could be a lot of different things, and since there's not really a difference between tuples and value constructors, I'm sure you could first make a special data type whose value constructor acted like a two-tuple for this purpose and made the signatures seem artificially nicer.

Second observation: State does not have an exported 'State' value constructor, at least not from Control.Monad.State, so you use the helper function

state :: (s -> (a, s)) -> State s a

which again takes a function as its first parameter and does the value-constructing behind the scenes.

So we know that inside of the bind definition, we're not going to be able to write

someState >>= someFuncOfA = State $ (someFunc)

instead it will need to be

someState >>= someFuncOfA = state $ (someFunc)

and state $ (someFunc) when evaluated leads to type State s a because someFunc is forced to have type (s -> (a,s)) and will in fact use runState to accomplish this.

This may not be a very good write-up. It was my attempt to translate the relevant portions of the FP Complete State Monad tutorial into something that addressed this particular question. Perhaps reading that will provide a better description.