variables haskell monads state-monad ioref

Difference between State, ST, IORef, and MVar

I am working through Write Yourself a Scheme in 48 Hours (I'm up to about 85hrs) and I've gotten to the part about Adding Variables and Assignments. There is a big conceptual jump in this chapter, and I wish it had been done in two steps with a good refactoring in between rather then jumping at straight to the final solution. Anyway…

I've gotten lost with a number of different classes that seem to serve the same purpose: State, ST, IORef, and MVar. The first three are mentioned in the text, while the last seems to be the favored answer to a lot of StackOverflow questions about the first three. They all seem to carry a state between consecutive invocations.

What are each of these and how do they differ from one another?

In particular these sentences don't make sense:

Instead, we use a feature called state threads, letting Haskell manage the aggregate state for us. This lets us treat mutable variables as we would in any other programming language, using functions to get or set variables.

and

The IORef module lets you use stateful variables within the IO monad.

All this makes the line type ENV = IORef [(String, IORef LispVal)] confusing - why the second IORef? What will break if I'll write type ENV = State [(String, LispVal)] instead?

Solution

The State Monad : a model of mutable state

The State monad is a purely functional environment for programs with state, with a simple API:

Documentation in the mtl package.

The State monad is commonly used when needing state in a single thread of control. It doesn't actually use mutable state in its implementation. Instead, the program is parameterized by the state value (i.e. the state is an additional parameter to all computations). The state only appears to be mutated in a single thread (and cannot be shared between threads).

The ST monad and STRefs

The ST monad is the restricted cousin of the IO monad.

It allows arbitrary mutable state, implemented as actual mutable memory on the machine. The API is made safe in side-effect-free programs, as the rank-2 type parameter prevents values that depend on mutable state from escaping local scope.

It thus allows for controlled mutability in otherwise pure programs.

Commonly used for mutable arrays and other data structures that are mutated, then frozen. It is also very efficient, since the mutable state is "hardware accelerated".

Primary API:

Control.Monad.ST
runST -- start a new memory-effect computation.
And STRefs: pointers to (local) mutable cells.
ST-based arrays (such as vector) are also common.

Think of it as the less dangerous sibling of the IO monad. Or IO, where you can only read and write to memory.

IORef : STRefs in IO

These are STRefs (see above) in the IO monad. They don't have the same safety guarantees as STRefs about locality.

MVars : IORefs with locks

Like STRefs or IORefs, but with a lock attached, for safe concurrent access from multiple threads. IORefs and STRefs are only safe in a multi-threaded setting when using atomicModifyIORef (a compare-and-swap atomic operation). MVars are a more general mechanism for safely sharing mutable state.

Generally, in Haskell, use MVars or TVars (STM-based mutable cells), over STRef or IORef.