Can anybody explain GHC's definition of IO?

The title is pretty self-descriptive, but there's one part that caught my attention:

newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))

Stripping the newtype, we get:

State# RealWorld -> (# State# RealWorld, a #)

I don't know what State# stands for. Can we replace it with State like this:

State RealWorld -> (State RealWorld, a)

And can that be expressed as this, then?

State (State RealWorld) a

This particular construct caught my attention.

I know that conceptually,

type IO a  =  RealWorld -> (a, RealWorld)

And @R.MartinhoFernandes told me that I can actually think about that implementation as ST RealWorld a, but I'm just curious why the particular GHC version is written like it is.

Solution

It's probably best not to think too deeply about GHC's implementation of IO, because that implementation is weird and shady and works most of the time by compiler magic and luck. The broken model that GHC uses is that an IO action is a function from the state of the entire real world to a value paired with a new state of the entire real world. For humorous proof that this is a strange model, see the acme-realworld package.
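To make the "function from world to value plus new world" model concrete, here is a toy sketch in ordinary Haskell. Everything in it is hypothetical: `World`, `Action`, `say` and `andThen` are invented for illustration, and the "world" is only an output log, nothing like what GHC really passes around.

```haskell
-- Hypothetical stand-in for "the entire real world": just an output log.
newtype World = World [String] deriving (Eq, Show)

-- An action maps a world to a new world paired with a result.
newtype Action a = Action { runAction :: World -> (World, a) }

-- A primitive that "prints" by appending to the log.
say :: String -> Action ()
say msg = Action (\(World out) -> (World (out ++ [msg]), ()))

-- Sequencing: the world produced by the first action feeds the second.
andThen :: Action a -> (a -> Action b) -> Action b
andThen (Action m) k =
  Action (\w0 -> let (w1, a) = m w0 in runAction (k a) w1)

program :: Action ()
program = say "hello" `andThen` \_ -> say "world"

main :: IO ()
main = print (fst (runAction program (World [])))
-- prints: World ["hello","world"]
```

Nothing in this toy stops you from using `w0` twice or dropping `w1` on the floor, which is exactly the sort of thing the real implementation has to rule out by hiding the token.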

The way this "works": Unless you import weird modules whose names start with GHC., you can't ever touch any of these State# things. You're only given access to functions that deal in IO or ST and that ensure the State# can't be duplicated or ignored. This State# is threaded through the program, which ensures that the I/O primitives actually get called in the proper order. Since this is all for pretend, the State# is not a normal value at all: it has a width of 0 and takes up 0 bits at runtime.
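If you do import those GHC. modules, the threading can be written out by hand. The following is a minimal sketch, assuming a GHC whose base exposes the IO constructor from GHC.IO and State#/RealWorld from GHC.Exts. The names `unwrapIO`, `returnIO'`, `bindIO'` and `demo` are made up for this example; they are similar in spirit to what base does, but not copied from it.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Data.IORef (newIORef, modifyIORef, readIORef)
import GHC.Exts (State#, RealWorld)
import GHC.IO (IO (..))

-- Strip the newtype to expose the state-passing function.
unwrapIO :: IO a -> State# RealWorld -> (# State# RealWorld, a #)
unwrapIO (IO m) = m

-- Hand-written return: pass the token through untouched.
returnIO' :: a -> IO a
returnIO' x = IO (\s -> (# s, x #))

-- Hand-written bind: the token coming out of the first action
-- is the one handed to the second, which fixes the order.
bindIO' :: IO a -> (a -> IO b) -> IO b
bindIO' m k = IO (\s0 ->
  case unwrapIO m s0 of
    (# s1, a #) -> unwrapIO (k a) s1)

-- Records two events using only the hand-written combinators.
demo :: IO [String]
demo =
  newIORef [] `bindIO'` \ref ->
  modifyIORef ref ("first" :) `bindIO'` \_ ->
  modifyIORef ref ("second" :) `bindIO'` \_ ->
  readIORef ref `bindIO'` \xs ->
  returnIO' (reverse xs)

main :: IO ()
main = demo `bindIO'` print
-- prints: ["first","second"]
```

Each token is used exactly once here. Code that reused `s0` after the first action, or ignored `s1`, would still typecheck, which is why ordinary programs are kept away from these modules.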

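Why does State# take a type argument? That's a much prettier bit of magic. ST uses it to force the polymorphism needed to keep state threads separate: runST has the type (forall s. ST s a) -> a, so nothing tagged with one thread's s can escape into another. For IO, the argument is fixed to the special magic RealWorld type.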
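A small sketch of what that type argument buys you, using only Control.Monad.ST and Data.STRef from base. The function name `sumST` and the commented-out `leak` are illustrative, not from the question.

```haskell
import Control.Monad.ST (runST, stToIO)
import Data.STRef (newSTRef, modifySTRef', readSTRef)

-- Local mutable state, but a pure function from the outside:
-- runST demands that the computation work for every s.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref

-- Rejected by the type checker if uncommented: the result type
-- would mention s, so the reference cannot escape its thread.
-- leak = runST (newSTRef (0 :: Int))

main :: IO ()
main = do
  print (sumST [1 .. 10])
  -- stToIO :: ST RealWorld a -> IO a picks s = RealWorld,
  -- which is the sense in which IO a is like ST RealWorld a.
  n <- stToIO $ do
    ref <- newSTRef (41 :: Int)
    modifySTRef' ref (+ 1)
    readSTRef ref
  print n
```

The first print shows 55 and the second 42. The same ST block runs under runST or, via stToIO, inside IO; the only difference is which type the s parameter is instantiated to.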