Search code examples
haskellstate-monad

Haskell help to understand this State monad code: where is runState defined?


I am new to Haskell and trying to understand monads. I am going thru this code Putting it here for quick reference

newtype State s a = State { runState :: s -> (a,s) }

instance Monad (State s) where
  return a = State $ \s -> (a, s)

  State act >>= k = State $ \s ->
    let (a, s') = act s
    in runState (k a) s'

get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

modify :: (s -> s) -> State s ()
modify f = get >>= \x -> put (f x)

evalState :: State s a -> s -> a
evalState act = fst . runState act

execState :: State s a -> s -> s
execState act = snd . runState act

I didn't get where the function runState is defined. It seems its type is declared in newtype State s a = State { runState :: s -> (a,s) }. The most puzzling part is this code compiles without any error.

Where is runState defined? where is its code?


Solution

  • What you seem to be missing is the functionality provided by record syntax. In Haskell, there are multiple ways to define a data type. You are probably familiar with statements like the following:

    data MyType = MyType String Int
    

    where a type MyType is defined, which is a product type (i.e. any value of this type has both the String and Int fields filled). If we want to work with this type, we might want to view the String and Int components separately, so we would define accessor functions for this:

    getLabel :: MyType -> String
    getLabel (MyType s _) = s
    
    getNum :: MyType -> Int
    getNum (MyType _ i) = i
    

    For large data types (with many fields) this can get very tedious, and writing this kind of function is in general very common: often we have a simple type where a constructor wraps the content, and we want a convenient way to access this content in its "own" type (i.e. without the wrapper).

    Because this desire to access fields is so common, Haskell provides record syntax. When you use record syntax, you essentially name the fields inside a data type, and the functions to extract the values in these fields are generated automatically. So we could redefine our data type from earlier:

    data MyType' = MyType' { myLabel :: String, myNum :: Int }
    

    and Haskell would generate the accessor functions myLabel :: MyType' -> String and myNum :: MyType' -> Int automatically. These functions then have the same functionality as the getLabel and getNum functions we defined earlier.

    Taking the State type as an example, we could have defined it as follows:

    newtype State' s a = State' (s -> (a, s))
    

    and written a function to access the contents:

    getStateFun :: State' s a -> (s -> (a, s))
    getStateFun (State' f) = f
    

    which would allow us to remove the State' "wrapping" from around our value. But this is less convenient than using record syntax, which is used in your example: the value stored inside the State wrapper is given the fieldname runState, and Haskell generates the accessor function, which is why the code compiles and runs without problems. runState then basically does the same as getStateFun.

    This is also all quite clearly explained in the section "Record syntax" in this chapter from Learn You A Haskell For Great Good.