Search code examples
haskellgadt

Type-safe Flow (State Machine)


I am trying to create a type-safe Question-Answer flow in Haskell. I am modeling QnA as a directed graph, similar to a FSM.

Each node in the graph represent a question:

data Node s a s' = Node {
  question :: Question a,
  process :: s -> a -> s'
}

s is the input state, a is the answer to the question and s' is the output state. Nodes depend on the input state s, meaning that for processing the answer we have to be a in a particular state before.

Question a represent a simple question / answer producing an answer of type a.

By type-safe I mean, for example given a node Node2 :: si -> a -> s2, if si depends on s1 then all the paths ending with Node2 must be passing through a node that produces s1 first. (If s1 == si then all predecessors of Node2 must be producing s1).

An Example

QnA: In an online shopping website, we need to ask user's body size and favorite color.

  1. e1: ask user if they know their size. If yes then go to e2 otherwise go to e3
  2. e2: ask user's size and go to ef to ask the color.
  3. e3: (user doesn't know their size), ask user's weight and go to e4.
  4. e4: (after e3) ask user's height and calculate their size and go to ef.
  5. ef: ask user's favorite color and finish the flow with the Final result.

In my model, Edges connect Nodes to each other:

data Edge s sf where
  Edge  :: EdgeId -> Node s a s' -> (s' -> a -> Edge s' sf) -> Edge s sf
  Final :: EdgeId -> Node s a s' -> (s' -> a -> sf) -> Edge s sf

sf is the final result of the QnA, that here is: (Bool, Size, Color).

The QnA state at each moment can be represented by a tuple: (s, EdgeId). This state is serializable and we should be able to continue a QnA by just knowing this state.

saveState :: (Show s) => (s, Edge s sf) -> String
saveState (s, Edge eid n _) = show (s, eid)

getEdge :: EdgeId -> Edge s sf
getEdge = undefined --TODO

respond :: s -> Edge s sf -> Input -> Either sf (s', Edge s' sf)
respond s (Edge ...) input = Right (s', Edge ...)
respond s (Final ...) input = Left s' -- Final state

-- state = serialized (s, EdgeId)
-- input = user's answer to the current question
main' :: String -> Input -> Either sf (s', Edge s' sf)
main' state input =
  let (s, eid) = read state :: ((), EdgeId) --TODO
      edge = getEdge eid
  in respond s input edge

Flow for determining user's dress size

Full code:

{-# LANGUAGE GADTs, RankNTypes, TupleSections #-}

type Input = String
type Prompt = String
type Color = String
type Size = Int
type Weight = Int
type Height = Int

data Question a = Question {
  prompt :: Prompt,
  answer :: Input -> a
}

-- some questions 
doYouKnowYourSizeQ :: Question Bool
doYouKnowYourSizeQ = Question "Do you know your size?" read

whatIsYourSizeQ :: Question Size
whatIsYourSizeQ = Question "What is your size?" read

whatIsYourWeightQ :: Question Weight
whatIsYourWeightQ = Question "What is your weight?" read

whatIsYourHeightQ :: Question Height
whatIsYourHeightQ = Question "What is your height?" read

whatIsYourFavColorQ :: Question Color
whatIsYourFavColorQ = Question "What is your fav color?" id

-- Node and Edge

data Node s a s' = Node {
  question :: Question a,
  process :: s -> a -> s'
}

data Edge s sf where
  Edge  :: EdgeId -> Node s a s' -> (s' -> a -> Edge s' sf) -> Edge s sf
  Final :: EdgeId -> Node s a s' -> (s' -> a -> sf) -> Edge s sf

data EdgeId = E1 | E2 | E3 | E4 | Ef deriving (Read, Show)

-- nodes

n1 :: Node () Bool Bool
n1 = Node doYouKnowYourSizeQ (const id)

n2 :: Node Bool Size (Bool, Size)
n2 = Node whatIsYourSizeQ (,)

n3 :: Node Bool Weight (Bool, Weight)
n3 = Node whatIsYourWeightQ (,)

n4 :: Node (Bool, Weight) Height (Bool, Size)
n4 = Node whatIsYourHeightQ (\ (b, w) h -> (b, w * h))

n5 :: Node (Bool, Size) Color (Bool, Size, Color)
n5 = Node whatIsYourFavColorQ (\ (b, i) c -> (b, i, c))


-- type-safe edges

e1 = Edge E1 n1 (const $ \ b -> if b then e2 else e3)
e2 = Edge E2 n2 (const $ const ef)
e3 = Edge E3 n3 (const $ const e4)
e4 = Edge E4 n4 (const $ const ef)
ef = Final Ef n5 const


ask :: Edge s sf -> Prompt
ask (Edge _ n _) = prompt $ question n
ask (Final _ n _) = prompt $ question n

respond :: s -> Edge s sf -> Input -> Either sf (s', Edge s' sf)
respond s (Edge _ n f) i =
  let a  = (answer $ question n) i
      s' = process n s a
      n' = f s' a
  in Right undefined --TODO n'

respond s (Final _ n f) i =
  let a  = (answer $ question n) i
      s' = process n s a
  in Left undefined --TODO s'


-- User Interaction:

saveState :: (Show s) => (s, Edge s sf) -> String
saveState (s, Edge eid n _) = show (s, eid)

getEdge :: EdgeId -> Edge s sf
getEdge = undefined --TODO

-- state = serialized (s, EdgeId) (where getEdge :: EdgeId -> Edge s sf)
-- input = user's answer to the current question
main' :: String -> Input -> Either sf (s', Edge s' sf)
main' state input =
  let (s, eid) = undefined -- read state --TODO
      edge = getEdge eid
  in respond s edge input

It's important for me to keep the edges type-safe. Meaning for instance incorrectly linking e2 to e3 must be a type error: e2 = Edge E2 n2 (const $ const ef) is fine by e2 = Edge E2 n2 (const $ const e3) must be an error.

I have indicated my questions with --TOOD:

  • Given my criteria for keeping edges type-safe, Edge s sf must have an input type variable (s) then how can I create getEdge :: EdgeId -> Edge s sf function?

  • How can I create the respond function that given the current state s and current edge Edge s sf, will return either the final state (if current edge is Final) or the next state and the next edge (s', Edge s' sf)?

My design of Node s a s' and Edge s sf might be simply wrong. I don't have to stick with it.


Solution

  • For the purposes of having a simple example to explain, I'm going to show you a solution which doesn't have natural support for suspending, persisting and resuming computations. At the end I'll give you the gist of how to add that in - hopefully you'll be able to figure out the nitty gritty of it yourself.


    Here's the so-called indexed state monad:

    newtype IStateT m i o a = IStateT { runIState :: i -> m (o, a) }
    

    IStateT is like a regular state monad transformer, except that the type of the implicit state is allowed to change throughout the course of a computation. Sequencing actions in the indexed state monad requires that the output state of one action matches the input state of the next. This sort of domino-like sequencing is what Atkey's parameterised monad (or indexed monad) is for.

    class IMonad m where
        ireturn :: a -> m i i a
        (>>>=) :: m i j a -> (a -> m j k b) -> m i k b
    
    (>>>) :: IMonad m => m i j a -> m j k b -> m i k b
    mx >>> my = mx >>>= \_ -> my
    

    IMonad is the class of monad-like things which describe a path through an indexed graph. The type of (>>>=) says "If you have a computation which goes from i to j and a computation from j to k, I can join them up to give you a computation from i to k".

    We'll also need to lift computations from classical monads into indexed monads:

    class IMonadTrans t where
        ilift :: Monad m => m a -> t m i i a
    

    Note that the code for IStateT is just the same as the code for the regular state monad - it's just the types that have got smarter.

    iget :: Monad m => IStateT m s s s
    iget = IStateT $ \s -> return (s, s)
    
    iput :: Monad m => o -> IStateT m i o ()
    iput x = IStateT $ \_ -> return (x, ())
    
    imodify :: Monad m => (i -> o) -> IStateT m i o ()
    imodify f = IStateT $ \s -> return (f s, ())
    
    instance Monad m => IMonad (IStateT m) where
        ireturn x = IStateT (\s -> return (s, x))
        IStateT f >>>= g = IStateT $ \s -> do
                                        (s', x) <- f s
                                        let IStateT h = g x
                                        h s'
    instance IMonadTrans IStateT where
        ilift m = IStateT $ \s -> m >>= \x -> return (s, x)
    

    The idea is that monadic actions like askSize and askWeight (below) will add some data to the implicit environment, growing its type. So I'm going to build the implicit environment out of nested tuples, treating them as type-level lists of types. Nested tuples are more flexible (though less efficient) than flat tuples, because they allow you to abstract over the tail of the list. This allows you to build tuples of arbitrary size.

    type StateMachine = IStateT IO
    
    newtype Size = Size Int
    newtype Height = Height Int
    newtype Weight = Weight Int
    newtype Colour = Colour String
    
    -- askSize takes an environment of type as and adds a Size element
    askSize :: StateMachine as (Size, as) ()
    askSize = askNumber "What is your size?" Size
    
    -- askHeight takes an environment of type as and adds a Height element
    askHeight :: StateMachine as (Height, as) ()
    askHeight = askNumber "What is your height?" Height
    
    -- etc
    askWeight :: StateMachine as (Weight, as) ()
    askWeight = askNumber "What is your weight?" Weight
    
    askColour :: StateMachine as (Colour, as) ()
    askColour =
        -- poor man's do-notation. You could use RebindableSyntax
        ilift (putStrLn "What is your favourite colour?") >>>
        ilift readLn                                      >>>= \answer ->
        imodify (Colour answer,)
    
    calculateSize :: Height -> Weight -> Size
    calculateSize (Height h) (Weight w) = Size (h - w)  -- or whatever the calculation is
    
    askNumber :: String -> (Int -> a) -> StateMachine as (a, as) ()
    askNumber question mk =
        ilift (putStrLn question) >>>
        ilift readLn              >>>= \answer ->
        case reads answer of
            [(x, _)] -> imodify (mk x,)
            _ -> ilift (putStrLn "Please type a number") >>> askNumber question mk
    
    askYN :: String -> StateMachine as as Bool
    askYN question =
        ilift (putStrLn question) >>>
        ilift readLn              >>>= \answer ->
        case answer of
            "y" -> ireturn True
            "n" -> ireturn False
            _ -> ilift (putStrLn "Please type y or n") >>> askYN question
    

    My implementation differs from your specification slightly. You say it should be impossible to ask for the user's size and then ask for their weight. I say it should be possible - the result just won't necessarily have the type you wanted (because you've added two things to the environment, not one). This comes in useful here, where askOrCalculateSize is just a black box which adds a Size (and nothing else) to an environment. Sometimes it does it by asking for the size directly; sometimes it calculates it by first asking for the height and the weight. It doesn't matter as far as the type-checker is concerned.

    interaction :: StateMachine xs (Colour, (Size, xs)) ()
    interaction =
        askYN "Do you know your size?" >>>= \answer ->
        askOrCalculateSize answer >>>
        askColour
    
        where askOrCalculateSize True = askSize
              askOrCalculateSize False =
                askWeight >>>
                askHeight >>>
                imodify (\(h, (w, xs)) -> ((calculateSize h w), xs))
    

    There's a question remaining: how can one resume a computation from a persisted state? You don't statically know the type of the input environment (though it's safe to assume the output is always (Colour, Size)) because it varies throughout the computation, and you don't know until you load the persisted state where it was up to.

    The trick is to use a bit of GADT evidence which you can pattern-match on to learn what the type was. Stage represents where you could've got up to in the process, and it's indexed by the type that the environment should have by that stage. Suspended pairs up a Stage with the actual data that was in the environment at the point that the computation was suspended.

    data Stage as where
        AskSize :: Stage as
        AskWeight :: Stage as
        AskHeight :: Stage (Weight, as)
        AskColour :: Stage (Size, as)
    
    data Suspended where
        Suspended :: Stage as -> as -> Suspended
    
    resume :: Suspended -> StateMachine as (Colour, (Size, as)) ()
    resume (Suspended AskSize e) =
        iput e                                               >>>
        askSize                                              >>>
        askColour
    resume (Suspended AskWeight e) =
        iput e                                               >>>
        askWeight                                            >>>
        askHeight                                            >>>
        imodify (\(h, (w, xs)) -> ((calculateSize h w), xs)) >>>
        askColour
    resume (Suspended AskHeight e) =
        iput e                                               >>>
        askHeight                                            >>>
        imodify (\(h, (w, xs)) -> ((calculateSize h w), xs)) >>>
        askColour
    resume (Suspended AskColour e) =
        iput e                                               >>>
        askColour
    

    Now you can add suspension points to the computation:

    -- given persist :: Suspended -> IO ()
    suspend :: Stage as -> StateMachine as as ()
    suspend stage =
        iget                                  >>>= \env
        ilift (persist (Suspended stage env))
    

    resume works, but it's pretty ugly and has a lot of code duplication. This is because once you've put a state monad together you can't take it apart again to look inside it. You can't jump in at a given point in the computation. This is the big advantage of your original design, wherein you represented the state machine as a data structure which can be queried in order to figure out how to resume the computation. This is called an initial encoding, whereas my example (representing the state machine as a function) is a final encoding. Final encodings are simple but initial encodings are flexible. Hopefully you can see how to adapt an initial approach to the indexed monad design.