Search code examples
haskellmonadsstate-monaddo-notation

Generating a unique value in Haskell do-notation


To generate x86 assembly code, I have defined a custom type called X86:

data X86 a = X86 { code :: String, counter :: Integer, value :: (X86 a -> a) }

This type is used in do-notation like the following. This makes it easy to write templates for generating if-statements, for-loops, etc...

generateCode :: X86 ()
generateCode = do
  label1 <- allocateUniqueLabel
  label2 <- allocateUniqueLabel
  jmp label1
  label label1
  jmp label2
  label label2

Instructions are defined like this:

jmp :: String -> X86 ()
jmp l = X86 { code = "jmp " ++ l ++ ";\n", counter = 0, value = const () }

label :: String -> X86 ()
label l = X86 { code = l ++ ":\n", counter = 0, value = const () }

And the completed assembly file is printed like so:

printAsm :: X86 a -> String
printAsm X86{code=code} = code

main = do
  putStrLn (printAsm generateCode)

I implemented the X86 monad in the following manner. Essentially, the sequence operator concatenates blocks of assembly code in order and ensures the counters are incremented.

instance Monad X86 where
  x >> y = X86 { code = code x ++ code y, counter = counter x + counter y, value = value y }
  x >>= f = x >> y
    where y = f (value x x)

The problem is the labels are not incremented properly, so they are not unique! The following is the output:

jmp Label1;
Label1:
jmp Label1;
Label1:

I desire the output to have a unique value for each label:

jmp Label1;
Label1:
jmp Label2;
Label2:

To complete the example, here is the implementation of the allocatedUniqueLabel function:

allocateUniqueId :: X86 Integer
allocateUniqueId = X86 { code = "", counter = 1, value = counter }

allocateUniqueLabel :: X86 String
allocateUniqueLabel = do
  id <- allocateUniqueId
  return ("Label" ++ show id)

How can I fix my X86 monad so the labels are unique?

Here is what I've tried:

  • Incrementing a global counter. => Haskell does not safely allow global state outside the IO monad.
  • Using the State monad. => I have looked into a number of examples, but do not understand how to integrate them into my existing X86 monad.
  • Keep track of the counter outside of the monad. => I rather the counter is updated "behind the scenes"; otherwise, a lot of code templates which do not use labels will need to propagate the counter manually.

Solution

  • We can use mtl classes to describe X86 code as effectful programs. We want:

    • to generate code, this is a Writer effect;
    • to maintain a counter, this is a State effect.

    We worry about instantiating these effects last, and in the description of the programs we use MonadWriter and MonadState constraints.

    import Control.Monad.State  -- mtl
    import Control.Monad.Writer
    

    Allocating a new identifier increments the counter, without generating any code. This only uses the State effect.

    type Id = Integer
    
    allocateUniqueLabel :: MonadState Id m => m String
    allocateUniqueLabel = do
      i <- get
      put (i+1)  -- increment
      return ("Label" ++ show (i+1))
    

    And of course, we have actions to generate code, that don't need to care about the current state. So they use the Writer effect.

    jmp :: MonadWriter String m => String -> m ()
    jmp l = tell ("jmp " ++ l ++ ";\n")
    
    label :: MonadWriter String m => String -> m ()
    label l = tell (l ++ ":\n")
    

    The actual program looks the same as the original, but with more general types.

    generateCode :: (MonadState Id m, MonadWriter String m) => m ()
    generateCode = do
      label1 <- allocateUniqueLabel
      label2 <- allocateUniqueLabel
      jmp label1
      label label1
      jmp label2
      label label2
    

    The effects are instantiated when we run this program, here using runWriterT/runWriter and runStateT/runState (the order doesn't matter much, these two effects commute).

    type X86 = WriterT String (State Id)
    
    runX86 :: X86 () -> String
    runX86 gen = evalState (execWriterT gen) 1 -- start counting from 1
    -- evalState and execWriterT are wrappers around `runStateT` and `runWriterT`:
    -- - execWriterT: discards the result (of type ()), only keeping the generated code.
    -- - evalState: discards the final state, only keeping the generated code,
    --   and does some unwrapping after there are no effects to handle.