Search code examples
haskellabstract-syntax-treerecursive-datastructuresfixpoint-combinators

Haskell AST Annotation with Fix


I am working on creating an AST in Haskell. I want to add different annotations, such as types and location information, so I ended up using fixplate. However, I can't find any examples online and am having some difficulty.

I've set up my AST as recommended by fixplate (some striped out):

data ProgramF a
  = Unary a
          Operator
  | Number Int
  | Let { bindings :: [(Identifier, a)]
        , body :: a }

type Program = Mu ProgramF

Next to add a label I created another type, and a function to add labels based on a tree traversal.

type LabelProgram = Attr ProgramF PLabel

labelProgram :: Program -> LabelProgram
labelProgram =
  annMap (PLabel . show . fst) . (snd . synthAccumL (\i x -> (i + 1, (i, x))) 0)

However, beyond this I am running into some issues. For example, I am trying to write a function that does some transformation on the AST. Because it requires a label to function, I've made the type LabelProgram -> Program, but I think I am doing something wrong here. Below is a snippet of part of the function (one of the simpler parts):

toANF :: LabelProgram -> Program
toANF (Fix (Ann label (Let {bindings, body}))) = Fix $ Let bindingANF nbody
  where
    bindingANF = map (\(i, e) -> (i, toANF e)) bindings
    nbody = toANF body

I feel like I am working at the wrong level of abstraction here. Should I be explicitly matching against Fix Ann ... and returning Fix ... like this, or am I utilizing fixplate wrong?

Additionally, I am concerned about how to generalize functions. How can I make my functions work for Programs, LabelPrograms, and TypePrograms generically?


Solution

  • Edit: Add an example of a function for ProgramFs with generic annotations.

    Yes, at least in the case of toANF, you're using it wrong.

    In toANF, note that your Let bindingANF nbody and the companion definitions of bindingANF and nbody are just a reimplementation of fmap toANF for the specific constructor Let.

    That is, if you derive a Functor instance for your ProgramF, then you can re-write your toANF snippet as:

    toANF :: LabelProgram -> Program
    toANF (Fix (Ann label l@(Let _ _))) = Fix (fmap toANF l)
    

    If toANF is just stripping labels, then this definition works for all constructors and not just Let, so you can drop the pattern:

    toANF :: LabelProgram -> Program
    toANF (Fix (Ann label l)) = Fix (fmap toANF l)
    

    and now, as per @Regis_Kuckaertz's comment, you've just re-implemented forget which is defined as:

    forget = Fix . fmap forget . unAnn . unFix
    

    With respect to writing functions that work generically on Program, LabelProgram, etc., I think it makes more sense to write functions generic in a (single) annotation:

    foo :: Attr ProgramF a -> Attr ProgramF a
    

    and, if you really need to apply them to the unannotated program, define:

    type ProgramU = Attr ProgramF ()
    

    where the "U" in ProgramU stands for "unit". Obviously, you can easily write translators to work with Programs as ProgramUs if really needed:

    toU :: Functor f => Mu f -> Attr f ()
    toU = synthetise (const ())
    
    fromU :: Functor f => Attr f () -> Mu f
    fromU = forget
    
    mapU :: (Functor f) => (Attr f () -> Attr f ()) -> Mu f -> Mu f
    mapU f = fromU . f . toU
    
    foo' :: Mu ProgramF -> Mu ProgramF
    foo' = mapU foo
    

    As a concrete -- if stupid -- example, here's a function that separates Lets with multiple bindings into nested Lets with singleton bindings (and so breaks mutually recursive bindings in the Program language). It assumes that the annotation on a multi-binding Let will be copied to each of the resulting singleton Lets:

    splitBindings :: Attr ProgramF a -> Attr ProgramF a
    splitBindings (Fix (Ann a (Let (x:y:xs) e)))
      = Fix (Ann a (Let [x] (splitBindings (Fix (Ann a (Let (y:xs) e))))))
    splitBindings (Fix e) = Fix (fmap splitBindings e)
    

    It can be applied to an example Program:

    testprog :: Program
    testprog = Fix $ Unary (Fix $ Let [(Identifier "x", Fix $ Number 1), 
                                       (Identifier "y", Fix $ Number 2)] 
                                      (Fix $ Unary (Fix $ Number 3) NegOp))
                           NegOp
    

    like so:

    > mapU splitBindings testprog
    Fix (Unary (Fix (Let {bindings = [(Identifier "x",Fix (Number 1))],
    body = Fix (Let {bindings = [(Identifier "y",Fix (Number 2))], 
    body = Fix (Unary (Fix (Number 3)) NegOp)})})) NegOp)
    >
    

    Here's my full working example:

    {-# LANGUAGE DeriveFunctor #-}
    {-# OPTIONS_GHC -Wall #-}
    
    import Data.Generics.Fixplate
    
    data Identifier = Identifier String deriving (Show)
    data PLabel = PLabel deriving (Show)
    data Operator = NegOp deriving (Show)
    
    data ProgramF a
      = Unary a
              Operator
      | Number Int
      | Let { bindings :: [(Identifier, a)]
            , body :: a }
      deriving (Show, Functor)
    instance ShowF ProgramF where showsPrecF = showsPrec
    
    type Program = Mu ProgramF
    type LabelProgram = Attr ProgramF PLabel
    
    splitBindings :: Attr ProgramF a -> Attr ProgramF a
    splitBindings (Fix (Ann a (Let (x:y:xs) e)))
      = Fix (Ann a (Let [x] (splitBindings (Fix (Ann a (Let (y:xs) e))))))
    splitBindings (Fix e) = Fix (fmap splitBindings e)
    
    toU :: Functor f => Mu f -> Attr f ()
    toU = synthetise (const ())
    
    fromU :: Functor f => Attr f () -> Mu f
    fromU = forget
    
    mapU :: (Functor f) => (Attr f () -> Attr f ()) -> Mu f -> Mu f
    mapU f = fromU . f . toU
    
    testprog :: Program
    testprog = Fix $ Unary (Fix $ Let [(Identifier "x", Fix $ Number 1), 
                                       (Identifier "y", Fix $ Number 2)] 
                                      (Fix $ Unary (Fix $ Number 3) NegOp))
                           NegOp
    
    main :: IO ()
    main = print $ mapU splitBindings testprog